swestrup | GUI / CLI Integration in Linux

I would like to preface this by saying that I don't believe myself to be a technology bigot. I try to evaluate all technology on the basis of what it's good for, and how well it meets the needs of its users. (On that basis, I actually predicted that VHS would beat out Betamax as a tape standard. I've been amazed at how many folks have tried to explain it as a random selection effect with feedback, based on the assumption that Betamax was technically superior. The fact is, Betamax may have had higher image fidelity, but that wasn't what consumers were looking for. They wanted to be able to stick a tape in the VCR and have it tape a movie when they were out. With only a 60-minute tape length Betamax couldn't do that; VHS could. The rest is history.) So, when you see that much of what follows is a severe critique of Linux in its user-interface design, don't you DARE go assuming that makes me a Windows Advocate. That's a false dichotomy. Yes, there are things that Windows gets right that Linux gets wrong. There are things that Linux gets right that Windows gets wrong. There are things that OSX, OS/2, Plan 9, NextStep and BEOS individually get right that all the others get wrong. Pointing out a flaw in one of these is NOT an endorsement of the others. The reason this is a critique of Linux and not Windows is that a) there are LOTS of technically accurate critiques of Windows, while most technical critiques of Linux are baseless FUD, and b) Linux can be fixed.

I have long complained that Linux is not ready for the Desktop, although it most certainly IS ready for reliable computing (My firewall, webserver and mailserver all live on Linux machines, but my main personal system still runs Windows). I have used KDE and I have used Gnome, and I have used any number of window managers running on top of them, and I always find it a nightmare.

The simple fact is, I have very particular needs and desires for my windows system. To start with, I'm red-green color blind, so certain ways of presenting information to me simply fail, therefore I must have control over the color space used. Those color space controls must NOT assume I can see colors well, or I can't use them. I like to have a large, uncluttered display, and the size of text used for different purposes should be different. My eyes aren't as good as they used to be, but I like to have a large amount of visible context. My desire for high resolution screens DOES NOT imply that I want small fonts. I hate overlapping windows. I have strong preferences for how window focus should work. I have strong preferences for how mouse clicks should be interpreted. I have strong preferences for how virtual screens / menu bars / control panels should work. I want persistence in window positions, sizes and the settings for column sizes, orders and sub-pane borders. Finally, I don't want to have to learn any new syntaxes or programming languages to get what I want.

My user-interface needs are met by Windows in a barely adequate manner. They simply aren't met by Linux. Now, that's not a ringing endorsement of Windows there. When I first upgraded to XP from 98SE, it took well over a month to find enough hidden settings and tweak utilities to get back an approximation of the work environment that I had set up under 98SE. It's definitely a problem that in Windows those hidden settings and tweak programs were necessary, and that they used undocumented hooks into the Windows configuration system. Linux, on the other hand, seems to completely lack such hooks, documented or otherwise, in its various window managers and presentation engines.

Often I find the lack of configuration in Linux to be incredibly puzzling. The last version of a window manager might handle the opening of windows one way. The next version handles it a different way. Neither method was an option. They are unoverridable defaults that seem to match the changing preferences of the person writing the manager. There is no recognition that different people have different preferences, and that just because one person likes the new way better, others might not. In a similar vein, the last time I tried to configure a KDE panel (I generally prefer Gnome as it seems to have better font handling), I was surprised that it could dock to any side of the screen *except* the left hand side, which is where I wanted it. I can't for the life of me imagine what caused THAT design decision.

Most window managers have a configuration window with cryptic buttons that are unexplained anywhere. Often some of them are for envisioned extensions and don't actually DO anything yet. Equally likely is that the actual window configuration is in a config file written in an equally cryptic and equally unexplained language and supports a whole range of undocumented features that show up nowhere on the GUI configurator. When you try a different window manager in frustration you discover that it has a completely different config language and theory of GUI, but that it also has a random set of strange and different configuration options in its config file and its config panel.

As if this didn't cause enough frustration, most of the O/S functionality provided by the GUI is implemented by calling command-line functions to actually perform such tasks as mounting devices or determining processor load. This, in itself, is not a bad thing. In fact, it's a very good idea, if done right. Naturally, it isn't done right. The interface between the GUI and the Linux CLI tasks is explained nowhere. You cannot find out which programs are called for which purposes, and you usually have no way to configure the calling of a different program with different parameters. The one advantage that you could conceivably get by hard coding the called program in the GUI is that the error status codes returned by the called program can be understood and correct actions taken automatically. This is almost never done. Often, a CLI program will send an explicit description of what went wrong to the standard error stream, but all that the user ever gets to see is a terse 'Operation Failed' pop-up.
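As a minimal sketch of the alternative to a bare 'Operation Failed' pop-up, a GUI front end could capture the called program's standard error stream and exit status and hand both to the user. The function name `run_and_explain` and the stand-in child process are assumptions made for illustration; they are not taken from any existing desktop environment.

```python
import subprocess
import sys


def run_and_explain(argv):
    """Run a command-line tool and return (ok, message) for a GUI to display.

    On failure the message is the tool's own stderr text, falling back to
    the exit status when the tool printed nothing.
    """
    try:
        result = subprocess.run(argv, capture_output=True, text=True)
    except OSError as exc:
        # The program could not even be started (missing, not executable...).
        return False, f"could not run {argv[0]}: {exc}"
    if result.returncode == 0:
        return True, result.stdout.strip()
    detail = result.stderr.strip() or f"exit status {result.returncode}"
    return False, f"{argv[0]} failed: {detail}"


if __name__ == "__main__":
    # Hypothetical stand-in for a real tool: a child process that
    # complains on stderr and exits with a non-zero status.
    child = [sys.executable, "-c",
             "import sys; sys.stderr.write('no medium found\\n'); sys.exit(32)"]
    print(run_and_explain(child))
```

The point of the sketch is only that the text the tool already wrote is passed through rather than discarded; mapping exit codes to corrective actions would need per-program knowledge that this code does not have.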

Anyway, enough of the negativity. I said at the top of this rant that Linux could be fixed. So, what needs to be done? Well, we need a universal configuration language used by all of the Linux tools. It needs to be flexible and extensible enough to handle everything from X to keyboard drivers to help documentation to Emacs preferences. It need not be human-readable, as we actually want to discourage all and sundry from writing and rewriting various broken parsers for it. There needs to be a library of standard tools and OS calls to read, write, parse and update these config files, and to render them in a number of human-readable forms. What's more, we need a mechanism by which a program declares what its configuration options and error responses are (hopefully as part of a simple way to implement the program's UI via library calls) and this declaration is used to generate the help text when a CLI program is called with -? or --help, to generate a config panel under X that displays the current configuration and lets it be modified, and which controls the reading and writing of the universal config files. We also want all of this to be natural language-independent and have a mechanism by which anyone can add translations of textual messages issued by any program.
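To make the declaration idea concrete, here is a rough sketch of a program describing its options and error responses once, with the help text and error explanations generated from that single description. Every name here (`Option`, `ProgramSpec`, `help_text`, `explain_exit`, and the example tool `lister`) is hypothetical; no such library is being quoted.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    flag: str                  # e.g. "--all"
    summary: str               # one-line description shown in help output
    takes_value: bool = False  # True if the flag is written as --flag=VALUE


@dataclass
class ProgramSpec:
    name: str
    options: list = field(default_factory=list)
    errors: dict = field(default_factory=dict)  # exit status -> explanation

    def help_text(self):
        """Render the --help output from the declaration."""
        lines = [f"Usage: {self.name} [OPTION]..."]
        width = max((len(self._label(o)) for o in self.options), default=0)
        for opt in self.options:
            lines.append(f"  {self._label(opt).ljust(width)}  {opt.summary}")
        return "\n".join(lines)

    def explain_exit(self, status):
        """Translate an exit status into the declared explanation."""
        return self.errors.get(status, f"unknown error (status {status})")

    @staticmethod
    def _label(opt):
        return f"{opt.flag}=VALUE" if opt.takes_value else opt.flag


if __name__ == "__main__":
    spec = ProgramSpec(
        name="lister",
        options=[Option("--all", "do not hide entries starting with ."),
                 Option("--width", "assume screen width", takes_value=True)],
        errors={2: "cannot access a named file"},
    )
    print(spec.help_text())
    print(spec.explain_exit(2))
```

A config panel or a translated message catalogue would be further renderers over the same `ProgramSpec`; the sketch stops at help text and exit codes to stay short.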

Whew! Now, all of the above is a very tall order, and it's not about to happen overnight. I do expect that it will eventually happen, in one form or another, (and in fact I've been working in my spare time to try and help speed up its arrival), but it's a very large amount to do at once. So, a very pragmatic question to ask would be: is there something we can do NOW as a first step to help things along?

I think there is. Now, this idea is not original to me. I first heard of it from _sps_ during a particular debug session, and it has stuck in my mind ever since. The idea is to design the language by which programs will communicate their configuration and error declarations, and build a suitable subset of it into a set of library routines. We will then modify a few dozen commonly-used command-line tools, like 'ls', so that if you pass them a '-<TAB>' option (that's a dash followed by a bare tab) they will parse their command lines and return a subset of their configuration options compatible with what's on the line. We will also modify a bunch of shells, like BASH, so that when you hit tab after a dash while typing in a command line, it will silently call the program with the dash-tab option, parse the returned results, and let you tab through the suitable options for that spot on the command line. This will give us automagical completion of command-line options that works analogously to the completion of command-line file arguments that we already have. Once this new mechanism gains acceptance (and I believe it will), no one will be surprised if you enhance GUIs so that when you click on a program to run it, the GUI queries the program (and caches the results) to see if it's advisable to pop up a command-line option window that was autogenerated from the returned config info. Once you've filled in the data and run the program, the GUI will be able to compare the returned code with the config information and give you a reasonable error message, even if it's never seen the program before, and hopefully even if the language of the returned error isn't one you speak.
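The tool's side of the dash-tab query might look roughly like the sketch below. It assumes an ls-like program whose options are declared in a table, with a made-up notion of which flags conflict; the `OPTIONS` table, the conflict sets and the one-flag-per-line reply format are all illustrative assumptions, not the behaviour of the real 'ls'.

```python
import sys

# Hypothetical declaration for an ls-like tool: flag -> flags it conflicts with.
OPTIONS = {
    "--all": {"--almost-all"},
    "--almost-all": {"--all"},
    "--reverse": set(),
    "--recursive": set(),
}

QUERY = "-\t"  # the dash-tab marker: a dash followed by a bare tab


def compatible_options(argv, options=OPTIONS):
    """Return the flags that could still be added to this command line."""
    used = {arg for arg in argv if arg in options}
    blocked = set(used)
    for flag in used:
        blocked |= options[flag]
    return sorted(flag for flag in options if flag not in blocked)


def main(argv):
    if QUERY in argv:
        # Answer the shell's query instead of doing the normal work.
        rest = [arg for arg in argv if arg != QUERY]
        print("\n".join(compatible_options(rest)))
        return 0
    print("normal operation with", argv)
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

A shell would run the program with the current words plus the marker, read the reply from standard output, and offer those flags as completions. A fuller reply format would carry descriptions and value types as well as bare names.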

This would be the thin edge of the wedge, but it would already go a long way towards taming the chaos that is the Linux user interface.


From:

sps.livejournal.com

Wow. Reading this was a very interesting experience for me, and we've had this conversation.

First of all, I have an immediate Betamax reaction: what does the Windowing system have to do with Linux? That's X, it's a different layer, it comes from MIT, we've had that on Unix boxen since the dark ages. And it is extensively documented - in a huge set of books that you pay for, which is why nobody has a copy. No configurability? What are you talking about? You can write whole entire new window managers, you can't get much more configurable than that. Me, I used to use gwm. I guess it died out, and of course the reason for that was it was configurable. I had areas of the screen that you could mouse over to get different effects. I had a single shared text area at the bottom of the display that was sometimes a console, sometimes a log, sometimes my Emacs minibuffer - all automagically, depending on what I was doing. I had status thingies for everything on the network that appeared in little rows along the sides of the windows. And it was pretty much all my own code. In pervy, event-driven, LISP. It was an absolute HOG - it ran on its own workstation, separate from the one that handled my work - and I loved it. And you know, classic X applications have more documentation devoted to customisability than to normal operation: huge tables of resources that control fonts, colours, controls, behaviours, external resources....

But the market decided that was Betamax. What the world wants, evidently, is something where you just stick the tape in and record the movie - and it has to be just like Windows: impossible to misconfigure, because any controls there may be are hidden from mortal view in some undocumented place where nobody will be tempted to poke. That's the direction we're moving in, away from configurability. To 'get ready for the Desktop' (i.e. [win]lusers).

Now, I know that sounds like a long, defensive rant that misses your point, but it isn't really. If everything we've each said is true, then there's a second important question: how can people setting out to do things right defend themselves against anti-technical, anti-configurability bias, against the propensity of newbies (I was there once) to over-configure and shoot themselves in the foot, and, in short, against being - or, in the face of a FUDful environment, being perceived to be - a Betamax?

From:

swestrup

> If everything we've each said is true, then there's a second important question: how can people setting out to do things right defend themselves against anti-technical, anti-configurability bias, against the propensity of newbies (I was there once) to over-configure and shoot themselves in the foot, and, in short, against being - or, in the face of a FUDful environment, being perceived to be - a Betamax?

Well, that is, in my mind, more a question of perception than anything else, and a whole other huge discussion. Basically I believe, and I've said this before, that things should be coded with as much interoperability and programmability as possible, but should provide a user interface that caters equally to the technophobic granny and the ubergeek. Now, that may not be easy, but I don't think it's impossible, and I have some ideas on how to go about it.

From:

joenotcharles.livejournal.com

The problem is that it's all configurability and no interoperation. The unconfigurable version is no good for people who don't like the defaults, and the ultra-configurable version tends to be useless until you spend days fine-tuning it (or if you want something really off the wall, months writing it). We need a more standard framework we can fit configuration into so that you can tweak parts of your UI without ripple effects.

Look at KDE (BTW, when I used it I had two docks down the left and right sides of the screen, and nothing at the bottom, so I don't know why that didn't work for Stirling). It's probably just as configurable as gwm - you can swap in any NETWM-compliant window manager for KWin, and you can replace all sorts of its service objects. I can easily think of an approach for implementing your shared text area, for instance. But I've only occasionally seen people try to do that type of thing with it, because most people, like you said, just want to play with the eye candy at best. Really, the entire KDE setup is overkill for what it's used for.

There's so much more that could be done if we take some time to think about the middle layers, rather than focussing all the work toward the ultimate end-user experience (for which you want a sensible default and some non-confusing options) or the fundamentals (for which you want something that does exactly what you tell it, and only that).

From:

joenotcharles.livejournal.com

> Anyway, enough of the negativity. I said at the top of this rant that Linux could be fixed. So, what needs to be done? Well, we need a universal configuration language used by all of the Linux tools. It needs to be flexible and extensible enough to handle everything from X to keyboard drivers to help documentation to Emacs preferences.

UniConf!

> We will then modify a few dozen commonly-used command-line tools, like 'ls', so that if you pass them a '-<TAB>' option (that's a dash followed by a bare tab) they will parse their command lines and return a subset of their configuration options compatible with what's on the line.

This reminds me that I'm still annoyed by every cmdline parsing library I've ever used. getopt is obscure, I don't particularly like popt's API, and KConfig is too KDE-specific. My favourite so far is actually from the "suggested Unix extension" to Glk, but don't bother looking for it at that address, cause it's only described in some header files in the Unix implementations.

The dash-tab extension seems like a good thing to put into the mythical perfect cmdline library.

> We will also modify a bunch of shells, like BASH, so that when you hit tab after a dash while typing in a command line, it will silently call the program with the dash-tab option, parse the returned results, and let you tab through the suitable options for that spot on the command line.

Actually, I don't think this is necessary. Tab-completion in bash is handled through a programmable config file, so it should be possible to just add an entry... No, that's not true, the "complete" builtin is too limited. I haven't checked all the associated functions yet, but it may well need to be extended. Fooey.

From:

swestrup

> UniConf

That actually looks like a credible attempt. I hope it can handle non-ascii strings for both keys and values though (and possibly pure binary keys and values).

And yeah, I've hated every cmdline parser I've ever used as well. The secret to getting it to be accepted is that even if all you want it to handle are a filename and '-?' to get help, it should be easier to use than to write your own.

Extending the completion built-in in Bash (or perhaps adding another built-in, I've never looked at Bash's source so I don't know which is easier) should theoretically be fairly simple, especially if one has already written the parsing library.

From:

joenotcharles.livejournal.com

You know what? Bash already tab-completes arguments:

ls -<TAB> gets turned into ls --, and ls --<TAB> says:

$ ls --
--all --dereference --full-time
(etc, etc)

It only does the options with full names, though, and AFAIK it doesn't give you a subset compatible with what's already typed. And it'd be nice to get more info to parse than just the bare names.

(This isn't the same as "ls --help" - I don't know if there's an arg I can pass to ls to get the bare arg list from it, or if bash is automatically calling ls --help and then picking out the commands from it already.)
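The second guess, that the shell calls the program's --help and picks the option names out of it, is easy to sketch. The code below is only an illustration of that approach under the assumption that the help text follows the usual GNU layout; it is not a claim about what bash itself does, and `long_options` is a made-up name.

```python
import re

# Matches a long option name such as --full-time wherever it starts a word
# in the help text; the lookbehind rejects runs like "---" or "a--b".
LONG_OPTION = re.compile(r"(?<![\w-])--[A-Za-z0-9][A-Za-z0-9-]*")


def long_options(help_text, prefix="--"):
    """Return the sorted, de-duplicated long options that start with prefix."""
    found = set(LONG_OPTION.findall(help_text))
    return sorted(opt for opt in found if opt.startswith(prefix))


if __name__ == "__main__":
    # A hand-written excerpt in the style of GNU help output.
    sample = (
        "Usage: ls [OPTION]... [FILE]...\n"
        "  -a, --all          do not ignore entries starting with .\n"
        "  -L, --dereference  follow symbolic links\n"
        "      --full-time    like -l --time-style=full-iso\n"
    )
    print(long_options(sample))
    print(long_options(sample, prefix="--f"))
```

Scraping help text this way explains why only options with full names show up: short options and the relationships between options are not recoverable from the text with a pattern this simple.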

From:

joenotcharles.livejournal.com

find, which takes -longname instead of --longname, gets arg-completed after the first -. dd, which takes arg=val, doesn't get arg-completed at all. complete, a built-in func taking only short opts, doesn't get arg-completed.

Annoyingly, xterm does not get arg-completed. That obviously needs to be fixed.