Sound advice

Watching my mother trying to use Windows XP to locate her holiday snaps makes it clear to me that tagging is the right way to interact with personal documents. The traditional "one file, one location" filesystem is old and busted. The scenareo begins with my mother learning how to take pictures from her camera and put them into foldlers. Unfortunately, my father is still the one managing short movie files. The two users have different mental models for the data. They have different filing systems. Mum wants to find files by date, or by major event. Data thinks that movie files are different to static images and that they should end up in different places. The net result is that Mum needs to learn how to use the search feature in order to find her file, and is lucky to find what she is looking for.

Using tags we would have a largely unstructured collection of files. The operating system would be able to apply tags associated with type automatically, so "mpeg" and "video" might already appear. The operating system might even add tags for time and date. The user might add additional tags such as "21st Birthday" or "Yeppoon Trip". Tags are associated with the files themselves and can be added from anywhere you see the file. You'll then be able to find the file via the additional tag. This approach seems to work better than searching or querying does for non-expert computer users. A query has to be constructed from ideas that aren't in front of the user. Tags are already laid out before them.

Here is one attempt at achieving a tagging model in a UNIX filesystem. I'm not absolutely sure that soft links are the answer. Personally I wonder if we need better hard links. If you delete a hard link to a file from a particular tag set it would disappear from there but stay connected to the filesystem by other links to it. It shouldn't be possible to make these links invalid like it is with symbolic links. Unfortunately hard links can't be used across different filesystems. It would be nice if the operating system itself could manage a half-way point. I understand this would be tricky with the simplest implementation resulting in a copy of the file on each partition. Deciding which was the correct one when they dropped out of sync would be harmful. Perhaps tagging should simply always happen within a single filesystem. Hard links do still have the problem that a rename on one instance of the file doesn't trigger a rename to other tagged instances.

Benjamin

One thing I've noticed as I've gotten into user interface design concepts is that the best user interface is usually not something you see on your computer screen. When you look at an iPod it is clear how to use it, and what it will do. A well designed mobile phone such as the Nokia 6230 my wife carries makes it easy to both make phone calls and to navigate its menus for more sophisticated operations. A gaming console like the PS2, the Xbox, or the Game boy is easy to use. Easer than a PC, by miles.

Joel Spolsky has a wonderful work online descibing how user interfaces should be designed. He designates a whole chapter to "Affordances and Metaphors", which very simply amounts to giving the user hints on when and how to click. In the same document he highlights the problem that makes desktop software so hard to work with generally:

Users can't control the mouse very well.

I've noticed that whenever I'm finding a device easy to use, it is because it has a separate physical control for everything I want to do. Up and down are different buttons, or are different ends of a tactile device. I don't have to imagine that the thing in the screen is a button. Instead, the thing in the screen obviously relates to a real button. I just press it with my stubby fingers and it works.

So maybe we should be thinking just as much about what hardware we might want to give to users as we think about how to make our software work with the hardware they have already. Wouldn't it be nice if instead of a workspace switcher you had four buttons on your keyboard that would switch for you? Wouldn't it be nice if instead of icons on a panel, minimised applications appeared somewhere obvious on your keyboard so you could press a button and make the applications appear? There wouln't be any need for an expose feature. The user would always be able to see what was still running but not visible.

Leon Brooks points to a fascinating keyboard device that includes LCD on each button. This can be controlled by the program with current focus to provide real buttons for what would normally only by "visual buttons". I think this could make the app with focus much more usable both by freeing up screen real estate for things the user actually wants to see and making buttons tangable instead of just hoping they look like something clickable. Personally I would have doubts about the durability of a keyboard like this, but if it could be made without a huge expense and programs could be designed to work with it effectively I think it could take off.

We can see this tactile approach working already with scrollwheel mice and keyboards. It is possible to get a tactile scroll bar onto either device without harming its utility and while making things much simpler to interact with. Ideally, good use of a keyboard with a wider range of functions would remove entirely the need for popup menus and of buttons on the screen. In a way this harks back to keyboard templates like those for wordperfect. I wonder if now the excitement over GUIs and mice has died down that this approach will turn out to be practical after all.

Update 18 October 2005:
United Keys has a keyboard model that is somewhat less radical in design and looks to be a little closer to market. Thanks to commenter Tobi on the Dutch site Usabilityweb for pointing it out. I like the colour LCD promised in the Optimus keyboard, but suspect that the united keys approach of segregating regular typing keys from programmable function keys will wear better. Ultimately it has to both look good and be a reasonable value proposition to attract users.

Benjamin

I've spent most of today hacking python. The target has been to exploit the possiblity of REST in a non-web programming envioronment. I chose to target a glade replacement built from REST principles.

Glade is a useful bare bones tool for constructing widget trees that can be embedded into software as generated source code or read on the fly by libglade. Python interfaces are available for libglade and gtk, but there are a few things I wanted to resolve and improve.

The event handling model, and
The XML schema

The XML schema is bleugh. Awfully verbose and not much use. It is full of elements like "widget" and "child" and "parameter" that don't mean anything, leaving the real semantic stuff as magic constant values in various attribute fields. In my prototype I've collapsed each parameter into an attribute of its parent, and made all elements either widgets or event handling objects.

The real focus of my attention, though, was the event handling. Currently gtk widgets can emit certain notifications, and it is possible to register with the objects to have your own functions called when the associated event goes off. Because a function call is the only way to interact with these signals, you immediately have to start writing code for potentially trivial applications. I wanted to reduce the complexity of this first line of event handling so much that it could be included into the glade file instead. I wanted to reduce the complexity to a set of GET, and PUT operations.

I've defined my only event handler for the prototype application with four fields:

The name of the event to use as a trigger (I use the position of the Event element to determine which object to associate the event with)
The method to use, typically PUT
The url to send the PUT to
The url to get data from as content to the PUT

To make this URL-heavy approach to interface design useful, I put in place a few schemes for internal use in the application. The "literal:" scheme just returns the percent-decoded data of its path component. It can't be PUT to. The "object:" scheme is handled by a resolver that allows objects to register with it by name. Once a parent has been identified, attributes of the parent can be burrowed into by adding path segments.

The prototype example was a simple one. I create a gtk window with a label in it. The label initially reads "Hello, world". On a mouse click, it becomes "Goodbye, world". It's not going to win any awards for oringinality, but it demonstrates a program that can be entirely defined in terms of generic URI-handling and XML loading code and an XML file that contains the program data. An abbreviated version of the XML file I used looks like this:

<gtk.Window id="window1">
        <gtk.Label id="label1" label="literal:Hello,%20world">
                <Event
                        name="button_press_event"
                        url="object:/label1/label"
                        verb="PUT"
                        data="literal:Goodbye,%20world"
                        />
        </gtk.Label>
</gtk.Window>

You can see that the literal URI schem handling is a little clunky, and needs to have some unicode issues thought out. The Event node is triggered by a button_press_event on label1, and sends the literal "Goodbye, world" to lable1's own "label" attribute. You can see some of the potential power emerging already, though. If you changed either the initial label url to data extracte from a database or from http we have immediately developed not a static snapshot in this file but a dynamic beast. The data in our event handler could refer to an object or XML file that had been built up XForms-style via other event interactions. It is possible that even quite sophisticated applications could be written in this way as combinations of queries and updates to a generic store. Only geniunely complicated interactions would need to be modelled in code, and that code would not be clouded by the needless complexity of widget event handling.

It is my theory that if REST works at all, it will work in this kind of programming environment as well as it could on the web. I'm not trying to preclude the use of other languages or methods for controlling the logic of a single window's behaviour, but I am hoping to clearly define the level of complexity you can apply through a glade-like editor. I think it needs to be higher than it is presently to be really productive, but I don't think it should become a software IDE in itself.

All in all my prototype came to 251 lines of python, including whitespace and very minimal comments. That includes code for the event handler (12 lines), a gtk object loader (54 lines, and generic python object actually), and URI resolution classes including http support through urllib2. The object resolver is the largest file, weighing in at 109 lines. It is made slightly more complex by having to deal with pygtk's use of setters and getters as well as the more generic python approaches to attribute assignment.

My example was trivial, but I do think that this kind of approach could be considerably more powerful than that of existing glade. The real test (either of the method or of gtk's design) will probably come in list and tree handling code. This will be especially true where column data comes from a dynamic source such as an sqlite database. I do anticpate some tree problems, where current even handlers often need to pass strange contexts around in order to correctly place popup windows. It may come together, though.

Benajamin

For those who read planet.gnome.org even less frequently than I do[1], Nat Friedman has spotted a very neat little desktop mockup by the name of Symphony.

For those who are into user interface design, this looks quite neat. They've done away with pop-up menus on the desktop, replacing them with a desktop that becomes the menu. I suspect this will greatly improve the ability to navigate such menus. Users won't have to hold the mouse down while they do it. They also use the corners of the desktop as buttons to choose which menu is currently in play. Very neat. I'd like to see it running in practice.

Oooh, their development environment Orchestra sounds interesting, too. I wonder how will it will support REST principles, and how it will isolate the data of one user from that of another user...

Benjamin

[1] Actually, Nat is someone I currenly have a subscription to.

According to the Gnome 2.10 "what's new":

sticky notes now stay on top other windows, so you can't lose them. To hide the notes, simply use the applet's right-click menu.

So now sticky notes have two modes:

In my way, or
Out of sight, out of mind

I do not consider this a feature. If you are using sticky notes with Gnome 2.8 or earlier, I do not recommend upgrading!

Update: I've placed a comment on the existing bug in Gnome bugzilla.

Benjamin

Sound advice - blog

Lifesigns

Subscribe

RDF

Feedback and Social Software

Support Software

Site Statistics

License

My recent bookmarks

File Tagging

The Visual Display Unit is not the User Interface

A RESTful non-web User Interface Model

Symphony Operating System

Gnome 2.10 Sticky Notes

Sound advice - blog

Lifesigns

Subscribe

RDF

Feedback and Social Software

Support Software

Site Statistics

License

My current feeds

My recent bookmarks

File Tagging

The Visual Display Unit is not the User Interface

A RESTful non-web User Interface Model

Symphony Operating System

Gnome 2.10 Sticky Notes