Sound advice - blog

Tales from the homeworld

Thu, 2005-Jun-02

User Daemons

The UNIX model of interprocess communication comes in two modes. You have your internal IPC, such as pipes and unix sockets. You have your external IPC, which pretty much boils down to IP-based sockets.

The internal model works well for some kinds of interactions, but unix sockets (the internal ones) have pretty much faded into obscurity. On the other side of the fence, IP-based sockets are now universal. So much so that IP sockets over the loopback interface are commonly used for communication that is known to be, and will only ever be, restricted to a single machine.
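
For the unfamiliar, the difference between the two modes is only a couple of lines of code. Here's a minimal sketch in Python; the socket path and port number are arbitrary choices of mine, not anyone's convention:

    import os, socket

    # Internal IPC: a unix socket is addressed by a filesystem path.
    path = "/tmp/myapp.sock"   # invented example path
    if os.path.exists(path):
        os.unlink(path)        # a previous run leaves the file behind
    unix_server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    unix_server.bind(path)
    unix_server.listen(1)

    # External IPC: an IP socket is addressed by host and port, here
    # bound to the loopback interface so it never leaves the machine.
    ip_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    ip_server.bind(("127.0.0.1", 8000))   # invented example port
    ip_server.listen(1)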

If IP sockets are so widely accepted, why aren't UNIX sockets?

I think the primary answer is the lack of a URI scheme for unix sockets. You can also add to that the unlikelihood that Microsoft will ever implement them, so coding to that model may leave you high and dry when it comes time to port to the world's most common desktop operating system.

It's the desktop that should be benefiting most from local sockets. Anything involving multiple machines must and should use IP sockets. This leaves the web as a wide-open playing field, but the desktop is still a place where communication is difficult to orchestrate.

Speaking of orchestras, how about a case study?

The Symphony desktop is being developed by Ryan Quinn, under the guidance of Jason Spisak of Lycos fame. The general concept is that the desktop is a tweaked copy of Firefox. A "localhost-only" web server serves regular mozilla-consumable content, such as Javascript-embedded HTML, as the background. You run applications over the top of the background, and Bob's your uncle. You can develop applications for the desktop as easily as you can put them together on the web.

A localhost-only web server is presumably one that just opens a port on the loopback interface, 127.0.0.1 in IPv4 terminology. This prevents intruders elsewhere on the network from attacking the web server and accessing any data it may hold... but what about other users on the same machine?
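
Here's a minimal sketch of such a server in Python. This assumes nothing about Symphony's actual implementation, and the port number is invented:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class DesktopHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Serve some stand-in "desktop background" content.
            body = b"<html><body>my desktop</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    # Binding to 127.0.0.1 rather than 0.0.0.0 keeps the port off the
    # external interfaces, so only this machine can connect.
    HTTPServer(("127.0.0.1", 8000), DesktopHandler).serve_forever()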

It turns out that other users can access this data just fine, so despite the amount of promise this approach holds it is pipped at the post for any multi-user work. Not only can other users access your data (which admittedly you could solve via even a simple cookie authentication system), other users also can't open the same port, and therefore can't run their own desktop while you are running yours.
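
Both problems are easy to demonstrate from a second account on the same machine, again assuming the invented port 8000 from the sketch above:

    import socket
    from urllib.request import urlopen

    # Any local user can fetch the first user's desktop content...
    print(urlopen("http://127.0.0.1:8000/").read())

    # ...but no second user can start their own desktop on that port.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("127.0.0.1", 8000))
    except OSError as e:
        print("bind failed:", e)   # EADDRINUSE: Address already in use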

I've been bitching about this problem of accessing local application data in various forms lately. I'm concerned that if we can't bring the best web principles back to the desktop, then desktop solutions may start migrating out onto the web. This would be a shame (and is already happening with the "I'm not CORBA, honest" SOAP-based web services). I really want to see the URI concept work without having to put one machine on the network for each active user! :)

So what are the requirements when deploying a desktop solution for inter-process communication? Here's a quick list:

  1. The URI for a particular piece of user data must be stable
  2. The URIs for different pieces of user data, and for the data of different users, must not clash
  3. The URI scheme must be widely understood
  4. A user must be able to run multiple services, not just one
  5. It should be possible to refer to local files when communicating to the service
  6. The service should run as the user, and have access to the user's data
  7. The service must not grant access to the user's data to anyone whose own permissions would not allow that access
  8. It should be possible to move data from the "private data" set to the "exposed to the internet" data set easily.

Using IP-based sockets fails on several counts. If you use a static port number for everyone the URIs clash between users. If you use dynamic port allocation the URI changes because the port number becomes part of the URI. If you pre-allocate ports for each user the users won't be able to run extra services without first consulting the super-user. If you don't preallocate them, you may find they're not available when you want them!
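
Of those failure modes, the dynamic-allocation one is worth seeing concretely: ask the kernel for port 0 and you get back whatever happens to be free, so the URI differs on every run.

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))    # port 0 asks the kernel for any free port
    port = s.getsockname()[1]

    # The URI now embeds a number nobody could have predicted, so it
    # changes on every run and can never be published in advance.
    print("http://127.0.0.1:%d/mydata" % port)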

These are the kinds of problems unix sockets were really meant to address, using files as their keys rather than IP addresses and port numbers. Unfortunately, without wide acceptance of this model and without a way to encode this access-point information into a URI that is also widely accepted we run aground just as quickly.
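
Here's what file-keyed addressing buys you, as a minimal sketch with an invented socket path. Each user binds under their own home directory, so there is no port registry and no clashing, and the ordinary permission bits do the access control:

    import os, socket

    # A per-user rendezvous point, keyed by a path I've made up.
    path = os.path.expanduser("~/.myapp/service.sock")
    os.makedirs(os.path.dirname(path), mode=0o700, exist_ok=True)
    if os.path.exists(path):
        os.unlink(path)

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    os.chmod(path, 0o600)   # only the owner may connect
    server.listen(1)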

Consider the use cases. The first is that I want to be able to access data via another application, because it understands the data and I only understand the interface to that application. MIME gets us half-way, by telling us which application to run in order to understand the data... but it is currently a user-facing concept. It tells you how to start the program, but doesn't tell you how to talk to it.

The second is the kind of use that symphony is making, where you don't know about the files and all you care about is the interface. You think of the address of this interface as the data itself. It looks like a file, but there is a dynamic application behind it.

I don't think I have any answers right now.

Update: Google desktop search uses port 4664 for its own local web server. This leads to the same issues as the Symphony desktop, with clashes between multiple users. Consider the case where Google ported the desktop search to SymphonyOS. Now you have two separate services from separate vendors that you want to connect to on different ports to prevent coupling... but in order to make them available to multiple users you have to pre-allocate two ports per user. Urggh.

On the other hand, using an IP-based socket solution rather than local sockets does allow you to move to a thin-client architecture, where the desktop and Google search are served from the server while clients just run a web browser. Perhaps the only answer is to serve the data of all users from a single port for a specific application, and to use internal mechanisms to cleanly separate the data of each user.
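
A rough sketch of that single-port arrangement, with the per-user separation carried in the path and a placeholder standing in for real session authentication (the port, paths, and cookie format are all my invention):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class MultiUserHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # One shared, well-known port; the first path component
            # names the user, e.g. /benjamin/desktop.
            user = self.path.lstrip("/").split("/", 1)[0]
            if not self.authenticated_as(user):
                self.send_error(403, "not your data")
                return
            body = ("data for user %s" % user).encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def authenticated_as(self, user):
            # Placeholder only: a real service would validate a session
            # cookie or similar credential against the requested user.
            return self.headers.get("Cookie") == "session=" + user

    HTTPServer(("0.0.0.0", 8000), MultiUserHandler).serve_forever()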

Benjamin