Sound advice - blog

Tales from the homeworld


Sun, 2005-Jun-26

How do I comment?

I don't currently support a direct commenting mechanism, but I encourage and welcome feedback. The best way to comment is to respond in your own blog, and to let me know you've done it. If you are a member of technorati I should notice your entry within a few days.

Another alternative is email, and when I want to comment on blogs that don't support comments I will usually do both. A recent example is this entry, which I first sent as an email to Rusty, then posted the content of my email into the blog. You can see my email address on the left-hand panel of my blog. You'll need to replace " at " with "@", which I hope is not too onerous.

The main reason I don't accept comments directly is history. I use static web space provided by my ISP and don't have much control over how the data is served and the kind of feedback that can be garnered. I also have a mild fear that comment spam would become a significant administration overhead, as I would have to keep up with the appropriate plugins to avoid too much hassle in this regard. Blogging is an easy and fun way to publish your thoughts, and I hope that a network of links between blogs will be as effective as a slashdot-style commenting system for the amount of interest and feedback my own blog attracts.

Thanks! :)

Benjamin

Sun, 2005-Jun-26

Redefine copying for Intellectual Property!

Rusty recently had an exchange with the Minister for Communications, Information Technology and the Arts. He is worried about the lack of online music stores in Australia, and that the big record companies may be stifling competition in ways that the law has not caught up with. Here is the email I sent to Rusty, reformatted for html.

Hello Rusty,

I wonder if a less radical suggestion from the Minister's point of view would be to try and redefine what copying means for an intellectual property distributor. Instead of "you can't make what I sell you available for download" it could essentially mean "you can't permit any more copies of what I sold you to be downloaded than you actually bought from me":

(Number of copies out (sold, or given away) = Number of copies in (bought)) -> No copying has taken place with respect to copyright law.

If this approach was applied to individuals as well as distribution companies then fair use may not need any further consideration. If there is a feeling that this can't be applied to individuals then the major hurdles, I think, would be:

  1. How do we define an IP distributor, as compared to a consumer of IP?
  2. Who is allowed to define the mechanism for measuring the equation? Is it a statutory mechanism, or do we leave it up to individual contracts?

Just a happy sunday morning muse :)
I'm sure you've already thought along these lines before.

Mon, 2005-Jun-20

A RESTful non-web User Interface Model

I've spent most of today hacking python. The target has been to explore the possibilities of REST in a non-web programming environment. I chose to target a glade replacement built from REST principles.

Glade is a useful bare bones tool for constructing widget trees that can be embedded into software as generated source code or read on the fly by libglade. Python interfaces are available for libglade and gtk, but there are a few things I wanted to resolve and improve.

  1. The event handling model, and
  2. The XML schema

The XML schema is bleugh. Awfully verbose and not much use. It is full of elements like "widget" and "child" and "parameter" that don't mean anything, leaving the real semantic stuff as magic constant values in various attribute fields. In my prototype I've collapsed each parameter into an attribute of its parent, and made all elements either widgets or event handling objects.

The real focus of my attention, though, was the event handling. Currently gtk widgets can emit certain notifications, and it is possible to register with the objects to have your own functions called when the associated event fires. Because a function call is the only way to interact with these signals, you immediately have to start writing code for potentially trivial applications. I wanted to reduce the complexity of this first line of event handling so much that it could be included in the glade file instead. I wanted to reduce it to a set of GET and PUT operations.

I've defined my only event handler for the prototype application with four fields: name, url, verb, and data.

To make this URL-heavy approach to interface design useful, I put in place a few schemes for internal use in the application. The "literal:" scheme just returns the percent-decoded data of its path component. It can't be PUT to. The "object:" scheme is handled by a resolver that allows objects to register with it by name. Once a parent has been identified, attributes of the parent can be burrowed into by adding path segments.
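A minimal sketch of how those two schemes might be implemented. The class and method names here are my own illustration, not the prototype's actual code:

```python
from urllib.parse import unquote

class LiteralScheme:
    """Resolves literal: URIs by percent-decoding the path. Read-only."""
    def get(self, path):
        return unquote(path)
    def put(self, path, value):
        raise TypeError("literal: resources cannot be PUT to")

class ObjectScheme:
    """Resolves object: URIs against objects registered by name.
    Path segments after the object name burrow into attributes."""
    def __init__(self):
        self.roots = {}
    def register(self, name, obj):
        self.roots[name] = obj
    def _walk(self, path):
        segments = path.strip("/").split("/")
        return self.roots[segments[0]], segments[1:]
    def get(self, path):
        obj, attrs = self._walk(path)
        for attr in attrs:
            obj = getattr(obj, attr)
        return obj
    def put(self, path, value):
        obj, attrs = self._walk(path)
        for attr in attrs[:-1]:
            obj = getattr(obj, attr)
        setattr(obj, attrs[-1], value)
```

A real resolver would also need the pygtk setter/getter handling mentioned below, but the burrowing idea is the same.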

The prototype example was a simple one. I create a gtk window with a label in it. The label initially reads "Hello, world". On a mouse click, it becomes "Goodbye, world". It's not going to win any awards for originality, but it demonstrates a program that can be entirely defined in terms of generic URI-handling and XML loading code and an XML file that contains the program data. An abbreviated version of the XML file I used looks like this:

<gtk.Window id="window1">
        <gtk.Label id="label1" label="literal:Hello,%20world">
                <Event
                        name="button_press_event"
                        url="object:/label1/label"
                        verb="PUT"
                        data="literal:Goodbye,%20world"
                        />
        </gtk.Label>
</gtk.Window>

You can see that the literal URI scheme handling is a little clunky, and needs to have some unicode issues thought out. The Event node is triggered by a button_press_event on label1, and sends the literal "Goodbye, world" to label1's own "label" attribute. You can see some of the potential power emerging already, though. If you changed the initial label url to data extracted from a database or over http, this file would immediately describe not a static snapshot but a dynamic beast. The data in our event handler could refer to an object or XML file that had been built up XForms-style via other event interactions. It is possible that even quite sophisticated applications could be written in this way as combinations of queries and updates to a generic store. Only genuinely complicated interactions would need to be modelled in code, and that code would not be clouded by the needless complexity of widget event handling.
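The Event element's behaviour amounts to a GET on the data URI followed by a PUT to the target URI. A hypothetical sketch (my own naming, not the prototype's code), with a resolver object standing in for the URI-handling machinery:

```python
class Event:
    """When the named gtk signal fires on the attached widget,
    GET the data URI and PUT the result to the target URI."""
    def __init__(self, resolver, name, url, verb, data):
        self.resolver = resolver
        self.name = name    # gtk signal name, e.g. "button_press_event"
        self.url = url      # target resource, e.g. "object:/label1/label"
        self.verb = verb    # "PUT" in the example above
        self.data = data    # source resource, e.g. "literal:Goodbye,%20world"

    def connect(self, widget):
        # Assumes a pygtk-style widget with a connect() method.
        widget.connect(self.name, self.fire)

    def fire(self, *ignored_signal_args):
        value = self.resolver.get(self.data)    # GET the source
        if self.verb == "PUT":
            self.resolver.put(self.url, value)  # PUT it to the target
```

Twelve-odd lines of logic, which matches the flavour of the line counts given below.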

It is my theory that if REST works at all, it will work in this kind of programming environment as well as it could on the web. I'm not trying to preclude the use of other languages or methods for controlling the logic of a single window's behaviour, but I am hoping to clearly define the level of complexity you can apply through a glade-like editor. I think it needs to be higher than it is presently to be really productive, but I don't think it should become a software IDE in itself.

All in all my prototype came to 251 lines of python, including whitespace and very minimal comments. That includes code for the event handler (12 lines), a gtk object loader (54 lines, and actually a generic python object loader), and URI resolution classes including http support through urllib2. The object resolver is the largest file, weighing in at 109 lines. It is made slightly more complex by having to deal with pygtk's use of setters and getters as well as the more generic python approaches to attribute assignment.

My example was trivial, but I do think that this kind of approach could be considerably more powerful than that of existing glade. The real test (either of the method or of gtk's design) will probably come in list and tree handling code. This will be especially true where column data comes from a dynamic source such as an sqlite database. I do anticipate some tree problems, where current event handlers often need to pass strange contexts around in order to correctly place popup windows. It may come together, though.

Benjamin

Mon, 2005-Jun-13

Protocols over libraries

As I've been moving through my web pilgrimage, I've come to have some definite thoughts about the use of libraries as opposed to the use of protocols. The classic API in C, or any other language provides a convenient language-friendly way of interacting with certain functionality. A protocol provides a convenient language-neutral way of delegating function to someone else. I've come to prefer the use of a protocol over a library when implementing new functionality.

Let's take a case I've been discussing recently: Name resolution.

Name resolution (mapping a name to an IP address, and possibly other information) has traditionally been handled by a combination of two methods: /etc/hosts provided a static list of hosts that acted as a bootstrap before relevant DNS services might be available. For various reasons people on smaller networks found /etc/hosts easier to manage than a bind configuration, so /etc/hosts grew.

As it grew, people found new problems and solutions.

Problem: We need this /etc/hosts file to be consistent everywhere.
Solution: Distribute it using proprietary mechanisms, NIS, or LDAP.

Problem: Our applications don't know how to talk to those services.
Solution: We'll add new functionality to our resolution libraries so every application can talk to them using the old API.

Problem: We don't know what this library should try first.
Solution: We'll create a file called /etc/nsswitch.conf. We'll use it to tell each application through our library whether to look first at the static files, or at the DNS, or at NIS, whatever.

So now there's a library that orchestrates this behaviour, and it works consistently so long as you can link against C. You can implement it natively in your own language if you like, but you had damn well better track the behaviour of the C library on each platform.

Another way to solve these problems might be:

Problem: We need this /etc/hosts file to be consistent everywhere.
Solution: We'll take a step back here. Let's write a new application that is as easy to configure as /etc/hosts. Maybe it reads /etc/hosts as its configuration.

Problem: Our applications don't know how to talk to those services.
Solution: We'll change our library to talk to our new application via a well-defined protocol. We might as well use the DNS protocol, as it is already dominant for this purpose. As we introduce new ways of deploying our configuration we change only our application.

Problem: We don't know what this library should try first.
Solution: Still create a file called /etc/nsswitch.conf. Just have the application (let's call it localDNS for fun) read that file. Don't make every program installed use the file. Just make sure they speak the protocol.
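The dispatch localDNS would perform is simple enough to sketch. The backends mapping here is made up for illustration; real backends would be the static-file reader, a DNS forwarder, NIS, and so on:

```python
def parse_hosts_order(nsswitch_text):
    """Return the lookup order for the 'hosts' database from
    nsswitch.conf-style text, e.g. ['files', 'dns']."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            return line.split(":", 1)[1].split()
    return ["files", "dns"]  # a common default

def resolve(name, order, backends):
    """Try each configured source in turn. Only localDNS does this;
    individual programs just speak the DNS protocol to localDNS."""
    for source in order:
        address = backends[source](name)
        if address is not None:
            return address
    return None
```

The point of the exercise is that this loop lives in exactly one process, instead of being re-implemented behind every program's resolution library.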

Because we have a clear language-neutral protocol we can vary the implementation of both client and server gratuitously. We can still have our library, but our library talks a protocol rather than implementing the functions of an application. Because this is a simple matter, we can produce native implementations in the various languages we use and craft the API according to the conventions of each language. We don't have to track the behaviour of our specific installed platform because we know they'll all speak the standard protocol.

localDNS should consume the existing name resolution services. The name resolution library should drop all non-DNS features, including looking up /etc/hosts and caching and the like.

Running extra processes to implement functionality can have both beneficial and deleterious effects. On the positive side, it can make individual applications smaller and their code easier to replace. It can act as a thread of execution alongside the many processes it interacts with, running simultaneously with them on multiple-CPU hardware. It can be deployed and managed individually. It's a separate "Configuration Item". On the other hand, extra infrastructure needs to be put in place to make sure it is always available when required, and the process itself can become a bottleneck to performance if poorly implemented.

Benjamin

Sun, 2005-Jun-12

URI Ownership

A URI is a name controlled by an individual or organisation. They get to define a wide sweep of possible constructions and are allowed to map those constructions onto meanings. One meaning is that of the URL lookup mechanisms. Current technology allows a URI owner to vary the IP address that http connections are made to via DNS. DNS also has the capability to specify the port to connect to for a specific service name. It does not specify the protocol to use. That is fixed as part of the URI scheme.

This entry is my gripe list about the SRV records that permit ports to be decided dynamically by the URI owner rather than being encoded statically into the URI itself. I was introduced to this capability by Peter Hardy who was kind enough to respond to some of my earlier posts. Since then I've done some independent research I'd like to put onto the record for anyone still tuned in :)

SRV records were apparently first specified in rfc 2052 in October 1996, later updated by rfc 2782 in February 2000. Their introduction was driven by the idea that the stability and fault tolerance of the SMTP system could be applied to all protocols by enhancing MX records to deal with services more generically. Perhaps as a side-effect, or as a sweetener for content providers, the capability to specify ports other than the one normally allocated for a protocol was included. SRV promised to allow providers to move functionality between physical machines more easily, and to handle load balancing and redundancy issues consistently.

Fast forward to the year 2005, and SRV records are still struggling to find acceptance. Despite DNS server support, most client applications aren't coming on board[1]. Despite a bug being raised back in September 1999, Mozilla still does not support SRV. It would seem that content providers have little incentive to create SRV records for existing protocols. Big content providers don't need to have people connect via http on ports other than 80, and would find it impractical if they did due to corporate firewalling rules. They aren't concerned about problems in moving functionality between hosts, about redundancy, or about load balancing via DNS. They have their own solutions already, and even if clients started supporting SRV records they would have to hold on to their old "A" records for compatibility. With content providers unlikely to provide the records, providers of client software seem unwilling to put effort into the feature either.

The client software question is an interesting and complex one. For starters, the classic name resolution interfaces are no good for SRV. The old gethostbyname(3) function does nothing with ports, and even the newer getaddrinfo(3) function typically doesn't support SRV, although the netbsd guys apparently believe it is appropriate to include SRV in this API. Nevertheless, there is rfc-generated confusion even in pro-SRV circles about when and how it should be used.

To add a little more confusion, we have lists of valid services and protocols for SRV that associate concepts of service and content type instead of service and protocol, separating http for html from http for xul. If you start down that track you might as well give up on REST entirely :)

So what is SRV good for? The big end of town seems to be faring well without and the small end of town (small home and corporate networks) often don't use DNS at all, preferring simple /etc/hosts and /etc/services file constructions distributed via NIS, LDAP, or a proprietary or manual method.

So... I guess I should put together a list of action items. In order to support resolution of port as well as IP as part of URL handling we need to

  1. Use an API that looks like getaddrinfo(3) consistently across our applications and protocols. It must include both domain name and service name
  2. Make sure we use service names that exactly match our URI scheme, eg http and ftp. Don't get into specifying content. That's not the role of this mechanism.
  3. Add support to getaddrinfo for SRV records
  4. Specify the use of SRV records as preferred for all protocols :) Don't wait for an update of the HTTP rfc!
  5. Add support to getaddrinfo for an alternative to our current /etc/hosts and /etc/services files, or an augmentation of them. This alternative must be in the form of a file itself and must be easily distributed via the same means
  6. Perhaps also add support for per-user resolution files
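For reference, the target-selection step rfc 2782 specifies (lowest priority first, then weighted random choice within that priority) is small enough to sketch. Records here are (priority, weight, port, target) tuples, and the rand parameter exists only to make the selection testable:

```python
import random

def select_srv_target(records, rand=random.random):
    """Pick one SRV record per rfc 2782: only the lowest-priority group
    is considered, and within it targets are chosen at random in
    proportion to their weight."""
    lowest = min(priority for priority, weight, port, target in records)
    candidates = [r for r in records if r[0] == lowest]
    total = sum(weight for priority, weight, port, target in candidates)
    if total == 0:
        return candidates[0]  # all weights zero: any order will do
    threshold = rand() * total
    running = 0
    for record in candidates:
        running += record[1]  # accumulate weight
        if running >= threshold:
            return record
    return candidates[-1]
```

This is the piece each client application would have to grow, which is exactly why a shared getaddrinfo-level implementation matters.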

Interestingly, DNS already has mechanisms to allow dynamic update to its records. If it were to be used, an application started as a user could update part of a selected zone to announce its presence. There would definitely be some security implications, though. Unlike the typical network situation where a whole machine can be assumed to be controlled by a single individual, ports on localhost could be opened by malicious peer individuals. On seeing that a particular user has port 1337 in use, the attacker may open that port after the user logs out in the hope that the next login will trigger the user to access the same port. The trusted program may not be able to update resolution records quickly enough to prevent client applications from connecting to this apparently-valid port. As well as clients being validated to servers, servers must be conclusively validated to clients. This may require a cookie system different to the traditional one-way web cookies.

Back on the subject of resolution, it may be possible to set up a small DNS server on each host that was used in the default resolution process. It could support forwarding to other sources and serving and update of information relevant to the local host's services. It need not listen on external network ports, so would not be a massive security hole... but convincing all possible users to run such a service in a mode that allows ad hoc service starting and registration may still be a stretch. They may already have their own DNS setups to contend with, or may simply trust /etc/hosts more.

Benjamin

[1] Except for Microsoft products, strangely...

Sun, 2005-Jun-05

Naming Services for Ports

I've been going on ad nauseum about the handling of ad hoc user daemons for a few days now. I've been somewhat under the weather, even to the point of missing some of both work and HUMBUG, so some of it may just be the dreaded lurgie talking. On the premise that the line between genius and madness is thin and the line between madness and fever even thinner, I'll pen a few more words.

So I'm looking for a stable URI scheme for getting to servers on the local machine that are started by users. I thought a little about the use of UNIX sockets, but they're not widely enough implemented to gain traction and don't scale to thin-client architectures. It seems that IP is still the way forward, and because we're talking REST, that's HTTP/TCP/IP.

Now the question turns to the more practical "how?". We have a good abstraction for IP addresses that allows us to use a static URI even though we're talking to multiple machines. We call it DNS, or perhaps NIS or LDAP. Maybe we call it /etc/hosts, and perhaps we even call it gethostbyname(3) or some more recent incarnation of that API. These models allow us to keep the http://example.com/ URI, even if example.com decides to move from one IP address to another. It even works if multiple machines are sharing the load of handling this URI through various clever hacks. It scales to handling multiple machines per service nicely, but we're still left with this problem of handling multiple services per machine. You see, when we talk about http://example.com/ we're really referring to http://example.com:80/.

There's another way to refer to that URI, which may or may not be available on your particular platform setup. It's http://example.com:http/. It's obviously not something you want to type all of the time and is a little on the weird side with its multiple invocation of http. On the other hand, it might allow you to vary the port number associated with http without changing our URI. Because we don't change the URI we can gain benefits for long-term hyperlinking as well as short-term caching mechanisms. We just edit /etc/services on every client machine, and set the port number to 81 instead.
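The named-port idea leans on exactly this kind of lookup. The C library exposes it as getservbyname(3); a simplified parser for the /etc/services file format (lines like "http 80/tcp www") might look like this:

```python
def parse_services(text):
    """Parse /etc/services-style text into a {(name, protocol): port}
    map, including aliases listed after the port/protocol column."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # strip comments
        if len(fields) < 2:
            continue
        name, port_proto, aliases = fields[0], fields[1], fields[2:]
        port, proto = port_proto.split("/")
        for service_name in [name] + aliases:
            table[(service_name, proto)] = int(port)
    return table
```

Changing the file changes what every name-using client connects to, which is the whole appeal; the problem, as below, is distributing that change.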

Hrrm... slight problem there, obviously. Although DNS allows the meaning of URIs that contain domain names to be defined by the URI owner, port naming is handled in a less dynamic manner. With DNS, the owner can trust clients to discover the new meaning of the URI when it comes time to actually retrieve the data. Clients bear extra expenses for doing so, but it is worth the benefit.

Let's assume that we'll be able to overcome this client discovery problem for the moment, and move over to the other side of the bridge. You have a process that gets started as part of your login operation to serve your own data to you via a more convenient interface than the underlying flat files would provide. Maybe you have a service that answers SQL queries for your sqlite files. You want to be able to enter http://localhost:myname.sqlite/home/myname/mydatabase?SELECT%20*%20FROM%20Foo into your web browser and get your results back, perhaps in an XML format. Maybe you yourself don't want to, but a program you use does. It doesn't want to link against the sqlite libraries itself, so it takes the distributed application approach and replaces an API with a protocol. Now it doesn't need to be upgraded when you change from v2.x to v3.x of sqlite. Put support in for http://localhost:postgres/*, and http://localhost:mysql/* and you're half-way towards never having to link to a database library again. So we have to start this application (or a stand-in for it) at start-up. What happens next?

This is the divide I want to cross. Your application opens ports on the physical interfaces you ask it to, and implicitly names them. The trick is to associate these names with something a client can look up. On the local machine you can edit /etc/services directly, so producing an API a program can use to announce its existence in /etc/services might be a good way to start the ball rolling. Huhh. I just noticed something. When I went to check that full stop (.) characters were permitted in URI port names I found they weren't. In fact, I found that rfc3986 and the older rfc2396 omit the possibility of not only the full stop, but any non-digit character. Oh, bugger. I thought I might have actually been onto something... and if I type a port name into Firefox 1.0.4 it happily chops that part of the URL for me.
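The grammar point is easy to check mechanically. Per rfc3986 the port subcomponent of the authority is *DIGIT, so any named port is simply not a legal URI:

```python
import re

# rfc3986: port = *DIGIT (zero or more digits, nothing else).
PORT_RULE = re.compile(r"[0-9]*\Z")

def is_valid_uri_port(port):
    """True if the given port string satisfies the rfc3986 grammar."""
    return bool(PORT_RULE.match(port))
```

So http://localhost:myname.sqlite/... fails at the parser before any resolution scheme even gets a look in.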

Well, what I would have said if I hadn't come across that wonderful piece of news is that once you provide that API you can access your varied HTTP-protocol services on their dynamically allocated port numbers with a static URI because the URI only refers to names, not actual numbers. That would have been a leading off point into how you might share this port ownership information in the small-(virtual)-network and thin-client cases where it would matter.

That's a blow :-/

So it seems we can't virtualise port identification in a compliant URI scheme. Now that rfc3986 has come out you can't even come up with alternative authority components of the URI. DNS and explicit ports are all that are permitted, although rfc2396 allows for alternate naming authorities to that of DNS. The only way to virtualise would be to use some form of port forwarding, in which case we're back to the case of pre-allocating the little buggers for their purposes and making sure they're available for use by the chosen user's specific invocation.

Well, root owns ports less than 1024 under unix. Maybe it's time to allocate other port spaces to specific users. The user could keep a registry of the ports themselves, and would just have to live with the ugliness of seeing magic numbers in the URIs all of the time. It's that, or become resigned to all services that accept connections defaulting to a root-owned daemon mode that performs a setuid(2) after forking to handle the requests of a specific user. Modelling after ssh shouldn't be all that detrimental, although while ssh sessions are long-lived, http sessions are typically short. The best performance will always be gained by avoiding the bottleneck and allowing a long-lived server process to handle the entire conversation with its client processes. Starting the process when a connection is received is asking for a hit, just as using an intermediate process to pass data from one process to another will hurt things.

Another alternative would be to try and make these "splicing" processes more efficient. Perhaps a system call could save the day. Consider the case of processes A, B, and C. A connects to B, and B determines that A wants to talk to C. It could push bytes between the two ad nauseum, or it could tell the kernel that bytes from the file descriptor associated with A should be sent directly to the file descriptor associated with C. No extra context switches would be required, and in theory B's interposition would end up costing no further performance hit.

Maybe a simple way of passing file descriptors between processes would be an easy solution to this kind of problem. I seem to recall a mechanism to do this in UNIX Network Programming, but that is currently at work and I am currently at home. Passing the file descriptor associated with A between B and C as required could reduce the bottleneck effect of B.
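The mechanism from UNIX Network Programming is sendmsg(2) with SCM_RIGHTS ancillary data over a unix socket. A sketch, assuming a modern python where the socket module wraps that plumbing as send_fds/recv_fds (the helper names are mine):

```python
import os
import socket

def send_fd(channel, fd):
    """Hand an open file descriptor to the process on the other end of
    a unix-domain socket, using the SCM_RIGHTS mechanism. At least one
    byte of ordinary data must accompany the descriptor."""
    socket.send_fds(channel, [b"x"], [fd])

def recv_fd(channel):
    """Receive a file descriptor passed with send_fd."""
    _data, fds, _flags, _address = socket.recv_fds(channel, 1, 1)
    return fds[0]
```

With this, B hands A's descriptor straight to C and steps out of the data path entirely.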

Oh well, I'm fairly disillusioned as I tend to be at the end of my more ponderous blog entries.

Benjamin

Thu, 2005-Jun-02

User Daemons

The UNIX model of interprocess communication comes in two modes. You have your internal IPC, such as pipes and unix sockets. You have your external IPC, which pretty-much boils down to IP-based sockets.

The internal model works well for some kinds of interactions, but unix sockets (the internal ones) have pretty much faded into obscurity. On the other side of the fence IP-based sockets are now universal. So much so, that IP sockets over the loopback interface are commonly used for what is known to be and what will only ever be communication restricted to the single machine.

If IP sockets are so widely accepted, why aren't UNIX sockets?

I think the primary answer is the lack of a URI scheme for unix sockets. You can also add to that the unlikelihood that Microsoft will ever implement them, so coding to that model may leave you high and dry when it comes time to port to the world's most common desktop operating system.

It's the desktop that should be benefiting most from local sockets. Anything involving multiple machines must and should use IP sockets. This leaves the web as a wide open playing field, but the desktop is still a place where communication is difficult to orchestrate.

Speaking of orchestras, how about a case study?

The symphony desktop is being developed by Ryan Quinn, under the guidance of Jason Spisak of Lycos fame. The general concept is that the desktop is a tweaked copy of Firefox. A "localhost-only" web server serves regular mozilla-consumable content such as Javascript-embedded HTML as the background. You run applications over the top of the background, and bob's your uncle. You can develop applications for the desktop as easily as you can put them together on the web.

A localhost-only web server is presumably one that just opens a port on the loopback interface, 127.0.0.1 in IPv4 terminology. This prevents intruders attacking the web server accessing any data the web server may hold... but what about other users?

It turns out that other users can access this data just fine, so despite the amount of promise this approach holds it is tipped at the post for any multi-user work. Not only can other users access your data (which admittedly you could solve via even a simple cookie authentication system), other users can't open the same port and therefore can't run their own desktop while you are running yours.
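The clash is easy to demonstrate: whichever user binds the loopback port first wins, and a second bind of the same port fails outright. A sketch:

```python
import socket

def bind_loopback(port=0):
    """Open a localhost-only listener. Port 0 asks the kernel for any
    free port, which keeps users from clashing but makes the resulting
    URI unstable, since the port number is part of the URI."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", port))
    listener.listen(1)
    return listener
```

Note that binding to 127.0.0.1 only keeps remote attackers out; every local user can still connect to the port, which is the first half of the problem above.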

I've been bitching about this problem of accessing local application data in various forms lately. I'm concerned that if we can't bring the best web principles back to the desktop, then desktop solutions may start migrating out onto the web. This would be a shame (and is already happening with the "I'm not corba, honest" SOAP-based web services). I really want to see the URI concept work without having to put one machine on the network for each active user! :)

So what are the requirements when deploying a desktop solution for inter-process communication? Here's a quick list:

  1. The URI for a particular piece of user data must be stable
  2. The URIs for different pieces of user data, and for the data of different users, must not clash
  3. The URI scheme must be widely understood
  4. A user must be able to run multiple services, not just one
  5. It should be possible to refer to local files when communicating to the service
  6. The service should run as the user, and have access to the user's data
  7. The service must not grant access to the user's data when their own permissions would not allow access
  8. It should be possible to move data from the "private data" set to the "exposed to the internet" data set easily.

Using IP-based sockets fails on several counts. If you use a static port number for everyone the URIs clash between users. If you use dynamic port allocation the URI changes because the port number becomes part of the URI. If you pre-allocate ports for each user the users won't be able to run extra services without first consulting the super-user. If you don't preallocate them, you may find they're not available when you want them!

These are the kinds of problems unix sockets were really meant to address, using files as their keys rather than IP and port numbers. Unfortunately, without wide acceptance of this model and without a way to encode this access-point information into a URI that is also widely accepted we run aground just as quickly.
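For contrast, here is what the unix-socket version of the per-user rendezvous looks like. The path layout is purely illustrative; the point is that file permissions, not port numbers, keep users' services separate:

```python
import os
import socket
import tempfile

def user_socket_path(service):
    """A per-user rendezvous path for a named service. Illustrative
    layout only: one directory per (service, uid) pair."""
    return os.path.join(tempfile.gettempdir(),
                        "%s-%d" % (service, os.getuid()), "socket")

def serve(path):
    """Listen on a unix socket at the given path. The 0o700 directory
    mode is what stops other users connecting, satisfying requirement 7."""
    os.makedirs(os.path.dirname(path), mode=0o700, exist_ok=True)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    return server
```

Requirements 1, 2, 4, 6 and 7 above fall out naturally; it is 3 (a widely understood URI scheme) and the thin-client cases that this model fails.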

Consider the use cases. The first is that I want to be able to access data via another application, because it understands the data and I only understand the interface to that application. MIME gets us half-way, by telling us which application to run in order to understand the data... but it is currently a user-facing concept. It tells you how to start the program, but doesn't tell you how to talk to it.

The second is the kind of use that symphony is making, where you don't know about the files and all you care about is the interface. You think of the address of this interface as the data itself. It looks like a file, but there is a dynamic application behind it.

I don't think I have any answers right now.

Update: Google desktop search uses port 4664 for its own local web server. This leads to the same issues as the symphony desktop, with clashes between multiple users. Consider the case where Google ported the desktop search to symphonyos. Now you have two separate services from separate vendors that you want to connect to on different ports to prevent coupling... but in order to make them available to multiple users you have to pre-allocate two ports per user. Urggh.

On the other hand, using an IP-based socket solution rather than local sockets does allow you to move to a thin-client architecture where the desktop and Google search are served on the server while clients just run a web browser. Perhaps the only answer is to serve the data of all users from a single port for a specific application, using internal mechanisms to cleanly separate the data of each user.

Benjamin