Sound advice - blog

Tales from the homeworld

Sat, 2005-Aug-27

Service Discovery Using Zeroconf Techniques

I first became aware of Zeroconf some years back as a system for assigning IP addresses in the absence of any centralised coordination such as a DHCP server. It seemed designed for small networks, probably private ones. I already had my own setups in place to cover this kind of configuration, so my interest was minimal. I've just stumbled across Zeroconf again, this time in the form of Avahi, a userland interface to Zeroconf features. Even then, I didn't pay much attention until I hit Lennart Poettering's announcement that GnomeMeeting has been ported to Avahi. The mandatory screenshot shook me out of my assumption that I was just looking at a facility for assigning IP addresses. The most important feature of Avahi is service discovery, and all using our old friend the SRV record combined with Multicast DNS (mDNS).
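
As a small aside, here is a minimal sketch of what service discovery over mDNS looks like from a client today, using the third-party python-zeroconf library rather than Avahi itself; the service type is the standard DNS-SD one for HTTP servers, everything else is illustrative.

    # Sketch: browsing for HTTP services over mDNS with the third-party
    # python-zeroconf library (not Avahi itself).
    import time
    from zeroconf import ServiceBrowser, Zeroconf

    class Listener:
        def add_service(self, zc, service_type, name):
            info = zc.get_service_info(service_type, name)
            if info:
                print(f"found {name} on {info.server}:{info.port}")

        def remove_service(self, zc, service_type, name):
            print(f"lost {name}")

        def update_service(self, zc, service_type, name):
            pass

    zc = Zeroconf()
    browser = ServiceBrowser(zc, "_http._tcp.local.", Listener())
    try:
        time.sleep(10)   # browse for a little while
    finally:
        zc.close()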

Well, it's nice to know that other people out there are trying to solve the same problems as you are. It's even nicer to hear that they're using standard tools. The DNS Service Discovery page contains a bunch of useful information and draft proposals that try to solve just the problems I have been thinking about in the service discovery sphere.

For the kind of infrastructure I work with, the main requirement of service discovery is that I can map a name not just to an IP address but to a port number. That's solved by SRV records, so long as standard clients and libraries actually use them. The main architectural constraint I work with is a requirement for fast failover once a particular host or service is determined to have failed. A backup instance must quickly be identified and all clients must change over to the backup as soon as possible, without waiting for their own timeouts. This seems to be partially addressed by pure mDNS, as changes to the DNS records are propagated to clients immediately via a multicast UDP message. Unfortunately such messages are unreliable, so anywhere from none of the clients to all of them may miss hearing about the update. Polling is required to fill in the gaps in this instance. Alternatively, the DNS-SD page points to an internet-draft that specifies a kind of DNS subscription over unicast. This approach parallels my own suggestions about how to add subscription to the HTTP protocol. In fact, it would be vitally important to be able to detect server failure efficiently whenever HTTP subscription was in operation and a backup available. If no such mechanism were in place, any HTTP subscription could silently stop reporting updates with no one the wiser that their data was stale.
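
To make the name-to-host-and-port mapping concrete, here is a minimal sketch using the third-party dnspython library; the service name is a placeholder, and a real client would still need the failover behaviour discussed below.

    # Sketch: resolving an SRV record to (host, port) pairs with dnspython (2.x).
    # _http._tcp.example.com is a placeholder service name.
    import dns.resolver

    def lookup_srv(service="_http._tcp.example.com"):
        answers = dns.resolver.resolve(service, "SRV")
        # Each SRV record carries a priority, weight, port and target host.
        records = sorted(answers, key=lambda r: (r.priority, -r.weight))
        return [(str(r.target).rstrip("."), r.port) for r in records]

    for host, port in lookup_srv():
        print(host, port)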

The parallels between DNS and HTTP are interesting. Both are based on fetching data from a remote host through a series of caches. Both have evolved over the years from simple request/response pairs towards long-lived connections and pipelining of requests and responses. I hope that this new DNS-subscribe mechanism gets off the ground and can clear the way for a later HTTP subscription mechanism based on the same sort of approach. Another hope of mine is that the Avahi project is successful enough to force software developers to make service discovery an issue or be trampled in the herd. The DNS work in this area is mostly pretty good and certainly important to people like me. The server side supports SRV, but the clients are still lagging behind. Additionally, interfaces like getaddrinfo that ought to be able to make use of SRV records but do not yet should be updated where possible. A file should be added to /etc (perhaps /etc/srv) as part of the name resolution process to support non-DNS use of these facilities without changing code.
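
To make that idea concrete, a file along these lines is what I have in mind; the layout is entirely hypothetical, simply mirroring SRV record fields the way /etc/hosts mirrors A records:

    # hypothetical /etc/srv: service name, priority, weight, port, target host
    _http._tcp.example.com    10  60  8080  alpha.example.com
    _http._tcp.example.com    10  40  8080  beta.example.com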

I feel that one of the reasons that SRV has not been widely adopted is the complexity not in the initial lookup, but in the subsequent behaviour required after a failed attempt to contact your selected server. Clients are required to try each entry in turn until all fail or they find one that works. In terms of pure software design, it can be hard to propagate an error that occurs while trying to connect to a service back to the code that performed the name lookup. You have to store state somewhere about which entries you've tried so far. That's tricky, especially when your code dealing with DNS and service lookup is already horribly complicated. I don't really know whether this kind of dynamic update for DNS would make things better or worse. In one sense things could be more complicated, because once DNS says that a service has failed you might want to stop making requests to it. On the other hand, if DNS can tell you with reasonable accuracy whether a particular instance of the service is alive or dead, you might be able to do without the state that SRV otherwise requires. You could work in a cycle (sketched in code after the list):

  1. Look up the name. Get a non-failed instance based on SRV weightings.
  2. Look up the service. If this fails, either report an error or go back to (1) a fixed number of times.
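
A rough sketch of that cycle, assuming a hypothetical lookup_instances() helper that returns whatever DNS currently says are the non-failed instances along with their SRV weights (nothing here is a real API):

    # Sketch of the lookup/contact cycle described above.
    # lookup_instances() is a hypothetical helper returning [(host, port, weight), ...]
    # for the instances DNS currently reports as alive.
    import random
    import socket

    MAX_ATTEMPTS = 3  # "a fixed number of times"

    def pick_instance(instances):
        # Weighted random choice, in the spirit of SRV weightings.
        total = sum(weight for _, _, weight in instances)
        threshold = random.uniform(0, total)
        for host, port, weight in instances:
            threshold -= weight
            if threshold <= 0:
                return host, port
        return instances[-1][:2]

    def call_service(lookup_instances):
        for _ in range(MAX_ATTEMPTS):
            instances = lookup_instances()        # step 1: look up the name
            if not instances:
                break
            host, port = pick_instance(instances)
            try:                                  # step 2: contact the chosen instance
                return socket.create_connection((host, port), timeout=2)
            except OSError:
                continue                          # go back to step 1
        raise RuntimeError("no live instance found")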

There is obviously a race condition in this approach. If the service fails after the initial lookup, or the notification of failure doesn't propagate until after the lookup occurs, then looking up the service will fail. Alternatively, a service instance might be restored but still appear to be failed, and thus no one will try to connect to it. Additionally, these constant failed and not-failed indications propagating across the internet could have detrimental effects on bandwidth. On the other hand, they may prevent more expensive HTTP requests that would otherwise take place. Also, this is not the approach you would take with subscription. Preferably, you would keep constant watch over whether your selected service instance was still within the living set, and whenever it ceased to be so, drop your subscriptions and select a new instance of the service. This may add complexity back into the chain, but I still think that relying only on DNS to determine who you talk to, rather than on a combination of DNS and the success or failure of attempts to actually reach a service instance, would make things simpler overall.
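
For the subscription case, the shape would be closer to this sketch, where live_instances() and the subscribe() callback are again hypothetical placeholders rather than real APIs:

    # Sketch: re-selecting a service instance whenever the current one leaves the living set.
    # live_instances() and subscribe() are hypothetical placeholders.
    import time

    def watch(live_instances, subscribe, poll_interval=1.0):
        current = None
        subscription = None
        while True:
            living = set(live_instances())            # ideally pushed via mDNS/DNS-SD updates
            if current not in living:
                if subscription is not None:
                    subscription.drop()               # drop subscriptions to the dead instance
                current = next(iter(living), None)    # pick a new instance, if any
                subscription = subscribe(current) if current is not None else None
            time.sleep(poll_interval)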

I think this would be an interesting area for further research. I have good feelings about how this would work on a small network with multiple services per host and with multiple instances of your services spread around the network. There may also be benefits at the internet level. Chances are that unless there are internet-scale advantages we'll continue to see existing client libraries and programs failing to really engage with the idea.

Benjamin

Sat, 2005-Aug-27

HTTP Function Calls

I've been delving pretty deeply into HTTP lately, implementing such things in a commercial environment mostly from scratch. Interfacing to existing systems with a REST leaning is interesting, and when you're viewing things from the low levels, without reference to SOAP and the other things that have come since HTTP, you get to see what HTTP is capable of out of the box. In particular, I think you get to see that even if you aren't a RESTafarian, SOAP is probably the wrong way to approach applying functions to HTTP-accessible objects.

Standard methods

HTTP is built around some standard methods, and a subset of these are the classic methods of REST. REST proponents usually try to confine themselves to GET, PUT, DELETE, and POST. rfc2616 specifies OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, and CONNECT. Different versions of the specification have different method lists. rfc1945, covering HTTP/1.0, only includes the GET, HEAD, and POST methods. The earlier HTTP/1.1 specification, rfc2068, didn't include CONNECT but did include descriptions of additional PATCH, LINK, and UNLINK methods. WebDAV adds PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, and UNLOCK. REST is really based not around the four traditional verbs but around the total set of methods that are commonly understood. The use of any method that is standardised and agreed upon is RESTful. If we go back to the Fielding dissertation we read the single sentence that sums up REST:

REST enables intermediate processing by constraining messages to be self-descriptive: interaction is stateless between requests, standard methods and media types are used to indicate semantics and exchange information, and responses explicitly indicate cacheability.

So long as we use standard methods we are being RESTful, and standard methods are any that have been agreed upon. Methods that have been agreed upon can be processed by caches in standard ways, and making caches work is the most significant architectural constraint that goes into defining the REST approach.

In fact, extensions can be added arbitrarily to HTTP. Any method not understood by a proxy along the way causes the proxy to mark the affected resource as dirty in its cache but otherwise to pass the request on towards an origin server. I think the simplest way to map function calls onto HTTP resources is to use the function name (or a variant of it) as the HTTP method rather than including it in the HTTP body. This is made a little tricky by two factors. The first is that if anyone ever does come along and define a meaning for your method, then caches might try to implement that meaning. If you're lucky they'll still only treat it as a cache-clearing operation and a pass-through. On the other hand, you might not be lucky. Also, new clients might come along expecting your method to behave according to the new standard semantics and cause further problems. Methods are effectively short strings with universal meaning. Dan Connolly has this to say:

There is a time and a place for just using short strings, but since short strings are scarce resources shared by the global community, fair and open processes should be used to manage them.

So to make the best of it we shouldn't be using methods that might clash with future meanings. I suggest that using a namespace in the HTTP request method would eliminate the possibility of future clashes and make the HTTP method something more than the wrapper for a function call. It can be the function identifier.

URIs as methods, giving up "short strings"

The second issue arises from the first: exactly how do we do this? Well, according to rfc2616 any extension method just has to be a token. A token is defined as any sequence of characters excluding control characters and the separators ( ) < > @ , ; : \ " / [ ] ? = { }, space, and horizontal tab. This rules out a normal URI as a method. The URI would at least need to be able to specify slash (/) characters to separate authority and path. I suggest a simple dotted notation similar to that of Java package namespaces would be appropriate. Now, we are no longer being RESTful by heading down this path. We aren't using standard methods anymore. On the other hand, perhaps it is beneficial to be able to mix the two approaches every once in a while. Perhaps it would help us to stop overloading the POST method so badly to make things that don't quite fit the REST shape work.
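
The token rule is easy to check mechanically, as in this small sketch; the regular expression below simply excludes the rfc2616 separators, control characters, and non-ASCII octets:

    # Sketch: testing whether a string is a valid rfc2616 token, and so usable as a method.
    import re

    # Exclude control characters, non-ASCII octets, separators, space and tab.
    TOKEN_RE = re.compile(r'^[^\x00-\x1f\x7f-\xff()<>@,;:\\"/\[\]?={} \t]+$')

    for candidate in ("GET", "ReloadConfiguration.example.com", "http://example.com/method"):
        print(candidate, bool(TOKEN_RE.match(candidate)))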

Practical examples, and the spectrum of standardisation

Your new method might be called ReloadConfiguration.example.com. It might even be tied to a specific class, and be called ReserveForExecution.control.example.com. If your method became something that could be standardised, it might eventually be given a short name and be included in a future RFC, at which time the RFC might say that for compatibility with older implementations your original dotted name should be treated as being identical to the short name.
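
To show what invoking such a method could look like from a client, here is a sketch using Python's standard http.client; the host, resource path, and body are made-up examples, not a real service.

    # Sketch: invoking a namespaced extension method on an HTTP resource.
    # The host, path, and body are hypothetical examples.
    import http.client

    conn = http.client.HTTPConnection("control.example.com", 80)
    conn.request(
        "ReloadConfiguration.example.com",   # extension method as the function identifier
        "/units/7",                          # the resource (object) the call applies to
        body=b"",                            # parameters would go in the body
        headers={"Content-Type": "text/plain"},
    )
    response = conn.getresponse()
    print(response.status, response.reason)
    conn.close()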

Mostly-RESTful? Mixed REST and non-REST.

I think that author James Snell largely gets it wrong in his IBM developerWorks article on why both REST and SOAP have their places. I think his view of REST is a bit wonky, but lifting things up to a more abstract level, he has some interesting points. He seems to think that some operations can't easily be translated into state maintained as resources. He thinks that sometimes you need to trigger activities or actions that don't work in a RESTful way. Mostly, he wants to be able to employ traditional GoF-style OO patterns in his distributed architecture. I don't find myself agreeing with any of his examples, but in trying to retrofit REST onto an existing architecture I do find myself not wanting to take the big hit all at once. What I want is a migration path. Even when I get to the end of that path I think there will still be some corners left unRESTful. There are places where one application is already a converter to another protocol. That protocol isn't RESTful, so it seems unlikely to me that putting a RESTful facade over the top of it will ever do any good. That protocol is also an industry standard, so there's no chance of tweaking it to be more like what we might want in our RESTful world.

What I'm suggesting is that by supporting non-RESTful concepts alongside RESTful ones it becomes possible to shift things slowly over from one side to the other, and find the right balance where the fit isn't so good for REST.

Once you start heading down this path you see some interesting features of HTTP. After allowing you to select the function to call and the object to call it on, HTTP allows you to provide a body that works like the parameter list, plus extra metadata in the form of headers which you might use along the way. Now that we have cleared the method identifier out of the body, it can be decoupled from its content. We can effectively have polymorphism based on any of the metadata headers, such as Content-Type. We can handle multiple representations of the same kind of input. We can also introspect sophisticated content types such as XML to determine how to do further processing. In my case I found that I had to map onto a protocol that supported string positional parameters. For the moment the simplest fit for this is CSV content. Just a single CSV line supports passing of positional parameters to my function. In the future, as the underlying protocol changes shape to suit my needs, I expect to support named parameters as an XML node. Eventually I want to push the HTTP support back to the server side, so any resource can decide for itself how to handle content given to it.
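
As a sketch of the receiving side of that, a single CSV line in the request body maps onto a positional argument list with nothing more than the standard csv module; the handler and field values here are placeholders of my own, not part of any real protocol.

    # Sketch: decoding a one-line CSV request body into positional parameters.
    # reserve_for_execution() is a hypothetical handler; the body content is made up.
    import csv
    import io

    def positional_args(body: bytes):
        line = body.decode("utf-8")
        return next(csv.reader(io.StringIO(line)))

    def reserve_for_execution(unit, operator, priority):
        print(f"reserving {unit} for {operator} at priority {priority}")

    args = positional_args(b"unit-7,benjamin,3")
    reserve_for_execution(*args)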

Conclusion

I hope that HTTP will support software revisions that become more and more RESTful over time, while for the moment letting me concentrate on supporting necessary functionality without a massive flow-on impact.

Benjamin