Sound advice - blog

Tales from the homeworld

Tue, 2007-Mar-13

Machine-to-Machine Forms Submission in REST

URI-construction is usually seen as a bad thing in REST practice. However, just about everyone does it in some form or another. You start with some sort of base URL, and you fill out variable parts of it to construct a URL that you actually look up or submit data to. We refer to experience on the Web and say that it is alright to construct a URL sometimes. In particular, it is OK to construct a URL when you have a document that you just obtained from the server that tells you how to do it. This kind of URL construction isn't really a problem. In fact, it is just an advanced form of hyperlinking. So what are the limits of this form of hyperlinking when there is no human in the loop?

Form Population for Machines

The distinction between a human-submitted form and a machine-submitted form is an important one. A form intended for a human will include textboxes and other user-interface widgets alongside supporting text or diagrams that explain how to fill the form out. A machine cannot understand the kinds of loose instruction given to a human, so we have to consider how a machine knows what to put where in the URL.

I think that the simple answer is that the input to any machine-populated form is effectively a standard document type, or at least a standard infoset. The machine that constructs a URL does so from a particular set of information, and the form acts as a transform from its infoset into the output URL.

For example, a machine that is constructing a query for the Google search engine must know to supply a Google-compatible search string in the "q" field of the form. A client of Yahoo must currently know to supply a Yahoo-compatible search string in the "p" field. While humans are able to fill in forms that accommodate these differences, machines are more limited. If we are ever to have any hope of machine-submitted forms, we will need to look at reducing this kind of deviation in infoset structure.
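To make the deviation concrete, here is a minimal sketch of what a machine client has to carry around today. The field names "q" and "p" come from the paragraph above; the base URLs, the SERVICES table and the search_url helper are my own assumptions for illustration, not anything either service publishes.

```python
from urllib.parse import urlencode

# Per-service knowledge the client must hard-code today.
# Field names are from the text; the base URLs are illustrative assumptions.
SERVICES = {
    "google": ("http://www.google.com/search", "q"),
    "yahoo": ("http://search.yahoo.com/search", "p"),
}


def search_url(service, terms):
    """Build a search URL using the field name that one service expects."""
    base_url, field = SERVICES[service]
    return base_url + "?" + urlencode({field: terms})


if __name__ == "__main__":
    print(search_url("google", "rest forms"))
    print(search_url("yahoo", "rest forms"))
```

The same search terms go in, but the client needs a table entry for every service it talks to. That table is the deviation in infoset structure.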

Transformations from Infoset to URL

This characterisation of a machine-populated form as a transform should impact how we go about selecting and evaluating forms technologies that are destined purely for machines to populate. In particular, XSLT jumps right up the list in terms of possible technologies. It is already a well-established transform technology that can output text as well as other structured document types. If we view the source document as either XML or an XML infoset, XSLT provides an obvious way to achieve the URL construction goal.

Another approach would be to look carefully again at how Web forms work. In practice, they construct an infoset based on user input, then use a defined mechanism to transform the infoset into the query part of a URL. This could be a significantly simpler approach than embedding an XSL engine. If we again treat the source document as an XML infoset, we can follow the rules that XForms defines for transforming an XML document into a query. These rules are essentially that elements with text nodes are turned into URL parameters.
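A rough sketch of that rule, in Python, might look like the following. It is a simplification rather than a faithful XForms serialiser: it treats any childless element with non-empty text as a parameter, ignores namespaces and attributes, and assumes "&" as the separator. The infoset_to_query name and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def infoset_to_query(xml_text):
    """Turn leaf elements that carry text into URL query parameters."""
    root = ET.fromstring(xml_text)
    pairs = []
    # iter() visits elements in document order, starting with the root.
    for element in root.iter():
        text = (element.text or "").strip()
        # Simplifying assumption: only childless elements with text count.
        if len(element) == 0 and text:
            pairs.append((element.tag, text))
    return urlencode(pairs)


if __name__ == "__main__":
    document = "<search><q>rest forms</q><lang>en</lang></search>"
    print(infoset_to_query(document))
```

The wrapping element contributes nothing to the query; only the elements that hold text do.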

Coupling Effects

At first blush this standard transform approach looks like it couples client and server together, or requires different server implementations to obey the same rules for their URL-space... and that is not entirely false. The factors that limit these effects are the use of a standard document type as input to the transformation, and the ability of server-side implementations to redirect.

In the client-to-server coupling case, we often see service-specific URL construction occurring in clients today. Instead, the same construction should apply to different services that have similar construction requirements. I should be able to start up a search engine to rival Google and use the same infoset as input to my URL construction. The client code should accept a base URL as input alongside the other parameters that form the infoset, meaning that all clients need to do to use my rival service is change their base URL. Code changes should not be required.
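As a minimal sketch of that client, assuming the infoset is a flat set of name/value pairs: the build_url function and both example.com and example.org addresses below are made up for illustration.

```python
from urllib.parse import urlencode


def build_url(base_url, infoset):
    """Append a standard infoset to whichever base URL the client is given."""
    # Assumption: the infoset is a flat mapping of field names to values.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(infoset)


if __name__ == "__main__":
    infoset = {"q": "rest forms"}
    # Switching to a rival service is a configuration change, not a code change.
    print(build_url("http://search.example.com/find", infoset))
    print(build_url("http://rival.example.org/find", infoset))
```

Nothing in build_url knows which service it is talking to. The base URL is data, and the infoset is the same in both calls.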

The server-to-server coupling case is an interesting problem. We usually require content types and methods to be standard, but give servers the freedom to construct their URI-spaces in whatever way they see fit. The XSLT form submission method would give them that freedom up front; however, redirection is also a way of achieving it. A simple 301 Moved Permanently would allow the server freedom in how it constructs its URLs. This is greater freedom, in fact, than XSLT running in a client implementation could offer, because the server has more information at its fingertips with which to decide on the redirection URL. To achieve this, all we really need to sacrifice on the server side is a single base URL with a query space that matches a standard infoset that machine clients can be expected to have at hand ready for submission.
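The server side of that bargain can be sketched as a function that maps the standard query onto whatever URL-space the server actually prefers. The "/results/" path layout, the "q" field and the redirect_for name are assumptions chosen for illustration. A real server would return the status and Location header through its own framework.

```python
from urllib.parse import parse_qs, quote, urlsplit


def redirect_for(request_url):
    """Map a standard query onto this server's own URL-space via a 301."""
    query = parse_qs(urlsplit(request_url).query)
    # Assumption: the standard infoset carries the search terms in "q".
    terms = query.get("q", [""])[0]
    # Hypothetical private layout; the server may change it at any time.
    location = "/results/" + quote(terms, safe="")
    return 301, location


if __name__ == "__main__":
    status, location = redirect_for("http://search.example.com/find?q=rest+forms")
    print(status, location)
```

Only the base URL and its query space are fixed. Everything behind the redirect remains the server's business.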


My considered view is that using the query part of a URL as a way to pass a standard infoset to a server is a valid way of constructing URLs. I think it is the simplest and most robust way to transform an infoset into a URL, and possibly the most powerful. Current attempts to allow the server greater freedom as to how it constructs its URLs are interesting, but at this point I do not intend to implement anything but the query-part-as-content approach in my development. I think the focus should shift away from this technical sphere of URL construction to a process of defining the content types that are fed into these transforms.