Sound advice - blog

Tales from the homeworld

Sat, 2005-Apr-09

What do you store in your REST URIs?

I have been tinkering away on my HTTP-related work project, and have a second draft together of an interface to a process starting and monitoring application that we built and use internally. Each process has a name, and that is simple to match to a URI which contains read-only information about its state.

You can picture it in a URI like this: http://my.server/processes/myprocess, returning the XML equivalent of "the process is running, it is currently the main in a redundant pair of processes". It actually gets a little more complicated than that, with a single HTTP access point able to return the status of various processes across various hosts, to report on the status of the hosts themselves, and also to arrange the processes in various other ways that are useful to us and make statements about those collections.

My next experimental step was to allow enabling and disabling of processes by adding a */enabled URI for each process. When enabled it would return text/plain "true". When disabled it would return text/plain "false". A PUT operation would change this state and cause the process to be started or stopped. I was hoping I'd be able to access this via an HTML form, but urg... no luck there, since HTML forms can only submit GET and POST. I had to add a POST method to the process itself that accepts a URL-encoded "enabled=true" body. Not nice, but together they're workable for now.
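To make the two entry points concrete, here is a minimal in-memory sketch of that behaviour. It is not the real service: the class name ProcessTable, the handle() signature, and the status codes chosen (204, 400, 404, 405) are all assumptions made for illustration, and only the enabled-related operations are modelled.

```python
from urllib.parse import parse_qs


class ProcessTable:
    """Hypothetical stand-in for the process monitor; every name here is illustrative."""

    def __init__(self, names):
        # In this sketch every process starts out disabled.
        self.enabled = {name: False for name in names}

    def handle(self, method, path, body=""):
        """Return a (status, body) pair for a request under /processes/."""
        parts = path.strip("/").split("/")
        if len(parts) < 2 or parts[0] != "processes" or parts[1] not in self.enabled:
            return 404, ""
        name = parts[1]
        if len(parts) == 3 and parts[2] == "enabled":
            if method == "GET":
                # The text/plain representation is the literal "true" or "false".
                return 200, "true" if self.enabled[name] else "false"
            if method == "PUT":
                return self._set(name, body.strip())
            return 405, ""
        if len(parts) == 2:
            if method == "POST":
                # Form fallback: body is application/x-www-form-urlencoded.
                values = parse_qs(body).get("enabled", [])
                if len(values) != 1:
                    return 400, ""
                return self._set(name, values[0])
            return 405, ""
        return 404, ""

    def _set(self, name, text):
        if text not in ("true", "false"):
            return 400, ""
        # A real implementation would start or stop the process here.
        self.enabled[name] = text == "true"
        return 204, ""
```

A PUT of "true" to /processes/myprocess/enabled and a POST of "enabled=true" to /processes/myprocess end up in the same _set() call, which is the whole point of the workaround.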

Now we're at the point where I ask the question: How do I find and represent the list of processes? I ask, "How do I navigate to the Host URI associated with this process?". I ask, "How do I know what to append to find the enabled URI?".

I have been returning pretty basic stuff. If my data is effectively a string (i.e., a datum encoded using XSD data type rules), I've been returning a text/plain body containing that data. If the data is more complicated and needs an XML representation, I've been returning application/xml with the XML document as the body. In my HMI application I typically map those strings and XML elements onto the properties of JavaBeans, or onto my own constructions that map their data eventually onto beans. The expected data format is therefore pretty well known to my application and doesn't need much in the way of explicit schema declaration. The URIs are also explicitly encoded into the XML page definitions that go into the HMI. If I start to look outside the box, though, particularly to debugging- or browsing-style applications that might exist in the future, I want to be able to find my data.

As I was working through the problem, I started to understand for the first time where the XLink designers were coming from. My fingers were aching to just type something like

<Processes type="linkset"><a href="foo"/><a href="bar"/></Processes>

and be done with it. XLink is dead, though, and apparently with good reason... so it starts to look like RDF is the "right way to do it".

Rethinking the REST web service model in terms of RDF is an interesting approach, and one I feel could work fairly nicely. I'm still thinking in terms of properties. If I had an object of class foo with property bar, then I could write the following fairly easily:

<foo rdf:about=".">
<foo.bar>myvalue</foo.bar>
</foo>

That's almost identical to the verbose form of the XML structures I'm using right now (I would currently put myvalue into foo/@bar to reduce the verbosity). In this way, the content of each URI would be the RDF about that URI. If this were backed by a triplestore, you might simply list all relationships that the URI has directly in this response body.
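The fragment above leaves out the rdf:RDF wrapper and the namespace declarations that a parser would need. As a rough sketch of generating a complete document of that shape, assuming a vocabulary namespace of http://my.company/myNamespace/ (borrowed from the predicate URI used later in this post) and hypothetical names describe(), APP_NS and the "my" prefix:

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
APP_NS = "http://my.company/myNamespace/"  # assumed vocabulary namespace


def describe(about, class_name, properties):
    """Serialise one resource and its literal properties as RDF/XML text."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("my", APP_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # Typed node element: the element name doubles as the class of the resource.
    node = ET.SubElement(
        root, f"{{{APP_NS}}}{class_name}", {f"{{{RDF_NS}}}about": about}
    )
    for name, value in properties.items():
        # Property elements follow the class.property naming used in the post.
        child = ET.SubElement(node, f"{{{APP_NS}}}{class_name}.{name}")
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


print(describe("http://my.server/processes/myprocess", "foo", {"bar": "myvalue"}))
```

The sketch passes the subject as an absolute URI rather than a relative reference, which sidesteps the question of what the response's base URI is.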

It seems simple to produce an RDF-compatible hyperlinking solution for the GET side of things, so what about the PUT?

At first glance this looks simple, but in fact the PUT now needs more data than it did previously. What I really want to do is to PUT an enabled assertion between my process and the literal "false". What do I put, exactly? Perhaps something like this:

PUT /processes/myprocess/http://my.company/myNamespace/myClass.myProperty HTTP/1.1
Host: my.server

<Literal>true</Literal>

You can see the difficulty. I need to encode the URI of the subject (http://my.server/processes/myprocess) and the URI of the predicate (http://my.company/myNamespace/myClass.myProperty). Finally I need to encode the object, clearly identified as a URI, a literal, or a new RDF instance with its own set of properties.
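One way to keep the predicate URI from bleeding into the path structure is to percent-encode it into a single path segment. This is only a sketch of that idea, not something the service does today; property_path() and split_property_path() are hypothetical helper names, and some servers and proxies treat encoded slashes specially, so it would need testing against the real stack.

```python
from urllib.parse import quote, unquote


def property_path(process, predicate_uri):
    """Build a request path with the predicate URI packed into one segment."""
    # safe="" makes quote() escape "/" and ":" as well as the usual characters.
    return "/processes/{}/{}".format(
        quote(process, safe=""), quote(predicate_uri, safe="")
    )


def split_property_path(path):
    """Recover (process, predicate_uri) from a path built by property_path()."""
    parts = path.split("/")
    if len(parts) != 4 or parts[0] != "" or parts[1] != "processes":
        raise ValueError("not a property path: " + path)
    return unquote(parts[2]), unquote(parts[3])


path = property_path(
    "myprocess", "http://my.company/myNamespace/myClass.myProperty"
)
print(path)
```

Splitting on "/" before decoding is what makes the round trip safe: the only literal slashes left in the encoded path are the ones that separate segments.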

Another thing you need to do is work out the semantics of the PUT operation as well as the POST operation. In the truest HTTP sense it is probably sensible for PUT to attempt to overwrite any existing assertions on the object with the same predicate, while POST would seek to accumulate a set of assertions by adding to rather than overwriting earlier statements.
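The difference between the two verbs can be sketched against a toy triple store. This is an illustration of the semantics proposed above rather than an existing API; the TripleStore class and its method names are made up for the example.

```python
class TripleStore:
    """Toy store of (subject, predicate, object) statements; illustrative only."""

    def __init__(self):
        self.triples = set()

    def put(self, subject, predicate, obj):
        # PUT overwrites: drop every statement sharing this subject and predicate,
        # then assert the new one.
        self.triples = {
            t for t in self.triples if (t[0], t[1]) != (subject, predicate)
        }
        self.triples.add((subject, predicate, obj))

    def post(self, subject, predicate, obj):
        # POST accumulates: earlier statements are left in place.
        self.triples.add((subject, predicate, obj))

    def objects(self, subject, predicate):
        return sorted(
            o for s, p, o in self.triples if (s, p) == (subject, predicate)
        )


store = TripleStore()
subject = "http://my.server/processes/myprocess"
predicate = "http://my.company/myNamespace/myClass.myProperty"
store.post(subject, predicate, "a")
store.post(subject, predicate, "b")   # both "a" and "b" are now asserted
store.put(subject, predicate, "true")  # only "true" remains for this predicate
```

Under these rules PUT stays idempotent, which is what HTTP expects of it, while repeated POSTs of different values keep growing the set.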

There is another question unanswered in all of this. If I have a piece of RDF relating to a specific URI, what do I have to do to get more information about it? Sometimes you'll be able to dereference the URI and find more RDF. Sometimes you'll get a web page or some other resource, and if you're lucky you'll find RDF at a ".rdf"-extension variant of the filename. Sometimes you'll find nothing at the link. Shouldn't these options be formalised somewhere? I don't think it's possible to write an RDF "crawler" otherwise... or the source RDF document must point to both the target RDF and the related object. In other questions arising from this line of thought, "Is there a standard way to update RDF in a REST way? If so, does it work from a web browser with simple web forms?"
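In the absence of a formal rule, a crawler would have to guess. The sketch below tries the two options mentioned above in order: dereference the URI itself while asking for application/rdf+xml, then try the ".rdf" variant. The function find_rdf() and the fetch callback are hypothetical; fetch(url, accept) is assumed to return a (content_type, body) pair, or None when nothing is found, so that any HTTP client can be plugged in.

```python
RDF_XML = "application/rdf+xml"


def find_rdf(uri, fetch):
    """Return (url, body) for the first location that serves RDF/XML, else None."""
    attempts = [uri]
    # Only guess at a ".rdf" sibling when the URI looks like a plain file name.
    if not uri.endswith((".rdf", "/")):
        attempts.append(uri + ".rdf")
    for url in attempts:
        result = fetch(url, RDF_XML)
        if result is None:
            continue
        content_type, body = result
        # Ignore parameters such as "; charset=utf-8" when comparing types.
        if content_type.split(";")[0].strip().lower() == RDF_XML:
            return url, body
    return None


# Usage with a canned fetcher standing in for real HTTP requests.
pages = {
    "http://my.server/processes/myprocess": ("text/html", "<html/>"),
    "http://my.server/processes/myprocess.rdf": (RDF_XML, "<rdf:RDF/>"),
}
found = find_rdf(
    "http://my.server/processes/myprocess", lambda url, accept: pages.get(url)
)
```

Even this small amount of guesswork shows the gap: nothing tells the crawler that the ".rdf" document describes the same thing as the original URI.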

The web browser is becoming my benchmark of how complicated a thing is allowed to be. If you can't generate that PUT from a web form, maybe you're overthinking the problem. If you can't browse what you've created happily in Mozilla, perhaps you need to simplify it.

Benjamin