Sound advice - blog

Tales from the homeworld


Tue, 2006-Apr-04

Low and High REST

There has been a bit of chatter of late about low and high REST variants. Lesscode blames Nelson Minar's Etech 2005 presentation for the distinction between the two REST styles. It pretty much amounts to the read-only web versus the read-write web, or perhaps the web we know works versus the web as it was meant to work (and may yet work in the future).

The idea is that using GET consistently and correctly can be called "low" REST. It fits the REST model and works pretty well with the way information is produced and consumed on the web of today. Using the other verbs correctly, especially the other formally-defined HTTP verbs, is "high" REST. The meme has been spreading like wildfire, and lesscode has carried some interesting discussion on the concept.

Robert Sayre notes that the GET/POST/PUT/DELETE verbs aren't used in any real-world applications. He says that low REST might be standardising what is known to work, but high REST is still an untested model. Ian Bicking calls the emphasis on using verbs other than POST to modify server-side state a cargo cult.

It is useful to look back at Fielding's Dissertation, in which he doesn't talk about any HTTP method except for GET. He assumes the existence of other "standard" methods, but does not go into detail about them.

I think Ian is hitting on an uncomfortable truth, or at least a half-truth. Intermediaries don't much care whether you use POST, DELETE, or PUT to mutate server state. They treat the requests in similar ways. If you were to use WebDAV operations you would probably find the proxies again treating the operations the same way as if you had used POST. Architecturally speaking, it does not matter which method you use to perform the mutation. It only matters that the client, intermediaries, and the server all share the understanding that mutation is occurring.

Even that constraint needs some defence. Resource state can overlap, so mutating a single resource in a single operation can in fact alter several resources. Neither client nor intermediary is aware of this knock-on effect. The only reason clients really need to know whether mutation is happening is so that machines can determine whether they can safely make a request without their user's permission. Can a link be followed for precaching purposes? Can a request be retried without changing its meaning?
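To make those two questions concrete, here is a minimal sketch in Python of the decision a generic client or cache might make from nothing but the method. The helper names are mine, but the method properties are the standard HTTP ones.

    # Which methods a generic HTTP client may act on without asking its user.
    SAFE_METHODS = {"GET", "HEAD"}                           # no mutation expected
    IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE"}    # repeating changes nothing

    def can_prefetch(method):
        # A link may be followed for precaching only if the request is safe.
        return method.upper() in SAFE_METHODS

    def can_retry(method):
        # A timed-out request may be resent only if repeating it means the same thing.
        return method.upper() in IDEMPOTENT_METHODS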

Personally I am a fan of mapping the operations DELETE to cut, GET to copy, PUT to paste over, and POST to paste after. I know that others like to map the operations to the database CRUD model: POST to create, GET to retrieve, PUT to update, and DELETE to delete. It amounts to the same thing, except that the cut-and-paste view steers us more firmly away from record-based updates and into the world of freeform stuff-to-stuff and this-to-that data flows. Viewing the web as a document transfer system makes other architectures simpler, and in some cases makes them possible at all.
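As a rough illustration of the cut-and-paste view (a sketch only, using Python's standard urllib; the URIs and document formats are whatever the application chooses), a clipboard-style client needs nothing beyond the four standard methods:

    import urllib.request

    def http(method, uri, body=None, content_type="application/octet-stream"):
        # Minimal helper: issue one HTTP request and return the response body.
        req = urllib.request.Request(uri, data=body, method=method)
        if body is not None:
            req.add_header("Content-Type", content_type)
        with urllib.request.urlopen(req) as resp:
            return resp.read()

    def copy(uri):                # GET: take a copy of the document at this URI
        return http("GET", uri)

    def cut(uri):                 # GET then DELETE: take the document and remove it
        doc = http("GET", uri)
        http("DELETE", uri)
        return doc

    def paste_over(uri, doc):     # PUT: replace whatever is at this URI with this document
        http("PUT", uri, body=doc)

    def paste_after(uri, doc):    # POST: hand this document to the URI to file beneath it
        http("POST", uri, body=doc)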

I have mentioned before that I don't think the methods should end there. There are specialty domains, such as subscription over HTTP, that seem to demand a wider set of verbs. Mapping to an object-oriented world can also indicate that more verbs should be used, at least until the underlying objects can be retooled for easier access through HTTP. Robert Sayre points at this too, but I think he is a little off the mark in his thinking. Limiting the methods in operation on the internet is a bad thing; however, limiting the methods a particular service demands its clients use is a good thing. Every corner will have its quirks. Every corner will start from a position of many unnecessary SOA-style methods before settling into the way the web really handles things. It is important for the internet to tolerate the variety while encouraging a gradual approach to uniformity.

We should have some kind of awareness of what methods we are using because it helps us exercise the principle of least power. It helps us decouple client from server by reducing client requests to things like "store this document at this location" or "update that document you have with the one I have". By moving towards less powerful and less specific methods, as well as less powerful and less specific document types, we reduce the specific expectations a client has of its server. Sometimes it is necessary to be specific, and that should be supported. However, it is a useful exercise to see how general a request could be and still fulfil the same role.
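One way to run that exercise (the operation, URIs, and documents here are invented for the example): start from the most specific request a client might make and ask whether a plain document transfer could do the same job.

    # Most specific: the client must know this exact operation exists on this server.
    specific = ("POST", "/contacts/benjamin/change-email", b"benjamin@example.com")

    # Least power: "update that document you have with the one I have".
    # Any server that can store a contact document can satisfy this request.
    general = ("PUT", "/contacts/benjamin", b"<contact>...the full updated card...</contact>")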

My issue with using POST for everything is that what we often really mean is that we are tunnelling everything through POST. I see it as important that the operations we perform are visible at the HTTP protocol level so that they can be handled in a uniform way by firewalls and toolkits and intermediaries of all kinds. Information about what the request is has to be encoded into either the method or the URI itself, or we are just forcing our intermediaries to interrogate another level of abstraction as they operate.
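A small sketch of what an intermediary loses when everything is tunnelled (the URIs and the imagined envelope format are my own invention): a policy that reads only the request line can classify ordinary requests, but tunnelled POSTs all look alike.

    # Toy intermediary policy that looks no deeper than the request line.
    def classify(method, uri):
        if method in ("GET", "HEAD"):
            return "safe: may be cached or prefetched"
        if method in ("PUT", "DELETE", "POST"):
            return "mutation: log it, never cache it"
        return "unknown method"

    # With everything tunnelled through POST, these two look identical, even though
    # one is really a read and the other really a destructive operation:
    classify("POST", "/api")   # body: <call method="getOrder">...</call>
    classify("POST", "/api")   # body: <call method="deleteOrder">...</call>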

You could take this discussion and use it to support making POST a general "mutate" method. If only one mutation operation applies to a given URI then it makes sense to use a very general mutation method. In this case we are encoding information about what the operation is into the URI itself rather than selecting the mutation by the method of our request. Instead of tunnelling a variety of possible operations through POST, it is the URI that carries the information. Since the URI is managed by the server side of the request, that is really the best possible outcome. It is only when multiple operations apply to a single URI that we need to consider methods other than POST carefully and ensure that appropriate methods can be used even if they haven't been standardised. Future-proofing of the URI space may dictate the use of the most appropriate method available. Unfortunately, existing toolkits and standards push POST as the only method available.
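For instance (the URIs and descriptions are invented for the example), a routing table might look like this, with the URI carrying the meaning where only one mutation applies and the method carrying it where several do:

    # Hypothetical server-side routing table, keyed by method and URI template.
    routes = {
        # One mutation per URI: POST is general enough because the URI names the operation.
        ("POST",   "/orders/{id}/cancellation"): "cancel the order",

        # Several operations on one URI: the method has to select among them.
        ("GET",    "/orders/{id}"): "read the order document",
        ("PUT",    "/orders/{id}"): "replace the order with the enclosed document",
        ("DELETE", "/orders/{id}"): "remove the order",
    }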

In my view a client or intermediary that doesn't understand a method it is given to work with should always treat it as if it were POST. That is a safe assumption as to how much damage the request could do and what to expect of its results. That assumption would allow experimentation with new methods over HTTP without toolkit confusion. I am not a supporter of POST tunnelling, and believe that it is generally the lack of support for unknown methods in specifications and toolkits that makes tunnelling necessary, and thus successful, on the internet of today.
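A sketch of that fallback rule, with the table of known methods reflecting standard HTTP semantics and the unknown examples drawn from the specialty domains above:

    # Known methods and their standard properties.
    KNOWN = {
        "GET":    dict(safe=True,  idempotent=True),
        "HEAD":   dict(safe=True,  idempotent=True),
        "PUT":    dict(safe=False, idempotent=True),
        "DELETE": dict(safe=False, idempotent=True),
        "POST":   dict(safe=False, idempotent=False),
    }

    def properties(method):
        # An unrecognised method (a WebDAV COPY, an experimental SUBSCRIBE, anything new)
        # gets POST's most conservative assumptions rather than being rejected outright:
        # not safe, not idempotent, never cached, never retried automatically.
        return KNOWN.get(method.upper(), KNOWN["POST"])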

Benjamin