I have received interesting feedback on my last article from a number of sources. It is sometimes as interesting to see how readers respond to what I said, and to what they heard, as to what I meant. A Mr "FuManChu" took me to task over my treatment of POST, and I responded with a comment on his blog. Mike Dierken is concerned about my advice to reduce the number of content types in play in order to be more RESTful. He writes:
REST actually drives toward increasing content-types. Not necessarily maximizing, but definitely opening up the space of possibilities. So I would say that REST seeks to push the balance away from verbs and as close to the edge of nouns and content-types.
I would respond firstly with this quote from Roy Fielding:
REST enables intermediate processing by constraining messages to be self-descriptive: interaction is stateless between requests, standard methods and media types are used to indicate semantics and exchange information, and responses explicitly indicate cacheability.
I will go on to fill out what I think is a simple difference in terms shortly.
Mike does make the excellent point that
REST does not have a hierarchical namespace. Neither does HTTP. They only have identifiers that are opaque to the client.
Even though the path space in URIs is explicitly hierarchical, with relative links forming an integral part of the specification, REST is based on hyperlinks. We should not be constructing one URI from another. We should be following hyperlinks instead. This was a weakness in my article, first pointed out to me by Ken MacLeod via email.
Mark Baker says, and Henry Story echoes:
Nope, properties are properties, resources are the Beans themselves. Imagine a "Person" Bean which might expose the properties; name, birth date, birth city. Now imagine that Bean as a resource, with its own http URI; invoke GET on it and you receive, say, an XML document
You are right that the person may be a resource in their own right. I would say simply that if you are doing GET and PUT operations on the person object, it is a resource. The reason I say that a bean's properties are resources (possibly as well) is that they support GET (get) and PUT (set) operations in their own right. Depending on your application it may be appropriate to expose just the bean or just its properties as resources. It may be appropriate to expose both or neither. My contention is simply that applying uniform GET and PUT operations to any named entity makes that entity a resource, or at worst a pseudo-resource. Cases like this highlight the natural tension that already exists in software between the O-O and REST world views.
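To make the point concrete, here is a minimal Python sketch in which both the bean and its individual properties answer to GET and PUT. The `beans` registry, the path layout and the sample values are all assumptions made for illustration, not part of any existing system.

```python
class Person:
    """A plain bean with three properties (placeholder data only)."""

    def __init__(self, name, birth_date, birth_city):
        self.name = name
        self.birth_date = birth_date
        self.birth_city = birth_city


# Hypothetical registry of named beans.
beans = {"person": Person("Alice", "1970-01-01", "Springfield")}


def GET(path):
    """GET the whole bean, or one of its properties, as a string."""
    parts = path.strip("/").split("/")
    bean = beans[parts[0]]
    if len(parts) == 1:
        # The bean itself is the resource: return a simple representation.
        return "; ".join(f"{k}={v}" for k, v in sorted(vars(bean).items()))
    # A single property is the resource.
    return str(getattr(bean, parts[1]))


def PUT(path, data):
    """PUT a new value onto one property of a bean."""
    bean_name, prop = path.strip("/").split("/")
    bean = beans[bean_name]
    if not hasattr(bean, prop):
        raise KeyError(path)
    setattr(bean, prop, data)
```

Here `GET("/person")` and `GET("/person/name")` are both legal, which is exactly the "both or neither" choice described above.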
Several other people have also commented using their blogs. Thanks to Technorati, at least some of the links have been easy for me to find. Please keep the feedback coming, and I recommend Technorati membership to anyone whose blog is not currently being indexed by those kind folk. I've picked up several responses only through my site statistics, and many of these I'll have missed as I can only see the last twenty under my current hosting arrangements.
Content Types
I used the term "content type" deliberately in my previous article to evoke the content types identified in the HTTP header. The most common one is surely text/html. The ubiquitous nature of HTML has been a major component of the success of the web. Instead of having each web site offer content in its own proprietary format, we could rely on web sites that wanted to serve hypermedia using something approximating HTML. While there are many variants as content providers have asked more of the format, the basics have remained the same and we've been able to settle on something that at worst displeases everyone equally.
When HTML is not well-formed, or is not completely understandable, it may not render correctly. The human viewer can probably still make out everything they need to, and can drop back to the source if they get particularly desperate. The application/xml content type doesn't have such luxuries. As Sean McGrath noted recently in his blog:
XML is not - repeat NOT - a 'file format'
While it is not a file format, it is a content type. It seems that just when you can't stand to send or receive anything at all out of the ordinary, we get lax about what the actual content is going to be.
I think that in the end it doesn't matter, for the same reason that nouns are more important than content types. I think that type is the least of your worries when a client makes a request of a server. The server could reject the request for a huge variety of reasons. Perhaps a number that forms part of the content is greater than ten. Perhaps only Chinese text is accepted. The semantics of what a particular URI accepts with its PUT verb are entirely up to the URI owner, and can only be construed from the name of that specific resource. Making sure you send XML instead of ASN.1, or that your XML matches a specific schema, is the least of your worries in executing a successful transaction with the resource.
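A small Python sketch of the "greater than ten" case: the data is of exactly the right type, and the resource still refuses it. The `SmallNumber` and `Rejected` names are made up for this example.

```python
class Rejected(Exception):
    """Raised when a resource refuses a PUT (hypothetical name)."""


class SmallNumber:
    """A resource that accepts only integers no greater than ten."""

    def __init__(self):
        self.value = "0"

    def GET(self):
        return self.value

    def PUT(self, data):
        try:
            number = int(data)
        except ValueError:
            # Wrong "type": the content is not a number at all.
            raise Rejected("not a number: %r" % data)
        if number > 10:
            # Right type, still unacceptable to this particular resource.
            raise Rejected("number is greater than ten: %d" % number)
        self.value = data
```

Only the resource itself knows the second rule, so no amount of type checking on the client side would have caught it.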
I spoke last time about pushing the balance of the (noun, verb, content type) triangle away from both verb and content type. Again, this is from the perspective of Object-Orientation's typing system, where it is necessary to define conversion operations, the creation of new types requires the matching creation of new functions, and if you don't have an exact match things just won't work and may not even compile. Given that knowledge of content type is not sufficient to ensure acceptance by the resource you're talking to, and that it is always the named resource that performs the test for acceptance, I think that content types are probably irrelevant when it comes to mapping REST into an Object-Oriented environment. "Format" is more important. In my view, when something is irrelevant to the correctness of your program the bar should be set as low as possible.
Mapping REST into an Object-Oriented Environment
As I've stated before, I believe if REST actually works it will work in a regular (non-web) programming environment. Further, I think that if it fails outside the web it has a good chance of failing on the web itself for application development.
So, let's set a few rules for ourselves. What restrictions do we need to put on ourselves to avoid fooling ourselves that we're being RESTful?
- All object APIs should be expressed as resources
- All resource access should be via a consistent noun concept
- No exotic verbs (functions) should be allowed on resources
- Navigating through the resource space except via hyperlinks is illegal
Exceptions can be made for objects involved in the mechanics of the RESTful object system, and when data is not in motion (it is in some object's member variable) it can be in whatever form the object desires.
Nouns
So, what should our nouns look like? Our first thought should be that URIs are like pointers or references. We could make those our nouns. Each pointer could refer to an object of type "Resource" with the appropriate verbs on its surface. This model suffers, somewhat, when we think about exactly what we're going to use and return in the PUT, GET, and POST functions.
The second iteration for me is the KISS approach. Let everything be a string, from noun to content type. After all, a noun could and should be used as content from time to time. There is no reason to make artificial distinctions between the two concepts. In my prototype I've been using a scheme called "object" to refer to object-provided resources:
- object:/gtk/gdk/GRAVITY_NORTH_WEST
- object:/myXMLResource/xpath?//b
The resolver of object URIs starts with a set of objects with registered names at the top level. To assist in the mapping onto a pure O-O model it navigates down the object's hierarchy of properties until it meets one that is prepared to perform the standard verb operations for the remainder of the unconsumed part of the URI path, or until it reaches the end of its path. If no handler is found, it handles the verbs on the end node itself. This approach allows any object along a path to take over the handling of the rest of the URI text. In the example above, an object of type XMLResource handles its own path. It looks for an xpath entry immediately below it in the URI and parses any query part as an xpath expression. This kind of thing is profoundly useful. It can be a private hell to actually try and create one resource object for each actual resource.
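The resolution steps can be sketched in a few lines of Python. This is an illustration under assumptions rather than the prototype itself: a handler is taken to be any object with a `GET` method, the `roots` table and its sample values are placeholders, and the standard library's ElementTree stands in for a real XPath engine (it supports only a subset of XPath, evaluated relative to the root element).

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace


def resolve(roots, uri):
    """Walk an object: URI; return (node, remaining_segments, query)."""
    scheme, _, rest = uri.partition(":")
    if scheme != "object":
        raise ValueError("not an object URI: " + uri)
    path, _, query = rest.partition("?")
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] not in roots:
        raise LookupError("no registered object for " + uri)
    node, remaining = roots[segments[0]], segments[1:]
    # Descend through plain properties until a node offers to handle verbs
    # itself, or until the path is used up.
    while remaining and not hasattr(node, "GET"):
        node = getattr(node, remaining[0])
        remaining = remaining[1:]
    return node, remaining, query


def GET(roots, uri):
    node, remaining, query = resolve(roots, uri)
    if hasattr(node, "GET"):
        return node.GET(remaining, query)
    return str(node)  # no handler found: serve the end node itself


class XMLResource:
    """Handler that takes over the rest of the path below itself."""

    def __init__(self, text):
        self.root = ET.fromstring(text)

    def GET(self, remaining, query):
        if remaining != ["xpath"]:
            raise LookupError("/".join(remaining))
        # Assumption: prefix "." so the expression is relative to the root.
        found = self.root.findall("." + query)
        return "".join(ET.tostring(e, encoding="unicode") for e in found)


# Placeholder objects registered at the top level.
roots = {
    "gtk": SimpleNamespace(gdk=SimpleNamespace(GRAVITY_NORTH_WEST=1)),
    "myXMLResource": XMLResource("<a><b>x</b><c><b>y</b></c></a>"),
}
```

With these placeholders, `GET(roots, "object:/gtk/gdk/GRAVITY_NORTH_WEST")` falls through to the end node, while `GET(roots, "object:/myXMLResource/xpath?//b")` is handed to the XMLResource as soon as the resolver reaches it.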
Verbs
I define the following global operations:
- GET is GET(uri), returning a string.
- PUT is PUT(uri, data), returning nothing.
- DELETE is DELETE(uri), returning nothing.
- POST is POST(uri, data), returning the created URI.
These map onto the following operations on objects:
- GET is GET(remainingPathSegments, query), returning a string.
- PUT is PUT(remainingPathSegments, query, data), returning nothing.
- DELETE is DELETE(remainingPathSegments, query), returning nothing.
- POST is POST(remainingPathSegments, query, data), returning the created URI.
Errors are currently reported via exceptions. All operations are currently synchronous, including access to remote services such as HTTP.
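As a rough Python illustration of how the global verbs might delegate to the object-level ones, with errors surfacing as exceptions. The `Notes` handler, the `ROOTS` table and `ResourceError` are invented for this sketch. It also assumes that the object-level POST returns only the new path segment and that the global POST turns it into a full URI.

```python
class ResourceError(Exception):
    """Raised when a verb cannot be applied to a URI (hypothetical name)."""


class Notes:
    """Toy handler object: a numbered collection of strings."""

    def __init__(self):
        self.items = {}
        self.next_id = 1

    def _key(self, segments, must_exist=True):
        if len(segments) != 1:
            raise ResourceError("expected one path segment, got %r" % (segments,))
        if must_exist and segments[0] not in self.items:
            raise ResourceError("not found: " + segments[0])
        return segments[0]

    def GET(self, segments, query):
        return self.items[self._key(segments)]

    def PUT(self, segments, query, data):
        self.items[self._key(segments, must_exist=False)] = data

    def DELETE(self, segments, query):
        del self.items[self._key(segments)]

    def POST(self, segments, query, data):
        if segments:
            raise ResourceError("POST only to the collection itself")
        key = str(self.next_id)
        self.next_id += 1
        self.items[key] = data
        return key


ROOTS = {"notes": Notes()}  # objects registered at the top level


def _split(uri):
    """Split an object: URI into (root name, remaining segments, query)."""
    scheme, _, rest = uri.partition(":")
    if scheme != "object":
        raise ResourceError("unsupported scheme in " + uri)
    path, _, query = rest.partition("?")
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] not in ROOTS:
        raise ResourceError("no registered object for " + uri)
    return segments[0], segments[1:], query


def GET(uri):
    root, segments, query = _split(uri)
    return ROOTS[root].GET(segments, query)


def PUT(uri, data):
    root, segments, query = _split(uri)
    ROOTS[root].PUT(segments, query, data)


def DELETE(uri):
    root, segments, query = _split(uri)
    ROOTS[root].DELETE(segments, query)


def POST(uri, data):
    root, segments, query = _split(uri)
    created = ROOTS[root].POST(segments, query, data)
    return "object:/" + "/".join([root] + segments + [created])
```

A caller sees only the four global functions and strings: `POST("object:/notes", "hello")` hands back a URI that can then be passed to GET, PUT or DELETE.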
Content Type
There is probably a performance hit associated with the "all string, all the time" approach to content types. Especially in cases where you have a complex data structure behind one resource and the same kind of structure behind another, it may be a useful shortcut to avoid the conversion to string. I think this should be doable with a minimum of code. Caching of transformations may also be of use if performance becomes a problem. I think that overall the advantages of a REST approach should make these problems irrelevant, but you may not want to code any tight loops with GET and PUT operations inside them any time soon :)
Variables and parameters
Variables in a RESTful application become aliases for content or aliases for URIs. They aren't evil, but they don't work like O-O variables and parameters do today. You can't operate on them except by passing them to other functions, or using them in the set of standard verb operations. Instead of passing an object to a function, you'll end up passing in specific sub-resources of the object. Perhaps you'll even pass in structures that form collections of resources, to help things swing back towards the O-O way of doing things which has served us so well over the years. The main thing will be a switch away from the view of a type being a single monolith, and instead seeing the individual resources or aspects of that type as separate entities from the function's point of view.
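A brief Python sketch of what such a function might look like. The `RESOURCES` dictionary is a stand-in for a real resolver, and every URI and value in it is a placeholder.

```python
# Hypothetical in-memory resource space standing in for a real resolver.
RESOURCES = {
    "object:/person/name": "Alice",
    "object:/person/birth_city": "Springfield",
    "object:/label/text": "",
}


def GET(uri):
    return RESOURCES[uri]


def PUT(uri, data):
    RESOURCES[uri] = data


def write_label(name_uri, city_uri, label_uri):
    """Takes three URIs rather than a Person object and a Label object.

    The parameters are only aliases for URIs: the function cannot call
    methods on them, it can only hand them to the standard verbs.
    """
    PUT(label_uri, GET(name_uri) + " (" + GET(city_uri) + ")")
```

The function never learns that a "person" type exists. It sees two readable resources and one writable one, and could be pointed at any other three that behave the same way.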
Event Handling
I'm currently working with an event handling model based on a GET to one resource followed by a PUT, POST, or DELETE to another. Given the REST approach this technique can actually be quite powerful. It is possible to cause a button press to copy a sample piece of XML into a specific (xpath-located) part of an XML resource. It can copy the entire resource to or from the web or a file. It can update its attributes and text nodes with input from the user. It effectively allows you basic XForms capability. I think it is even simple enough to build IDE and GUI builder tools to create these handlers.
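In Python, a handler of this kind could be as small as the following sketch. Only the GET-then-PUT case is shown; POST and DELETE would presumably take the same shape. The `Button` class, the `RESOURCES` table and the URIs are assumptions made for the example.

```python
# Hypothetical in-memory resource space standing in for a real resolver.
RESOURCES = {
    "object:/samples/address": "<address/>",
    "object:/document/contact": "",
}


def GET(uri):
    return RESOURCES[uri]


def PUT(uri, data):
    RESOURCES[uri] = data


class Button:
    """Minimal stand-in for a GUI widget that fires handlers when pressed."""

    def __init__(self):
        self.handlers = []

    def press(self):
        for handler in self.handlers:
            handler()


def copy_on_event(source_uri, target_uri):
    """Build a handler that GETs from one resource and PUTs to another."""
    def handler():
        PUT(target_uri, GET(source_uri))
    return handler


button = Button()
button.handlers.append(
    copy_on_event("object:/samples/address", "object:/document/contact")
)
```

The handler is fully described by two strings and a verb, which is what makes it plausible for a GUI builder to generate.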
You can interact with other kinds of objects as well. It is possible to express a queue or stack as resources. You can POST to the top of a stack to create new entries. DELETE as necessary to remove them. You can interact with a wide variety of data structures and objects. You can copy that XML document onto an object which accepts an XML representation of that specific format to make it assume a particular state. From a declarative and very simple world view you can trigger effects in your application that are extremely powerful. The power doesn't come from the simple event handler. Instead it comes from exposing application functionality in a RESTful way. The uniformity of this interface makes it possible for software that was never designed for interoperability to interact... and without a base-class in sight.
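A stack exposed this way might look like the Python sketch below. The single "top" segment, the `StackError` name and the choice to return "top" from POST are all assumptions of the example.

```python
class StackError(Exception):
    """Raised when a verb cannot be applied to the stack (hypothetical name)."""


class Stack:
    """A stack whose only exposed resource is its top entry."""

    def __init__(self):
        self._items = []

    def _check(self, segments, need_item):
        if segments != ["top"]:
            raise StackError("only 'top' is exposed, got %r" % (segments,))
        if need_item and not self._items:
            raise StackError("stack is empty")

    def GET(self, segments, query):
        self._check(segments, need_item=True)
        return self._items[-1]

    def POST(self, segments, query, data):
        # POST to the top creates a new entry above the current one.
        self._check(segments, need_item=False)
        self._items.append(data)
        return "top"  # the new entry is now reachable as the top

    def DELETE(self, segments, query):
        # DELETE removes the current top, uncovering the entry beneath.
        self._check(segments, need_item=True)
        self._items.pop()
```

Note that the class inherits from nothing. Any object with methods of these names and shapes could be dropped in alongside it.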
Benjamin