Sound advice - blog

Tales from the homeworld

Sat, 2005-Jul-23

Generic Event Notification Architecture

I was recently asked my opinion of the Generic Event Notification Architecture (GENA). It is a subscription protocol that uses HTTP as its transport. A client makes a subscribe request to a URI, and the server is responsible for returning notifications via separate HTTP requests back to the client. The protocol was submitted by Microsoft to the IETF as a draft in September 2000, and it is a little unclear as to whether it has seen any sort of committed adoption. It may be that it has since been superseded in the minds of Microsoft employees by SOAP-based protocols.

GENA uses an HTTP SUBSCRIBE verb to make requests of the server. The request is submitted to a specific URI which represents the subscribe access point. The subscription must be periodically confirmed with an additional subscription request. One of the HTTP headers in the original SUBSCRIBE response carries what is known as a Subscription ID, or SID. The same SID header must be included in the additional SUBSCRIBE requests. Each subscription can specify the kinds of event notifications this client is interested in receiving, associated with the original resource it subscribed to. SUBSCRIBE requests include the URI that the server should NOTIFY when the event appears.
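
In concrete terms the exchange runs something like the sketch below, written with Python's http.client. The host, paths and callback URI are invented for illustration; the header names (CALLBACK, NT, SID, Timeout) are the ones used by the draft and its UPnP derivative.

  import http.client

  conn = http.client.HTTPConnection("device.example.com")

  # Initial subscription: tell the server where to send NOTIFY requests.
  conn.request("SUBSCRIBE", "/service/events", headers={
      "CALLBACK": "<http://client.example.com:5000/notify>",
      "NT": "upnp:event",         # the kind of notifications wanted
      "TIMEOUT": "Second-1800",   # requested subscription lifetime
  })
  response = conn.getresponse()
  sid = response.getheader("SID")   # opaque subscription identifier
  response.read()                   # drain before reusing the connection

  # Periodic confirmation: the same URI, quoting the SID instead of
  # CALLBACK and NT. Let it lapse and the subscription quietly dies.
  conn.request("SUBSCRIBE", "/service/events",
               headers={"SID": sid, "TIMEOUT": "Second-1800"})
  conn.getresponse().read()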

I have qualms generally about subscription models that require the server to connect back to the client. This confuses matters significantly when firewalls are involved, but on a purely philosophical level it makes what is fundamentally a client-server relationship into one of two peers. I'll get back to that concern, but I think there are other aspects of the protocol that could do with some fine tuning as well.

The protocol is almost RESTful. It allows different things to be subscribed to by specifying different resources. It allows n-layered arbitration between the origin server and clients, just as HTTP's caching permits. It gets confused, though, and I think the SID is a prime example of this. The SID identifies a subscription, but instead of being a URI it is an opaque string that must be returned to the original SUBSCRIBE URI. If I were writing the protocol I would turn this around and clearly separate these two resources. You have a resource that acts as a factory for subscriptions and is the thing you want to subscribe to, and you have a subscription resource. I would suggest that the subscription resource be identified by a complete URI returned in a Location header, to match the effect of POST. It might even be reasonable to use the POST verb rather than a SUBSCRIBE verb for the purpose.

Once the subscription resource is created, it should be able to be queried to determine its outstanding lifetime. A 404 could be returned should the lifetime have been exceeded, and a PUT could be used to refresh the lifetime or even alter the set of events to be returned. From the protocol's perspective, though, it is probably simplest just to define the effect of a SUBSCRIBE operation on the subscription as refreshing the timeout and leave the rest to best practice or a later draft.
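
Made concrete, the lifecycle I am proposing might run as follows. This is a sketch only: none of these URIs, headers or body formats come from the GENA draft, and the subscription path is whatever the server's Location header says it is.

  import http.client
  from urllib.parse import urlsplit

  conn = http.client.HTTPConnection("device.example.com")

  # Subscribe by POSTing to the factory resource; the subscription's own
  # URI comes back in the Location header, matching the effect of POST.
  conn.request("POST", "/service/subscriptions",
               body="callback=http://client.example.com:5000/notify",
               headers={"Content-Type": "application/x-www-form-urlencoded"})
  response = conn.getresponse()
  subscription = urlsplit(response.getheader("Location")).path
  response.read()

  # The subscription is now an ordinary resource. GET reports the
  # outstanding lifetime; a 404 would mean it has already expired.
  conn.request("GET", subscription)
  print(conn.getresponse().read())

  # A PUT refreshes the lifetime, or alters the set of events returned.
  conn.request("PUT", subscription, body="timeout=1800")
  conn.getresponse().read()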

Returning to the issue of how updates are propagated back to clients, I've harped on before about how I believe this needs to be a change to the HTTP protocol rather than just an overlay. I believe that a single request needs to be able to have multiple responses associated with it that will arrive, in the order they were sent, down the same TCP/IP connection the request was made on. Dropping the connection drops all associated subscriptions just as it aborts responses to any outstanding requests. I agree that this approach may not suit loosely-coupled subscribe scenarios that don't want the overhead of one TCP/IP connection for each client/server relationship, but the GENA authors appear to also have been thinking along these lines. The draft includes the following:

We need to add a "connect and flood" mechanism such that if you connect to a certain TCP port you will get events. There is no subscribe/unsubscribe. We also need to discuss this feature for multicasting. If you cut the connection then you won't get any more events.

To turn specific focus back on GENA, I think that the HTTP callback mechanism is still underspecified. In particular it isn't clear what the responsibilities of the server are in returning responses. The server could use HTTP pipelining to deliver a sequence of notifications down the same TCP/IP connection, but what should it do when the connection blocks? The server could try to make concurrent connections when multiple notifications need to be sent, but which will arrive first? Will out-of-order notifications cause the client to perform incorrect processing? Can the client assume that the latest notification represents the current state of the resource? Infinite buffering of events is certainly not an option, so what do you do when you exceed your buffer size? Do you utilise your bandwidth via pipelining, or do you limit your notification rate to the network latency by waiting for the last response before sending another? I don't see any mention in the protocol of an "Updates-Missed" header that might indicate to the client that buffering capabilities had been exceeded.

The specification also allows the server to silently drop subscriptions, a point of which clients may be unaware until it comes time to refresh the subscription. For this to work in practice the cases under which subscriptions could be dropped without notification would have to be well understood.

The actual content being delivered by GENA is unspecified, but GENA does include mechanisms for specifying event types. Personally, I think that the set of resources should be included in the definition of the subscribe URI rather than in a special "NT" or "NTS" header. I think it's more RESTful to create separate resources for these separate things you might want to subscribe to than to alias the SUBSCRIBE for a single resource to mean different things depending on header metadata. If we were to take a RESTful view, we would probably want to assume that each update notification's body was a statement of the current representation of the resource. In some cases a kind of difference might also be appropriate. If caching is to be supported in this model, the meaning of that content would have to be made as clear as possible, and may have to be explicitly specified in a header just as HTTP's chunked encoding is.

In conclusion, GENA is a good start but could do with some tweaking. I don't know whether the RFC is going anywhere, but if it ever does I think it would be interesting to view and refine it through REST goggles.

Benjamin

Sun, 2005-Jul-17

RDF Content

The intersection between RDF and REST is one I've had difficulty finding. RDF seems great on the surface, but problems crop up as soon as I try and think of anything to use it for. I think after my previous article on the purposes of REST verbs and content types I can finally put a finger on my unease.

When a client requests a document, it does so with a specific purpose and a firm idea of what it wants to do with it. The first step in processing the input is to try and transform it into an internal representation suitable for doing the subsequent work. If there's one thing that XML is good at, it is transformation. If there's one thing that RDF is good at, it's aggregation. I think that the reason RDF has not yet hit its mark is that transformation is a more important function than aggregation when it comes to most machine to machine interactions.

RDF can be expressed in XML, and many people will tell you what's wrong with the current standard and try to offer alternative solutions. Some will complain that it is overly verbose. My beef is simply that there are too many ways to say the same thing, and when you have multiple representations to deal with on your input side your transformation code must become more complicated. It strikes me that most of the document describing the current rdf/xml standard is used up explaining the many different ways you can say the same thing in an attempt to reduce verbosity.
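
To see the problem concretely, here are two legal rdf/xml spellings of one and the same statement. The namespace and URIs are made up, and the comparison uses the rdflib library; the point is that a transform keyed to the first spelling silently misses the second, even though the triples are identical.

  import rdflib
  from rdflib.compare import isomorphic

  # One statement, written once with a property element...
  doc_a = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                      xmlns:ex="http://example.com/ns#">
    <rdf:Description rdf:about="http://example.com/thing">
      <ex:title>Sound advice</ex:title>
    </rdf:Description>
  </rdf:RDF>"""

  # ...and once with the property abbreviated to an XML attribute.
  doc_b = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                      xmlns:ex="http://example.com/ns#">
    <rdf:Description rdf:about="http://example.com/thing"
                     ex:title="Sound advice"/>
  </rdf:RDF>"""

  graph_a = rdflib.Graph().parse(data=doc_a, format="xml")
  graph_b = rdflib.Graph().parse(data=doc_b, format="xml")

  # The graphs are identical; XPath or XSLT over the documents is not.
  assert isomorphic(graph_a, graph_b)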

So while it is possible to create RDF-compatible XML that is easy to transform, it isn't possible to tell someone simply that your service returns rdf/xml of a particular rdf schema and hope that you're making things easier for them. You're much better off giving them RELAX-NG instead.

Despite this flaw, I still think RDF is useful. Despite it currently being harder than it needs to be to transform, it does make aggregation possible. XML doesn't support that itself at all. So like the previous article's "we use verbs so that caching works" theme, today's theme will be "we use RDF-compatible XML as our content type so that aggregators work". Aggregators are intermediaries like caches or databases that have to hang onto data and its meaning on behalf of clients who might come along later. Even this could have its problems. A pure RDF aggregator would require client software to still be quite complex in order to process (transform) the returned RDF. I suspect that specialised aggregators like those for rss and atom will be more fundamentally useful in the short term. The solution seems to be to improve the transformability of RDF generally, although I don't have a fundamentally good answer as to how. The use of rules engines like Jena may have some impact.

Of course, this is only about how RDF intersects with REST. I think RDF is proving itself mightily in the RDBMS sphere. Both client and server applications can back themselves with RDF triple stores that support ad hoc data insertion and query. No central authority has to be in charge of the schema, and this responsibility can be distributed amongst different groups. In fact, I would say that trying to design a database technology these days without considering RDF would be a bit of a waste of time. All the good SQL-only databases have been written already, and most of these have also seen the RDF light.

Benjamin

Sat, 2005-Jul-16

File Tagging

Watching my mother trying to use Windows XP to locate her holiday snaps makes it clear to me that tagging is the right way to interact with personal documents. The traditional "one file, one location" filesystem is old and busted. The scenario begins with my mother learning how to take pictures from her camera and put them into folders. Unfortunately, my father is still the one managing short movie files. The two users have different mental models for the data. They have different filing systems. Mum wants to find files by date, or by major event. Dad thinks that movie files are different to static images and that they should end up in different places. The net result is that Mum needs to learn how to use the search feature in order to find her file, and is lucky to find what she is looking for.

Using tags we would have a largely unstructured collection of files. The operating system would be able to apply tags associated with type automatically, so "mpeg" and "video" might already appear. The operating system might even add tags for time and date. The user might add additional tags such as "21st Birthday" or "Yeppoon Trip". Tags are associated with the files themselves and can be added from anywhere you see the file. You'll then be able to find the file via the additional tag. This approach seems to work better than searching or querying does for non-expert computer users. A query has to be constructed from ideas that aren't in front of the user. Tags are already laid out before them.

Here is one attempt at achieving a tagging model in a UNIX filesystem. I'm not absolutely sure that soft links are the answer. Personally I wonder if we need better hard links. If you delete a hard link to a file from a particular tag set it would disappear from there but stay connected to the filesystem by other links to it. It shouldn't be possible to make these links invalid like it is with symbolic links. Unfortunately hard links can't be used across different filesystems. It would be nice if the operating system itself could manage a half-way point. I understand this would be tricky with the simplest implementation resulting in a copy of the file on each partition. Deciding which was the correct one when they dropped out of sync would be harmful. Perhaps tagging should simply always happen within a single filesystem. Hard links do still have the problem that a rename on one instance of the file doesn't trigger a rename to other tagged instances.
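
As a toy illustration of the hard-link approach, here is a sketch in Python. The directory layout is invented, and it assumes everything lives on one filesystem, which is exactly the limitation discussed above.

  import os

  TAG_ROOT = "/home/mum/tags"   # one directory per tag; layout invented


  def tag(path, *tags):
      """Hard-link a file into each tag directory. The file stays
      reachable through every remaining link if one is deleted, and a
      hard link can't dangle the way a symbolic link can. But links
      can't span filesystems, and renaming one doesn't rename the rest."""
      for name in tags:
          tag_dir = os.path.join(TAG_ROOT, name)
          os.makedirs(tag_dir, exist_ok=True)
          link = os.path.join(tag_dir, os.path.basename(path))
          if not os.path.exists(link):
              os.link(path, link)


  tag("/home/mum/camera/img_0042.jpg", "Yeppoon Trip", "21st Birthday")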

Benjamin

Sat, 2005-Jul-16

The Visual Display Unit is not the User Interface

One thing I've noticed as I've gotten into user interface design concepts is that the best user interface is usually not something you see on your computer screen. When you look at an iPod it is clear how to use it, and what it will do. A well designed mobile phone such as the Nokia 6230 my wife carries makes it easy to both make phone calls and to navigate its menus for more sophisticated operations. A gaming console like the PS2, the Xbox, or the Game Boy is easy to use. Easier than a PC, by miles.

Joel Spolsky has a wonderful work online describing how user interfaces should be designed. He dedicates a whole chapter to "Affordances and Metaphors", which very simply amounts to giving the user hints on when and how to click. In the same document he highlights the problem that makes desktop software so hard to work with generally:

Users can't control the mouse very well.

I've noticed that whenever I'm finding a device easy to use, it is because it has a separate physical control for everything I want to do. Up and down are different buttons, or are different ends of a tactile device. I don't have to imagine that the thing in the screen is a button. Instead, the thing in the screen obviously relates to a real button. I just press it with my stubby fingers and it works.

So maybe we should be thinking just as much about what hardware we might want to give to users as we think about how to make our software work with the hardware they have already. Wouldn't it be nice if instead of a workspace switcher you had four buttons on your keyboard that would switch for you? Wouldn't it be nice if instead of icons on a panel, minimised applications appeared somewhere obvious on your keyboard so you could press a button and make the applications appear? There wouldn't be any need for an Exposé feature. The user would always be able to see what was still running but not visible.

Leon Brooks points to a fascinating keyboard device that includes an LCD on each button. This can be controlled by the program with current focus to provide real buttons for what would normally only be "visual buttons". I think this could make the app with focus much more usable, both by freeing up screen real estate for things the user actually wants to see and by making buttons tangible instead of just hoping they look like something clickable. Personally I would have doubts about the durability of a keyboard like this, but if it could be made without a huge expense and programs could be designed to work with it effectively I think it could take off.

We can see this tactile approach working already with scrollwheel mice and keyboards. It is possible to get a tactile scroll bar onto either device without harming its utility and while making things much simpler to interact with. Ideally, good use of a keyboard with a wider range of functions would remove entirely the need for popup menus and for buttons on the screen. In a way this harks back to keyboard templates like those for WordPerfect. I wonder if, now the excitement over GUIs and mice has died down, this approach will turn out to be practical after all.

Update 18 October 2005:
United Keys has a keyboard model that is somewhat less radical in design and looks to be a little closer to market. Thanks to commenter Tobi on the Dutch site Usabilityweb for pointing it out. I like the colour LCD promised in the Optimus keyboard, but suspect that the United Keys approach of segregating regular typing keys from programmable function keys will wear better. Ultimately it has to both look good and be a reasonable value proposition to attract users.

Benjamin

Sat, 2005-Jul-16

Solaris C++ Internationalisation

One of my colleagues has been tasked with introducing infrastructure to internationalise some of our commercial software. We are currently running Solaris 9 and using the Forte 8 C++ compiler. We decided that the best way to perform the internationalisation would be to use a combination of boost::format and something gettext-like that we put together ourselves. That's where the trouble started.

The first problem was the compiler itself. It can't compile boost::format due to some usage of template metaprogramming techniques that were slightly beyond its reach. It quickly became clear that upgrading to version 10 of the compiler would be necessary, and even then patches are required to build the rest of boost.

That was problem number one, but the hard problem turned out to be in our use of the -library=stlport4 option. STLport appears not to support locales under Solaris. We've been tracking Forte versions since the pre-standardisation 4.2 compiler, and that's just while I've been working there. We originally used STLport because there was no alternative, but when we did upgrade to a compiler with a (Rogue Wave) STL we found problems changing over to it. When we got things building and our applications were fully loaded up with data, we found they used twice as much memory as the STLport version. At the time we didn't have an opportunity to upgrade our hardware, so that kind of change in memory profile would have really hurt us. With no impetus for change we decided to stick to the tried and true.

By the time Forte hit version 8 it had the -library=stlport4 option to use an inbuilt copy of the software, and we stopped using our own controlled version. We found at the time that a number of STL-related problems being reported through SunSolve were being written off with "just use stlport", so we weren't keen to try the default STL again. These days it looks like this inbuilt STL hasn't been modified for some years. It does support non-C locales, but moving our software over is a new world of pain.

Another alternative was to use gcc. Shockingly, the 3.4.2 version available from sunfreeware produced incorrect code for us when compiled with -O2 for sparc. This also occurred in the latest 3.4.4 version shipped by blastwave. I haven't looked into the problem personally to ensure it isn't something we're doing, but the people who did look into it know what they're doing. Funnily, although the sfw 3.4.2 version did support the full range of locales, blastwave's 3.4.4 did not. We would have been back to square one again.

So, the summary is this: if you want to internationalise C++ code under Solaris today you have very few good choices. You can run gcc, which seems to have some dodgy optimisation code for sparc... but make sure you get it from the right place or it won't work. You can use Forte 10, but you can't use the superior STLport for your standard library. C++ is essentially a dead language these days, so don't count on the situation improving. My guidance would be to drop sparc as quickly as you can, and use gcc on an Intel platform where it should be producing consistently good code.

Benjamin

Sat, 2005-Jul-16

REST Content Types

So we have our REST triangle of nouns, verbs, and content types. REST is tipping us towards placing site to site and object to object variation in our nouns. Verbs and content types should be "standard", which means that they shouldn't vary needlessly but that we can support some reasonable levels of variation.

Verbs

If it were only the client and server involved in any exchange, REST verbs could be whittled down to a single "DoIt" operation. Differences between GET, PUT, POST, DELETE, COPY, LOCK or any of the verbs which HTTP in its various forms supports today could be managed in the noun-space instead of the verb space. After all, it's just as easy to create a https://example.com/object/resource/GET resource as it is to create https://example.com/object/resource with a GET verb on it. The server implementation is not going to be overly complicated by either approach. Likewise, it should be just as easy to supply two hyperlinks to the client as it is to provide a single hyperlink with two verbs. Current HTML "A" tags are unable to specify which verb to use in a transaction with the href resource. That has led to tool providers misusing the GET verb to perform user actions. Instead of creating a whole HTML form, they supply a simple hyperlink. This of course breaks the web, but why it does is not as straightforward as you may think.

Verbs vs Delegates

Delegates in C# and functions in Python give away how useful a single "doIt" verb approach is. In a typical O-O observer pattern you need the observer to inherit from or otherwise match the specification available for a baseclass. When the subject of the pattern changes it looks through its list of observer objects and calls the same function on each one. It quickly becomes clear when we use this pattern that the one function may have to deal with several different scenarios. One observer may be watching several subjects, and it may be important to disambiguate between them. It may be important to name the function in a more observer-centric rather than subject-centric way. Rather than just "changed", the observer might want to call the method "openPopupWindow". Java tries to support this flexibility by making it easy to create inner classes which themselves inherit from Observer and call back your "real" object with the most appropriate function. C# and Python don't bother with any of the baseclass nonsense (and the number of keystrokes required to implement it) and supply delegates and callable objects instead. Although Java's way allows for multiple verbs to be associated with each inner object, delegates are more "fun" to work with. Delegates are effectively hyperlinks provided by the observer to the subject that should be followed on change, issuing a "doIt" call on the observer object. Because we're now hyperlinking rather than trying to conceptualise a type hierarchy, things turn out to be both simpler and more flexible.
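
A minimal Python sketch of the delegate style; the class and method names are invented. The subject holds plain callables, so the observer hands over whichever observer-centric method it likes:

  class Subject:
      def __init__(self):
          self.callbacks = []   # hyperlink-like references to observers

      def subscribe(self, callback):
          self.callbacks.append(callback)

      def changed(self):
          for callback in self.callbacks:
              callback(self)    # the single "doIt" verb


  class Popup:
      # An observer-centric name; no baseclass or inner class needed.
      def open_popup_window(self, subject):
          print("opening popup for", subject)


  subject = Subject()
  popup = Popup()
  subject.subscribe(popup.open_popup_window)   # the "hyperlink"
  subject.changed()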

The purpose of verbs

So if not for the server's benefit, and not for the client's benefit, why do we have all of these verbs? The answer for the web of today is caching, but the reasoning can be applied to any intermediary. When a user does a GET, the cache saves its result away. Other verbs either mark that cache entry dirty or may update the entry in some way. The cache is a third party to the conversation and should not be required to understand it in too much detail, so we expose the facets of the conversation that are important to the cache as verbs. This principle could apply any time we have a third party involved whose role is to manage the communication efficiently rather than to become involved in it directly.

Server Naivety and Client Omniscience

In a client/server relationship the server can be as naive as it likes. So long as it maintains the basic service constraints it is designed for, it doesn't care whether operations succeed or fail. It isn't responsible for making the system work. Clients are the ones who do that. Clients follow hyperlinks to their servers, and they do so for a reason. Whenever a client makes a request it already knows the effect its operation should have and what it plans to do with the returned content. To the extent necessary to do its job, the client already knows what kind of document will be returned to it.

A web browser doesn't know which content type it will receive. It may be HTML, or some form of XML, or a JPEG image. It could be anything within reason, and within reason is a precisely definable term in this context. The web browser expects a document that can be presented to its user in a human-readable form, and one that corresponds to one of the standard content types it supports for this purpose. If we take this view of how data is handled and transfer it into a financial setting where only machines are involved, it might read like this: "An account reconciler doesn't know which content type it will receive. It may be ebXML, or some form of OFX, or an XBRL report. It could be anything within reason, and within reason is a precisely definable term in this context. The reconciler expects a document that can be used to compare its own records to those of a supplier or customer and highlight any discrepancies. The document's content type must correspond to one of the standard content types it supports for this purpose."

REST allows for variations in content type, so long as the client understands how to extract the data out of the returned document and transform it into its own internal representation. Each form must carry sufficient information to construct this representation, or it is not useful for the task and the client must report an error. Different clients may have different internal representations, and the content types must reflect those differences. HTTP supports negotiation of content types to allow for clients with differing supported sets, but when new content types are required to handle different internal data models it is typically time to introduce a new noun as well.

Hyperlinking

So how does the client become this all-knowing entity it must be in every transaction it participates in? Firstly, it must be configured with a set of starting points or allow them to be entered at runtime. In some applications this may be completely sufficient, and the configuration of the client could refer to all URIs it will ever have to deal with. If that is not the case, it must use its configured and entered URIs to learn more about the world.

The HTML case is simple because its use cases are simple. It has two basic forms of hyperlink: the "A" and the "IMG" tags. When it comes across an "A" it knows that whenever that hyperlink is activated it should look for a whole document to present to its user in a human-readable form. It should replace any current document on the screen. When it comes across "IMG" it knows to go looking for something human-readable (probably an actual image) and embed it into the content of the document it is currently rendering. It doesn't have to be any more intelligent than that, because that is all the web browser needs to know to get its job done.

More sophisticated processes require more sophisticated hyperlinks. If they're not configured into the program, it must learn about them. You could look at this from one of two perspectives. Either you are extending the configuration of your client by telling it where to look to find further information, or the configuration itself is just another link document. Hyperlinks may be picked up indirectly as well, as the result of POST operations which return "303 See Other". As the omniscient client it must already know what to do when it sees this response, just as a web browser knows to chase down that Location: URI and present its content to the user.

There is a danger in all things of introducing needless complexity. We can create new content types until we're blue in the face, but when it comes down to it we need to understand the client's requirements and internal data models. We must convey as much information as our clients require, and have some faith that they know enough to handle their end of the request processing. It's important not to over-explain things, or include a lot of redundant information. The same goes for types of hyperlinks. It may be possible to reduce the complexity of documents that describe relationships between resources by assuming that clients already know what kind of relationship they're looking for and what they can infer from a relationship's existence. I think we'll continue to find, as we have found in recent times, that untyped lists are most of what you want from a linking document, and that using RDF's ability to create new and arbitrary predicates is often overkill. My guide for deciding how much information to include is to think about those men in the middle who are neither client nor server. Think about which ones you'll support and how much they need to know. Don't dumb it down for the sake of anyone else. Server doesn't care, and Client already knows.

Benjamin

Mon, 2005-Jul-11

Object-Orientation Without Baseclasses

I've been working on the impedance mismatch between REST and Object-Orientation. It's a thorny issue, and I've come across several RESTafarians who believe the concepts don't mesh at all. I think that some approaches taken so far have tried too hard to make a resource into an object or an object into a resource. My approach is to match O-O and REST via a less drastic (or more, depending on your point of view) overhaul. I want to change an object's model of abstraction from a set of functions into a set of resources. I want to drop the concepts of base classes and class hierarchies altogether. I want to talk about aspects of an object rather than its whole type or its constituent functions.

Queues and Stacks

Let's leave aside the network-oriented views of how REST works for now. Let's just think of a client object and a server object. To interact with an O-O queue, a client object might be presented with several functions: say, insert, remove, and a function to examine the first entry.

You could also add other useful functions like checking the size of the queue or examining entries other than the first. You can do the same sort of thing with a REST queue. You just need to think in terms of resources instead of functions: an insertion point, and a beginning-of-queue resource.

A POST to the insertion point resource creates a new queue entry. A GET to the beginning allows the client to examine its content. A DELETE to the beginning (or is that a POST, too?) clears the current value and replaces it with the next one. You might even be able to get away with a single resource representing both an insertion and an extraction point. Either way, with a minimum of verbs and a maximum of nameable resources it is possible to achieve the same kinds of abstractions as we're used to in O-O. It would be simple in either model to change the implementation to a stack while maintaining the same set of functions and type, or of resources, for the use of client software.
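
Here's a rough Python rendering of that resource-shaped queue, one method per verb per resource; the resource and method names are invented:

  from collections import deque


  class RestQueue:
      """A queue abstracted as two resources rather than a set of
      queue-specific functions."""

      def __init__(self):
          self._items = deque()

      # The insertion point resource: POST creates a new entry.
      def insertion_point_POST(self, content):
          self._items.append(content)

      # The beginning-of-queue resource: GET examines, DELETE consumes.
      def beginning_GET(self):
          return self._items[0]

      def beginning_DELETE(self):
          self._items.popleft()


  queue = RestQueue()
  queue.insertion_point_POST("first job")
  assert queue.beginning_GET() == "first job"
  queue.beginning_DELETE()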

Representations, not Types

REST pushes the possibilities a little further forward by using only representations of objects rather than objects themselves as parameters to its verbs. This sets the type barrier that we often see in O-O much lower, so a representation on disk can be applied to an object in memory or be used to create a new object. Objects of different types but compatible representations can be applied to each other. Even incompatible objects which share only small aspects of their representations with other objects can be squeezed together so long as they provide that aspect of themselves as a resource. It is conceivable that both the stack and queue implementations share the same representation, making it possible to copy one into the other without the need for explicit conversion.

Modelling in an O-O language, and Object-Orientation Without Baseclasses

My picture for linking the O-O and REST world views within an application calls for the concept of a resource type. This type can be implemented explicitly as an interface or baseclass, or might be implemented implicitly depending on your language of choice. It is important that this single definition capture all verbs that you might want to apply to resources. A simple implementation of this type would accept function pointers, or delegates, or callable objects, or whatever your language supports to map directly back into the O-O paradigm. For example, an object representing an XML document might be composed of one explicit resource and one family of resources. The main one is a representation of the whole XML document, which can be PUT and GET to with the usual outcomes. The second I call a family, because it is infinitely variable via a query. It represents an xpath-selectable part of the document which can also be PUT and GET to, as well as POSTed to and DELETEd. These would map to one function per verb per resource on my XML object. Verbs on the main resource just become PUT() and GET(). Verbs on anything in the xpath family become xpathPUT(query), xpathGET(query), xpathPOST(query), and xpathDELETE(query).
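
A cut-down sketch of that object in Python, using ElementTree and covering just the GET and PUT verbs; everything beyond the verb-per-resource naming scheme described above is my own invention:

  import xml.etree.ElementTree as ET


  class XMLResource:
      """One object, several resources: the whole document, plus an
      xpath-selected family with one function per verb per resource."""

      def __init__(self, text="<root/>"):
          self._root = ET.fromstring(text)

      # Verbs on the main resource: the whole document.
      def GET(self):
          return ET.tostring(self._root, encoding="unicode")

      def PUT(self, text):
          self._root = ET.fromstring(text)

      # Verbs on the xpath family. ElementTree speaks only a subset of
      # xpath, which is enough for a sketch.
      def xpathGET(self, query):
          return ET.tostring(self._root.find(query), encoding="unicode")

      def xpathPUT(self, query, text):
          node = self._root.find(query)
          node.clear()                      # replace the node's content...
          node.append(ET.fromstring(text))  # ...with the new representation


  doc = XMLResource("<root><entry>old</entry></root>")
  doc.xpathPUT("entry", "<value>new</value>")
  print(doc.GET())   # <root><entry><value>new</value></entry></root>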

When dealing with the XML document object, you could pass the resource representing the whole object around and have code that can produce XML replace the document with a new one. If you passed around resources representing only a specific xpath-selectable part of the document it would only be that part that was replaced. If your xpath selected a text or attribute node within the XML you could pass that resource to anything that produces a string and have the node value replaced or used in further processing. In the end, you have a single self-consistent object that can expose any aspect of itself it chooses as a resource to be operated on by anything it likes.

When you've had type as a constant friend and enemy for so many years the thought of minimising it in this way can be daunting, but Python took the first stab at this approach. It did so by dropping the type modelling for variables and parameters but keeping it for objects themselves. It is still possible to see TypeError exceptions raised in Python because you passed in a DOM attribute node instead of a string, yet the two are equivalent for GETs, and mutable variants of string are equivalent for most purposes. By dropping the type barrier further I think we'll find that for the most part we don't really need type as often as we think we do.

URI-space

So as we remove and reduce the use of exotic verbs and types from the way objects interact with each other, we also need to keep in mind the nouns side of things. It's ok to pass resource pointers around within a program until the first time you want to use the result of a resource's GET operation as a resource itself. That's when you need a URI space and a means of creating or finding resource objects via that space.

The simplest approach might still be to use your language's object identifiers. Python has its id() function. C++ has its pointers. Java no doubt has the same capabilities, as I'm sure I've seen them in the debugger often enough. object:0xfff3673 might be just the ticket to locate your resource quickly. On the other hand you might want to have names that are able to survive serialisation and process restarts. Whichever way you go, you'll also have to deal with that pesky query case that my XML object above makes use of.

So we've identified the resource objects. We also need a resolver object to find resources. To deal with queries, you need an additional object that will return you a resource object to use when provided a query. With a URI that looks like "object:path?query", the path will resolve to the resource finder object. When you look up the found resource it will most likely call a function on the object that owns this resource that looks something like object.GET(query). If the path turns out to be hierarchical you may also wish to have objects or a class that represent levels within the hierarchy. In my current Python prototype I'm using the objects themselves as steps along the path. A few well-named top-level objects are added to the resolver's namespace and Python allows easy navigation down the object hierarchy.
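
The resolver itself can stay tiny. A sketch of the idea, simplified from what a real prototype would need; the registry and walking rules here are illustrative only:

  from urllib.parse import urlsplit

  top_level = {}   # a few well-named objects registered at the root


  def resolve(uri):
      """Split an object: URI into a resource and an optional query,
      walking the object hierarchy itself as the path."""
      parts = urlsplit(uri)
      steps = parts.path.split("/")
      node = top_level[steps[0]]
      for step in steps[1:]:
          node = getattr(node, step)   # the objects are the path steps
      return node, parts.query or None


  # resolve("object:main/document?entry") walks top_level["main"], then
  # its .document property, and hands "entry" to the verbs as the query.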

What's left?

Well, obviously you need to find the right fit for the kind of language you're using today to make use of any new approach. For myself I'm reasonably comfortable with applying the model generally to Python, C#, Java, C++, and C. I'm more or less comfortable with everything except the POST and DELETE verbs. These verbs deal with resource creation and destruction, but because an object is composed of several resources such actions are bound to result in side-effects on other resources. In the XML example I've been using, it would be simple to POST to a factory resource in order to create our XML file. Once we've done that and the POST has returned us the address of the new main URI, how do we find the xpath URI? Several approaches are possible including,

The latter suggestions seem decreasingly restful, but the earlier ones require further thought about how to implement correctly. I'm a little uncomfortable generally thinking about verbs that alter the lifetime of a URI, especially after all of the popular O-O languages on earth have just managed to shrug most explicit lifetime management out of their objects. Refinement of the meaning clients can extract from these and other verbs will be important going forwards. Coming up with new design patterns and examples of how to model certain types of objects using resources will be equally important.

I see significant potential in the overall REST approach for objects. I think that it could supplement, or even replace, O-O design in many areas over the longer term. In the mean time there is a lot of growing and development to do. I think that seeing it work inside every-day apps could accelerate development of the underlying ideas and improve adoption when compared to the "it only works on the web" mantra. I think that if it really only works on the web then there is something fundamentally broken about it. If it only works on the web, we should be talking about the more Object-Oriented WS-* for web development instead. We at least know that paradigm works somewhere.

Benjamin

Fri, 2005-Jul-08

REST in an Object-Oriented Language

I have received interesting feedback on my last article from a number of sources. Sometimes it is just as interesting to see responses to what I say, and to what readers hear rather than what I mean. A Mr "FuManChu" took me to task over my treatment of POST and I responded with a comment to his blog. Mike Dierken is concerned about my advice to reduce the number of content types in play to be more RESTful. He writes:

REST actually drives toward increasing content-types. Not necessarily maximizing, but definitely opening up the space of possibilities. So I would say that REST seeks to push the balance away from verbs and as close to the edge of nouns and content-types.

I would respond firstly with this quote from Roy Fielding:

REST enables intermediate processing by constraining messages to be self-descriptive: interaction is stateless between requests, standard methods and media types are used to indicate semantics and exchange information, and responses explicitly indicate cacheability.

I will go on to fill out what I think is a simple difference in terms shortly.

Mike does make the excellent point that

REST does not have a heirarchical namespace. Neither does HTTP. They only have identifiers that are opaque to the client.

Even though the path space in URIs is explicitly hierarchical, with relative links forming an integral part of the specification, REST is based on hyperlinks. We should not be constructing one URI from another. We should be following hyperlinks instead. This was a weakness in my article first pointed out to me by Ken MacLeod via email.

Mark Baker says, and Henry Story echoes:

Nope, properties are properties, resources are the Beans themselves. Imagine a "Person" Bean which might expose the properties; name, birth date, birth city. Now imagine that Bean as a resource, with its own http URI; invoke GET on it and you receive, say, an XML document

You are right that the person may be a resource in their own right. I would say simply that if you are doing GET and PUT operations on the person object it is a resource. The reason I say that a bean's properties are resources (possibly as well) is that they do support GET (get) and PUT (set) operations in their own right. Depending on your application it may be appropriate to expose just the bean or just its properties as resources. It may be appropriate to expose both or neither. My contention is simply that uniform GET and PUT operations applied to any named entity make that entity a resource, or at worst a pseudo-resource. They highlight the natural tension that already exists in software between the O-O and REST world views.

Several other people have also commented using their blogs. Thanks to Technorati at least some of the links have been easy for me to find. Please keep the feedback coming, and I recommend Technorati membership to anyone whose blog is not currently being indexed by those kind folk. I've picked up several responses only through my site statistics, and many of these I'll have missed as I can only see the last twenty under my current hosting arrangements.

Content Types

I used the term "content type" deliberately in my previous article to evoke the content types identified in the http header. The most common one is surely text/html. The ubiquitous nature of html has been a major component of the success of the web. Instead of having each web site offer content in its own proprietary format, we could rely on web sites that wanted to serve hypermedia using something approximating html. While there are many variants as content providers have asked more of the format, the basics have remained the same and we've been able to settle on something that at worst displeases everyone equally.

When html is not well-formed, or is not completely understandable it may not render correctly. The human viewer can probably still make out everything they need to, and can drop back to the source if they get particularly desperate. The application/xml content type doesn't have such luxuries. As Sean McGrath noted recently in his blog:

XML is not - repeat NOT - a 'file format'

While it is not a file format, it is a content type. It seems that just when you can't stand to send or receive anything at all out of the ordinary, we get lax about what the actual content is going to be.

I think that in the end it doesn't matter, for the same reason that nouns are more important than content types. I think that type is the least of your worries when a client makes a request of a server. The server could reject the request for a huge variety of reasons. Perhaps a number that forms part of the content is greater than ten. Perhaps only Chinese text is accepted. The semantics of what a particular URI accepts with its PUT verb are entirely up to the URI owner, and can only be construed from the name of that specific resource. Making sure you send XML instead of ASN.1, or that your XML matches a specific schema, is the least of your worries in executing a successful transaction with the resource.

I spoke last time about pushing the balance of the (noun, verb, content type) triangle away from both verb and content type. Again, this is from the perspective of Object-Orientation's typing system, where it is necessary to define conversion operations, the creation of new types requires the matching creation of new functions, and if you don't have an exact match things just won't work and may not even compile. Given that knowledge of content type is not sufficient to ensure acceptance by the resource you're talking to, and that it is always the named resource that performs the test for acceptance, I think that content types are probably irrelevant when it comes to mapping REST into an Object-Oriented environment. "Format" is more important. In my view, when something is irrelevant to the correctness of your program the bar should be set as low as possible.

Mapping REST into an Object-Oriented Environment

As I've stated before, I believe if REST actually works it will work in a regular (non-web) programming environment. Further, I think that if it fails outside the web it has a good chance of failing on the web itself for application development.

So, let's set a few rules for ourselves. What restrictions do we need to put on ourselves to avoid fooling ourselves that we're being restful?

  1. All object APIs should be expressed as resources
  2. All resource access should be via a consistent noun concept
  3. No exotic verbs (functions) should be allowed on resources
  4. Navigating through the resource space except via hyperlinks is illegal

Exceptions can be made for objects involved in the mechanics of the RESTful object system, and when data is not in motion (it is in some object's member variable) it can be in whatever form the object desires.

Nouns

So, what should our nouns look like? Our first thought should be that URIs are like pointers or references. We could make those our nouns. Each pointer could refer to an object of type "Resource" with the appropriate verbs on its surface. This model suffers, somewhat, when we think about exactly what we're going to use and return in the PUT, GET, and POST functions.

The second iteration for me is the KISS approach. Let everything be a string, from noun to content type. After all, a noun could and should be used as content from time to time. There is no reason to make artificial distinctions between the two concepts. In my prototype I've been using a scheme called "object" to refer to object-provided resources, with URIs of the form object:path?query.

The resolver of object URIs starts with a set of objects with registered names at the top level. To assist in the mapping onto a pure O-O model it navigates down the object's hierarchy of properties until it meets one that is prepared to perform the standard verb operations for the remainder of the unconsumed part of the URI path, or until it reaches the end of its path. If no handler is found, it handles the verbs on the end node itself. This approach allows any object along a path to take over the handling of the rest of the URI text. In the example above, an object of type XMLResource handles its own path. It looks for an xpath entry immediately below it in the URI and parses any query part as an xpath expression. This kind of thing is profoundly useful. It can be a private hell to actually try and create one resource object for each actual resource.

Verbs

I define the following global operations:

These map to operations on objects of:

Errors are currently reported via exceptions. All operations are currently synchronous, including access to remote services such as http.
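
As a sketch of the shape this takes, assuming the four verbs used throughout these articles; the registry below stands in for the resolver, and the names are illustrative rather than the prototype's own:

  _resources = {}   # uri -> resource object; a stand-in for the resolver


  class ResourceError(Exception):
      """All failures come back to the caller as exceptions."""


  def _lookup(uri):
      try:
          return _resources[uri]
      except KeyError:
          raise ResourceError(uri)


  def GET(uri):
      return _lookup(uri).GET()


  def PUT(uri, content):
      _lookup(uri).PUT(content)


  def POST(uri, content):
      return _lookup(uri).POST(content)   # returns the new resource's URI


  def DELETE(uri):
      _lookup(uri).DELETE()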

Content Type

There is probably a performance hit associated with the "all string, all the time" approach to content types. Especially in cases where you have a complex data structure behind one resource and the same kind of structure behind another, it may be a useful shortcut to avoid the conversion to string. I think this should be doable with a minimum of code. Caching of transformations may also be of use if performance becomes a problem. I think that overall the advantages of a REST approach should make these problems irrelevant, but you may not want to code any tight loops with GET and PUT operations inside them any time soon :)

Variables and parameters

Variables in a RESTful application become aliases for content or aliases for URIs. They aren't evil, but they don't work like O-O variables and parameters do today. You can't operate on them except by passing them to other functions, or using them in the set of standard verb operations. Instead of passing an object to a function, you'll end up passing in specific sub-resources of the object. Perhaps you'll even pass in structures that form collections of resources, to help things swing back towards the O-O way of doing things which has served us so well over the years. The main thing will be a switch away from the view of a type being a single monolith, and instead seeing the individual resources or aspects of that type as separate entities from the function's point of view.

Event Handling

I'm currently working with an event handling model based on a GET to one resource followed by a PUT, POST, or DELETE to another. Given the REST approach this technique can actually be quite powerful. It is possible to cause a button press to copy a sample piece of XML into a specific (xpath-located) part of an XML resource. It can copy the entire resource to or from the web or a file. It can update its attributes and text nodes with input from the user. It effectively allows you basic XForms capability. I think it is even simple enough to build IDE and GUI builder tools to create these handlers.
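
In Python the whole handler can collapse to a closure over two URIs. The URIs below are invented, and GET and PUT are global verb functions in the style sketched earlier:

  def make_copy_handler(source_uri, target_uri):
      """Declarative event handler: on the event, GET one resource and
      PUT the result to another."""
      def on_event():
          PUT(target_uri, GET(source_uri))
      return on_event


  # Wiring a button press to copy sample XML into an xpath-located part
  # of a document:
  on_press = make_copy_handler(
      "object:samples/row",
      "object:document/xpath?/table/row[1]")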

You can interact with other kinds of objects as well. It is possible to express a queue or stack as resources. You can POST to the top of a stack to create new entries. DELETE as necessary to remove them. You can interact with a wide variety of data structures and objects. You can copy that XML document onto an object which accepts an XML representation of that specific format to make it assume a particular state. From a declarative and very simple world view you can trigger effects in your application that are extremely powerful. The power doesn't come from the simple event handler. Instead it comes from exposing application functionality in a RESTful way. The uniformity of this interface makes it possible for software that was never designed for interoperability to interact... and without a base-class in sight.

Benjamin

Tue, 2005-Jul-05

REST versus Object-Orientation (and a little python)

Initial revision: 4 July 2005.
Edit: 5 July 2005. Since I wrote this somewhere between the hours of three and five in the morning I've decided to exercise my right to update its content slightly. I've added a few more points missed in the first publication. Thanks to Ken MacLeod for pointing out weakness in the original version's discussion of global namespaces. Actually, on second reading I've misconstrued his email slightly. His contention is that URIs are more like OIDs (pointers, references, or whatever you happen to call them in your language of choice). They are things you can dereference to get to the resource you're after, so aren't comparable to the ideas of a global namespace or of local variables. This is an important distinction that I haven't fully considered the consequences of, yet.

I think I'm at the stage where I can now compare REST and Object-Orientation from a practitioner's viewpoint. Now, when most people talk about REST they are referring to its use as a best-practice design guide for the web. When I talk about it I'm speaking from the viewpoint of a developer of modular and distributed software. I speak of web services, and typically not of the use of HTML and web browsers. I speak of software design, not web site design.

Similarities

From my perspective, the concepts of Object Orientation (OO) and REST are comparable. They both seek to identify "things" that correspond to something you can talk about and interact with as a unit. In OO we call it an object. In REST we call it a resource. Both OO and REST allow some form of abstraction. You can replace one object with another of the same class or of the same type without changing how client code interacts with the object. The type is a common abstraction that represents any suitable object equally well. You can replace one resource with another as well, almost arbitrarily, without changing how clients interact with it. Resources are slightly finer-grained concepts than objects, though, so when you talk about resources acting as an abstraction you usually need to talk about the URI space (the resource namespace) being able to stay the same while the code behind it is completely replaced. Clients can interact with the new server software through its resources in the same way as they interacted with the old server software. The resources presented represent both equally well.

Differences

The main difference in my view is that of focus. Objects focus on type as the set of operations you can perform on a particular object. On the other hand REST says that the set of operations you can perform on resources should be almost completely uniform. Instead of defining new operations, REST emphasises the creation and naming of new resources. As well as limiting verbs, REST seeks to reduce the number of content types in play. You can picture REST as a triangle with its three vertices labelled "nouns", "verbs", and "content types". REST seeks to push the balance well away from both verbs and content types, as close as possible to the nouns vertex. Object orientation is balanced somewhere between verbs (functions) and content types (types, and differing parameter lists). I suspect as we understand and exercise the extremes of this triangle over time we'll learn more about where to put the balance for a particular problem space.

In OO object names are always relative to the current object or to the global namespace. We usually see all access restricted to this->something, param->something or SomeSingleton->something, where something is often a verb. It's hard to navigate more deeply than this level because the way Object-Orientation maintains its abstraction is to hide knowledge of these other objects from you. Instead, OO design would normally provide a function for you to call that may refer to the state or call functions on its own child objects.

REST says that the namespace should be king. Every object that should be contactable by another object should have a name. Not just any name, but a globally-accessible one. If you push this to the extreme, every object that should be accessible from another object should also be accessible from any place in the world by a single globally-unique identifier:

in principle, every object that someone might validly want or need to cite should have an unambiguous address.

-- Douglas Engelbart[1]

In his 1991 design document on naming, Berners-Lee wrote[1]:

This is probably the most crucial aspect of design and standardization in an open hypertext system. It concerns the syntax of a name by which a document or part of a document (an anchor) is referenced from anywhere else in the world.

REST provides each abstraction through its hierarchical namespace rather than trying to hide the namespace. Since all accessible objects participate in this single interface, the line between those objects blurs. Object-Orientation is fixed to the concept of one object behind one abstraction, but REST allows us to decouple knowledge even about which object is providing the services we request. You can see the desire to achieve something of this kind via the facade design pattern. REST is focused around achieving a facade pattern; a kind of mega-object.

History

The history I am about to describe is hearsay, and probably reflects more closely how I came to certain concepts rather than how they emerged chronologically. You can track object orientation's history back to best practice for structured programming. In structured programming you think a lot about while loops and for loops. You break a problem down by thinking about the steps involved in executing a solution. In those days it was hard to manage data structures, because it often meant keeping many different parts of the code-base that operated on your data structures in sync. A linked-list implementation often had its insert operation coded several times, so when you changed from a singly-linked list to a doubly-linked one it could be difficult to make sure all of your code still worked correctly. This need for abstraction led to the notion of Abstract Data Types (ADTs).

ADTs were a great hit. By defining the set of legal operations on a data structure and keeping all of the code in one place you could reduce your maintenance costs and manage complexity. The ADT became an abstraction that could represent different implementations equally well. The advantages were so important that the underlying implementation details such as member variables were hidden from client code. Avoiding lapses in programming discipline was a big focus.

Object-Orientation came about when we said "This works so well, why not apply it to other concepts?". Instead of applying the technique just to data structures, we found we could apply it whenever we needed an abstraction. We could apply it to algorithms. To abstract conceptual whosewhatsits. We developed Design Patterns to help explain to each other how to use objects to solve problems that still let us keep our privacy and abstractions.

And REST's history?

REST has an obvious history on the web, where abstraction is a fundamental concept. Resources across the web operate through protocols and other mechanisms that force us to hide the implementation of one object from that of another. I think the seeds of REST in "pure" software are there as well.

Let's take Java beans. The properties of a bean are as follows:

  1. Every Java bean class should implement the java.io.Serializable interface
  2. It should have a no-argument constructor
  3. Its properties should be accessed using get and set methods
  4. It should contain the required event handling methods

I see this as a significant step from an Object-Oriented model towards that of REST. I'll leave aside the implementation of serialisable that allows a representation of the object to be stored, transmitted, and unpacked at the other end. I'll also leave aside the default constructor that must be present to make this sort of thing happen as it should. The real meat in my pie is the use of properties, or as I would call them: Resources.

The use of properties in otherwise object-oriented design paradigms is flourishing. It is much simpler to deal declaratively with a set of ("real" or otherwise) properties than it is to deal with function calls. Graphical editors find it easier to deal with these objects, and I suspect that humans do as well. By increasing the namespace presence of the object and trimming down the set of operations that can be performed on each presence in that namespace we see that it is easier to deal with them overall. We don't lose any abstraction that we gained by moving to ADTs in the first place, because now these properties aren't revealing the internal implementation of our object. They're forming part of the interface to it. When we set these properties or get them, code is still churning away behind the scenes to do whatever the object chooses. The set of properties can represent different types of object equally well.
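
Python's property support, which comes up again below, shows the idea in miniature; the class here is invented for illustration:

  class Temperature:
      """Getting and setting look like plain attribute access, but code
      churns away behind the scenes, as with any other resource."""

      def __init__(self):
          self._celsius = 0.0

      @property
      def fahrenheit(self):            # the "get"; a GET in REST terms
          return self._celsius * 9 / 5 + 32

      @fahrenheit.setter
      def fahrenheit(self, value):     # the "set"; a PUT in REST terms
          self._celsius = (value - 32) * 5 / 9


  t = Temperature()
  t.fahrenheit = 212       # runs the conversion code behind the property
  assert t._celsius == 100.0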

This is a tipping of the triangle away from verbs and content types towards nouns. I'm sure you can think of other examples. Tim Bray writes, referring to a battle between ASN.1 and XML:

it seems to be more important to know what something is called than what data type it is. This result is not obvious from first principles, and has to count as something of a surprise in the big picture.

Integrated Development Environments

IDEs have recently started to become good enough to use. That's strong praise coming from me, a VI man. Havoc Pennington recently wrote:

The most important language features are the ones that enable a great IDE.

So-called "dynamic" languages such as Python fall short on this mark, because it just ain't possible to for an IDE to examine your program an infer any but the most basic information from it. Python tries to come up with a less formal way of handling type, by essentially saying "if you don't use it, I won't check it". It's still strongly-typed under the covers, though. There are still basic expectations applied to objects you pass in to a function. A string, XML attribute node, and integer aren't all interchangable even if they are parsable to the same number. You just can't know what is expected programatically. Python tips the triangle slightly towards nouns with its intrinsic support for properties, but does not attempt to reduce the number of content types in play.

I think that a more RESTful approach could make things better. By explicitly restricting yourself to dealing with properties rather than function calls, you could create a namespace that represents the entire accessible functionality of your program. As I hinted at earlier, GUI toolkits are already heading the properties way. Once you have as much as possible of your functionality exposed through properties instead of regular function calls, it becomes possible to expose a single namespace that encapsulates this functionality. An IDE could easily work with such a namespace to allow automatic completion, and the thing we all really miss in Python: earlier checking. If you use hyperlinking (complete names rather than constructed names) as much as possible, you can get these hyperlinks checked at construction time if you like. You don't have to wait until you dereference them. To my mind, the simple string should be the main common currency in this world. With garbage collection in play, immutable strings are cheap to pass around and handle. They only need to be processed when they are actually used for something.

How do we make our software more RESTful?

Try exposing your functions as properties or sub-objects instead of functions as you think of them now. Expose a simple set of functions on these properties. Good REST practice currently says "GET", "PUT", and "DELETE" are most of what you need. You should use "POST" as a "create and PUT" operation. Try giving all such resources globally-accessible names rather than boxing them up. The theory is that other objects will only access them if you hand them a hyperlink (the whole name), so privacy isn't a problem. Use types that are as simple and uniform as possible. I've been trying to get away with just a combination of strings and DOM nodes, although I'm not convinced the latter is a perfect fit.

Any code that accesses this name, verb, and content-type space operates simply using hyperlinks. You may choose to open up access to other URI spaces, such as the http and file schemes. In this way you can hyperlink to and from these spaces without altering the meaning or means of constructing your client code.

To be honest I don't really know whether this will work in the large, yet. I'm trying it out on a program that tries to work as a souped-up Glade at present, but I have a lot of thinking still to expend on it. I haven't covered the REST approach to event handling in this article, which I think is probably about as hard as it is to describe event handling in any design paradigm. Perhaps another time.

Benjamin

  1. Dan Connolly, "Untangle URIs, URLs, and URNs"