Sound advice - blog

Tales from the homeworld


Wed, 2009-May-27

Specification for a REST Asynchronous Request class

What should a client implementation of a REST interface look like in order to take full advantage of the evolutionary mechanisms built into REST directly and more broadly into the Web? More importantly, how should clients be written to best benefit the architecture as a whole and permit gradual evolution of both media types and resource identifiers? I thought I would put a few words together to specify what this should all look like.

A Single Asynchronous Request

I'm going to position this specification in terms of an asynchronous request that a client wishes to issue to a server. If you prefer, you can think of this as a single asynchronous request that a service consumer issues to a service. From a SOA perspective some base part of the URL (often the authority of the URL) will identify the service. Anything past that point is a fine-grained identifier for a resource the service owns, and each resource presents a uniform interface consisting of media types only from the centralised schema inventory and methods only from the centralised method inventory.

I am framing this in terms of a class that models a single request because I want to be clear what should generally be a responsibility of calling code versus what should be handled automatically and efficiently by the REST client framework. This single request should be essentially as easy to invoke as any capability of a SOA service, and just as easy to understand despite the magic.

Objectives

The main objectives of this class are to correctly handle retries, redirection, and content negotiation. Data folding is also a potential bonus. Retries allow for reliably issuing requests, but are only appropriate for idempotent and safe requests. Redirection allows the set of resource identifiers in the architecture to evolve over time as services are upgraded without requiring client upgrade. Content negotiation allows services and their consumers to be upgraded independently without breaking compatibility with each other as particular media types are deprecated and eventually phased out over the lifetime of the architecture.

These are all features designed to support run-time discovery of resources (known as hyperlinking) and therefore run-time discovery of services by clients. This discoverability is designed to continue working over the lifetime of the architecture without requiring mass-redeployment or mass-reconfiguration of components. Each component continues to do its job, following links and directing clients to follow links as required, and stating its own capabilities to the extent necessary for the components around it, both old and new, to continue interoperating with it.

A Simple Class

Let's assume that we have an HTTP implementation that is able to make individual requests on our behalf. I say HTTP not because it is the only RESTful protocol, but because it is a common and reasonably exemplary one. It incorporates many features from Roy Fielding's specification of REST alongside a few Web-specific features.

The class we will look at initially covers most of our objectives. It will deal with retry, redirection and data folding. I will leave content negotiation for an advanced version of the class.

[Figure: Simple AsyncRequest class]

This at first looks simple: As a client object, you construct an AsyncRequest with the URL of your resource. Once constructed you invoke a request with the required method, plus an optional media type and content to send with the message. Media type and content can both be treated as strings (or byte arrays) and would become the Content-Type and body of an HTTP request. Method is probably a simple string, but more complex requests might require the inclusion of additional header information.

When the response is received, AsyncRequest will invoke a callback on the client. Code is probably a numeric HTTP status code, but it should at least be easily convertible into a simple success/fail status and, as with method, may have associated headers. You might use a code class that can return this status from a function for easy handling while retaining the ability to log full response details for analysis in the case of failure. Media type and content are again strings or byte arrays, and have the same semantics as they do in requests.
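As a rough illustration of the interface described above, here is a minimal Python sketch. The names Response, ok and fake_transport are my own assumptions for the example rather than part of any specification, and the transport is called synchronously purely to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Response:
    """Hypothetical code class: numeric code plus details kept for logging."""
    code: int
    media_type: Optional[str] = None
    content: bytes = b""
    headers: Dict[str, str] = field(default_factory=dict)

    @property
    def ok(self) -> bool:
        # Simplification: only 2xx counts as success in this sketch.
        return 200 <= self.code < 300


class AsyncRequest:
    def __init__(self, url: str, transport: Callable[..., Response]):
        self.url = url
        self._transport = transport  # stand-in for the HTTP implementation

    def request(self, method, callback, media_type=None, content=b""):
        # A real implementation would return immediately and invoke the
        # callback later; this sketch calls the transport synchronously.
        callback(self._transport(method, self.url, media_type, content))


def fake_transport(method, url, media_type, content):
    # Assumption for the example: echo the request body back with 200 OK.
    return Response(200, media_type, content)


received = []
AsyncRequest("http://example.com/lights/1", fake_transport).request(
    "PUT", received.append, "text/plain", b"on")
```

The client code only supplies a URL, a method, optional content and a callback; everything else is left to the request class.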

Behaviour

The AsyncRequest class will attempt to make its request to the service over a TCP transport. Failures in the transport can be modelled as HTTP response codes. Failure to connect can be modelled as 503 Service Unavailable, as the request is known not to have been processed. Failure after connection can be modelled as 504 Gateway Timeout, where it cannot be discerned whether the request was processed or not.
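The mapping from transport failures to response codes might be sketched as follows. The connect and exchange callables are hypothetical stand-ins for the real TCP and HTTP layers, and I am assuming that transport errors surface as OSError, as they do for Python sockets.

```python
def send_with_failure_codes(connect, exchange):
    """connect() opens the transport; exchange(conn) returns a response code."""
    try:
        conn = connect()
    except OSError:
        # Never connected: the request is known not to have been processed.
        return 503
    try:
        return exchange(conn)
    except OSError:
        # Failed after connecting: we cannot tell whether it was processed.
        return 504


def refuse():
    raise ConnectionRefusedError("no listener")


def connect_ok():
    return object()  # placeholder connection


def drop(conn):
    raise ConnectionResetError("peer went away")


codes = [
    send_with_failure_codes(refuse, drop),
    send_with_failure_codes(connect_ok, drop),
    send_with_failure_codes(connect_ok, lambda conn: 200),
]
```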

On initial construction, myURL and myConfiguredURL are both set to the specified URL. myProxy is cleared unless an explicit proxy is configured. The myURL value normally determines which DNS name or IP address to connect to at the TCP level. However, this is overridden by myProxy if it is present. The myURL value is sent as part of each request, including retries. myURL and myProxy may be modified by temporary redirection codes, while myConfiguredURL is only modified in the case of permanent redirection. At the end of any failed or successful request myURL is set back to myConfiguredURL and myProxy is either cleared or returned to a configured value.
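The bookkeeping for these three values can be sketched like this. The class and method names are assumptions made for the example; only myURL, myConfiguredURL and myProxy come from the description above.

```python
class RequestTarget:
    """Tracks myURL, myConfiguredURL and myProxy for one AsyncRequest."""

    def __init__(self, url, configured_proxy=None):
        self.my_url = url
        self.my_configured_url = url
        self._configured_proxy = configured_proxy
        self.my_proxy = configured_proxy

    def redirect_temporarily(self, location):
        self.my_url = location  # the configured URL is left alone

    def redirect_permanently(self, location):
        self.my_url = location
        self.my_configured_url = location

    def use_proxy(self, location):
        self.my_proxy = location

    def connect_to(self):
        # The proxy, when present, overrides the URL at the TCP level.
        return self.my_proxy or self.my_url

    def finish(self):
        # Called at the end of any failed or successful request.
        self.my_url = self.my_configured_url
        self.my_proxy = self._configured_proxy


target = RequestTarget("http://a.example/x")
target.redirect_temporarily("http://b.example/x")
during_temporary = target.connect_to()
target.finish()
after_temporary = target.my_url
target.redirect_permanently("http://c.example/x")
target.use_proxy("http://proxy.example:8080")
during_proxy = target.connect_to()
target.finish()
```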

Different HTTP response codes demand different actions. These are: return (success or failure), retry, modify and retry, and sleep and retry.

Code | Action
--- | ---
100 Continue | Continue request - only returned if Expect: 100-continue was in request
101 Switching Protocols | Continue request - only returned if an Upgrade header was in request
1xx Other Informational | Return, failure
200 OK | Return, success
201 Created | Return, success (include Location header)
202 Accepted | Return, success (so far)
2xx Other Successful | Return, success
300 Multiple Choices | Return, failure (no mechanism exists to support this code)
301 Moved Permanently | Modify myURL and myConfiguredURL to match Location header and retry
302 Found | Modify myURL only to match Location header and retry
303 See Other | Modify myURL only to match Location header, set myMethod to GET, and retry
304 Not Modified | Return, success
305 Use Proxy | Modify myProxy only to match Location header and retry
307 Temporary Redirect | Modify myURL only to match Location header and retry
3xx Other Redirection | Return, failure
400 Bad Request | Return, failure
401 Unauthorised | Return, failure (you should almost certainly be using SSL if authentication is important, although use of a special class to handle challenges could be implemented)
404 Not Found | Return, success if request was DELETE otherwise failure
408 Request Timeout | Retry
410 Gone | Return, success if request was DELETE otherwise failure
416 Requested Range Not Satisfiable | Modify Range header according to Content-Range header and retry
417 Expectation Failed | Drop expectation if possible and retry, otherwise return failure
4xx Other Client Error | Return, failure
500 Internal Server Error | Return, failure
503 Service Unavailable | Sleep for indicated time and retry
504 Gateway Timeout | Retry if request is safe (GET) or idempotent (PUT or DELETE). Otherwise, return failure.
5xx Other Server Error | Return, failure

The handling of these codes requires some rewriting and retrying of requests, and also potential sleeps. However, this can be handled transparently by the AsyncRequest class for the most part and the client does not have to be concerned. This can all happen behind the scenes, and therefore consistently across different clients and services to benefit the overall flexibility and evolution of the architecture.
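To make the dispatch concrete, the code-to-action list above can be condensed into a single function. This is only a sketch: the action names are invented for the example, and the header-dependent cases (100, 101, 416 and 417) are left out and fall through to failure here.

```python
RETRYABLE_METHODS = {"GET", "PUT", "DELETE"}  # safe or idempotent, as listed above


def action_for(code, method):
    """Map a response code to a (hypothetical) action name."""
    if 200 <= code < 300 or code == 304:
        return "success"
    if code == 301:
        return "redirect-permanent"   # update myURL and myConfiguredURL
    if code in (302, 307):
        return "redirect-temporary"   # update myURL only
    if code == 303:
        return "redirect-as-get"      # update myURL, switch method to GET
    if code == 305:
        return "use-proxy"            # update myProxy only
    if code in (404, 410):
        return "success" if method == "DELETE" else "failure"
    if code == 408:
        return "retry"
    if code == 503:
        return "sleep-and-retry"
    if code == 504:
        return "retry" if method in RETRYABLE_METHODS else "failure"
    return "failure"
```

A caller would loop on this function, applying each redirect, retry or sleep, until it yields "success" or "failure", and only then invoke the client callback.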

Data folding is the idea that it is beneficial to skip intermediate states in favour of a correct current state. Data folding for GET requests still relies somewhat on the class using this request. That class would generally start the first request and then, if it decides it needs another while waiting for the response, note this fact for itself. If yet another need to send a request arises while the first is still outstanding, nothing more needs to be noted. The client sends its queued GET request as soon as the current one returns. We generally don't want to cancel a GET request in this context, in case we get ourselves into an infernal loop of continuously cancelling requests while they are still outstanding.
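GET folding as described here needs only two flags on the client side. The following sketch assumes a hypothetical send callable that starts one GET, and an on_response hook that the request class calls when that GET completes.

```python
class FoldingGet:
    def __init__(self, send):
        self._send = send              # hypothetical: starts a single GET
        self._outstanding = False
        self._another_needed = False

    def refresh(self):
        if self._outstanding:
            # Fold: however many times we are asked, remember only that
            # one more GET is needed after the current one returns.
            self._another_needed = True
        else:
            self._outstanding = True
            self._send()

    def on_response(self):
        self._outstanding = False
        if self._another_needed:
            self._another_needed = False
            self.refresh()


sent = []
getter = FoldingGet(lambda: sent.append("GET"))
getter.refresh()
getter.refresh()
getter.refresh()
count_while_outstanding = len(sent)
getter.on_response()
count_after_first = len(sent)
getter.on_response()
count_after_second = len(sent)
```

Three calls to refresh result in two GETs on the wire, and the outstanding request is never cancelled.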

Data folding also applies to PUT and DELETE requests, each of which is designed to completely replace the effect of the previous request on the same resource. As such, if we have a PUT or DELETE request in progress and another comes in we can simply queue the latest of these requests for a given URL. This allows us to convey our latest intent for the state of a resource without unnecessary delay attributable to intermediate states.
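Folding for PUT and DELETE can be sketched as a one-slot queue per URL, where a newer request simply overwrites an older queued one. The class name and the send callable are assumptions made for this example.

```python
class FoldingWriter:
    def __init__(self, send):
        self._send = send          # hypothetical: starts one PUT or DELETE
        self._in_progress = set()  # URLs with a request on the wire
        self._pending = {}         # URL -> latest queued (method, content)

    def submit(self, url, method, content=None):
        if url in self._in_progress:
            # Replace any older queued intent for this URL.
            self._pending[url] = (method, content)
        else:
            self._in_progress.add(url)
            self._send(url, method, content)

    def on_response(self, url):
        self._in_progress.discard(url)
        if url in self._pending:
            method, content = self._pending.pop(url)
            self.submit(url, method, content)


sent = []
writer = FoldingWriter(lambda url, method, content: sent.append((method, content)))
writer.submit("http://example.com/r", "PUT", b"1")
writer.submit("http://example.com/r", "PUT", b"2")   # folded away
writer.submit("http://example.com/r", "DELETE")      # latest intent wins
writer.on_response("http://example.com/r")
writer.on_response("http://example.com/r")
```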

Through both of these data folding techniques we are removing unnecessary queuing within the architecture that can lead to increased system latency and eventual system meltdown. In fact, simply through the use of an object to model the request state we can easily keep track of how many of these objects and their corresponding requests we currently have outstanding and keep this queue under control as well. Data folding support can be wrapped up in its own class for the convenience of clients, or possibly even into the main request class.

Adding content negotiation

[Figure: Advanced AsyncRequest class, supporting content negotiation]

In order to correctly support content negotiation we must hand our content to the AsyncRequest class in an encoding-neutral form. I have shown this as an abstract class in the above diagram called "Encoder". What this does is allow request content such as that from a PUT to be encoded as several possible media types. A default is typically selected and the content encoded in that form for transmission. However, if this type is not acceptable to the server we will hopefully get back enough information to retry the encoding with an acceptable type. An example where this kind of thing would be useful would be for an architecture transitioning from RSS to Atom news feeds. An upgraded client may try to PUT an Atom article to its server, but the server only accepts RSS. The server rejects the initial PUT request (perhaps with some expect-continue going on behind the scenes) and informs us (hopefully through an Accept header in the response or similar) that it does support RSS. We ask our encoder to format the document in the legacy RSS form, and can continue our operation to a success state.
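The Atom-to-RSS fallback might look something like the sketch below. Several things here are assumptions of mine rather than anything fixed above: that the rejection arrives as 415 Unsupported Media Type, that the acceptable types have already been extracted from the response into a list, and the names ArticleEncoder, put_with_negotiation and rss_only_server. The encoded bodies are fragments, not complete Atom or RSS documents.

```python
from xml.sax.saxutils import escape


class ArticleEncoder:
    """Hypothetical Encoder: holds an article in encoding-neutral form."""

    def __init__(self, title):
        self.title = title

    def media_types(self):
        # Preferred type first; the legacy type is kept as a fallback.
        return ["application/atom+xml", "application/rss+xml"]

    def encode(self, media_type):
        tag = "entry" if media_type == "application/atom+xml" else "item"
        return f"<{tag}><title>{escape(self.title)}</title></{tag}>".encode("utf-8")


def put_with_negotiation(encoder, send):
    """send(media_type, body) -> (code, acceptable_types); hypothetical."""
    offered = encoder.media_types()
    code, acceptable = send(offered[0], encoder.encode(offered[0]))
    if code != 415:  # assumption: 415 signals an unacceptable media type
        return code
    for media_type in offered[1:]:
        if media_type in acceptable:
            code, _ = send(media_type, encoder.encode(media_type))
            break
    return code


attempts = []


def rss_only_server(media_type, body):
    # Stand-in for a legacy server that only understands RSS.
    attempts.append(media_type)
    if media_type == "application/rss+xml":
        return 200, []
    return 415, ["application/rss+xml"]


result = put_with_negotiation(ArticleEncoder("Hello & welcome"), rss_only_server)
```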

On the return side we have a Parser class to interpret responses. Its set of acceptable types is first interrogated as we send a request, and this information included for transmission to the server. A correctly-implemented server will return us a document in one of the acceptable types, and that document will be passed through the Parser on its way back to the advanced client. The client cares only about the information contained in the document, not in the encoding format. Therefore, the content is parsed into a common data structure appropriate for Client processing.
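A Parser along these lines could be sketched as follows. The temperature example, the JSON layout with a "celsius" member, and the method names are all invented for illustration; the point is only that the set of acceptable types feeds the Accept header, and that each type parses into the same client-side data structure (here, a float).

```python
import json


class TemperatureParser:
    """Hypothetical Parser producing a float from several media types."""

    def __init__(self):
        # Insertion order doubles as preference order in this sketch.
        self._parsers = {
            "text/plain": lambda body: float(body.decode("utf-8")),
            "application/json": lambda body: float(json.loads(body)["celsius"]),
        }

    def accept_header(self):
        # Interrogated when the request is sent.
        return ", ".join(self._parsers)

    def parse(self, media_type, content):
        # Ignore parameters such as charset when selecting the parser.
        base_type = media_type.split(";")[0].strip().lower()
        return self._parsers[base_type](content)


parser = TemperatureParser()
accept = parser.accept_header()
from_text = parser.parse("text/plain; charset=utf-8", b"21.5")
from_json = parser.parse("application/json", b'{"celsius": 21.5}')
```

Because the client sees only the float, the server is free to change which of the acceptable types it returns without the client noticing.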

This picture is a little more complicated than the simple picture described previously, and could be simplified slightly. For example, the Client could incorporate Parser and pass its list of acceptable types directly into the AsyncRequest request method. However even in this structure it is likely that you would want to separate out the parser code so that it could be used by multiple clients that share the same required data structure. This is particularly the case for very simple types such as numbers and strings that may be able to be easily extracted from a number of different media types.

Conclusion

The important features of HTTP in support of REST architectural constraints and objectives should be implemented consistently across all HTTP clients. The architecture as a whole suffers if they are missing or difficult to use as evolution requires simultaneous upgrade of multiple components. Supporting redirection and content negotiation at the very least, plus sleeping when a service is under load and observing other responses means that we are more flexible in how we can modify and operate our systems both large and small.

A key to success in this area is to make the client implementation as simple as possible, so that supporting these features takes essentially zero additional effort. A good interface design in this area can lead to better architectural outcomes.

Benjamin

Tue, 2009-May-05

The text/plain Semantic Web

Perhaps the most important media type in an enterprise-scale or world-scale architecture is text/plain. The text/plain type is essentially schema free, and allows a representation to be retrieved or PUT with little to no jargon or domain-specific knowledge required by server or client. It is applicable to a wide range of problems and contexts, and is easily consumed by tools and humans alike.

Uses of text/plain

In essence, this type conveys a string. However, we can also think about embedding numbers or other simple data types. The modern dynamic language approach to looking at strings is to allow implicit conversion between the information inserted by the sender and the type expected by the consumer. These values can easily be incorporated into programming language data types, inserted into databases, spreadsheets, reports, or other structures.

To outline a few potential uses of text/plain, consider the following interactions

Standards and compatibility

While formatting of numbers and other types may seem natural enough, it is important that this be done consistently if the information is to remain legible when it is processed. To my mind the best resource on formatting and processing of simple text-compatible data types can be found in the specification for XML Schema (XSD). Part 2 contains a section on built-in datatypes that covers a range of string, numeric, URI, date and time, and other simple types. Any data that can be formatted according to the rules in this section absolutely should be.

However, this leads to a dilemma. What do we do with types that are not found in this set? Should a geo-location become a structured XML document, or should it too be coded as text/plain? RFC 2426 defines a semicolon-separated standard format for geo-location, which could certainly be coded as text/plain. However, it is not clear at this stage that this is or will be the canonical way of encoding this information as a text/plain document. Without reference to applicable and universal standards we bear a significant risk that the partially-formatted content we transfer will in fact not be understood.
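For illustration, the RFC 2426 geo format is a latitude and a longitude in decimal degrees, in that order, separated by a semicolon. Reading such a value from a text/plain body could be sketched as below; the function name and the range check are my own additions.

```python
def parse_vcard_geo(text):
    """Parse 'latitude;longitude' in decimal degrees into a pair of floats."""
    latitude_text, longitude_text = text.strip().split(";")
    latitude, longitude = float(latitude_text), float(longitude_text)
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        raise ValueError(f"coordinates out of range: {text!r}")
    return latitude, longitude


position = parse_vcard_geo("37.386013;-122.082932\n")
```

Nothing in the text/plain label tells the receiver that this is the format in use, which is exactly the risk described above.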

Applicability of text/plain MIME type

Part of the problem that emerges is that text/plain is not specific enough. It doesn't have sub-types that are clearly tied to a specification document or standards body. This makes interoperability a potential nightmare of heuristic detection.

Unfortunately, while XSD provides an excellent catalogue of basic types it is neither comprehensive nor sufficiently connected to MIME usage. Another problem with using text/plain in its bare form is its default assumption of a US-ASCII character type. This can lead to obvious problems in a modern internationalised world.

Without being backed by some kind of standards body, the advice I give in this regard is merely that. Standards may emerge later that contradict what I have to say here. That said, my advice is this:

  1. Treat text/plain content as being formatted according to XSD conventions when you receive it. Take care to process character encoding directives correctly and support at least a UTF-8 encoding.
  2. Consider using a text/xsd+plain document type when transmitting XSD-formatted simple content. This will hopefully indicate that the document can be understood as text/plain, but provide additional context if more complex processing is applied to the document.
  3. Make use of other specialised types that indicate the standard being applied when types outside of the XSD set are employed. For example, the geo coordinates above might be described as text/vcard+plain.
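The first piece of advice might be sketched as follows for a receiver. The function names are assumptions of mine; the charset handling uses Python's standard email.message module, and Decimal is only an approximation of xsd:decimal because it also accepts forms such as exponents that XSD does not allow.

```python
from decimal import Decimal
from email.message import Message


def decode_text_plain(content_type, body):
    """Decode a text/plain body, honouring the charset parameter."""
    header = Message()
    header["Content-Type"] = content_type
    # text/plain defaults to US-ASCII when no charset is given.
    charset = header.get_content_charset() or "us-ascii"
    return body.decode(charset)


def parse_xsd_boolean(text):
    """xsd:boolean allows exactly true, false, 1 and 0."""
    value = text.strip()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError(f"not an xsd:boolean: {text!r}")


name = decode_text_plain("text/plain; charset=utf-8", "caf\u00e9".encode("utf-8"))
reading = Decimal(decode_text_plain("text/plain", b"21.50"))
flag = parse_xsd_boolean(decode_text_plain("text/plain", b"1"))
```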

Again, ideally we would be making use of a well-defined standards body to own and maintain the media types used to communicate very basic information. Making up your own can only take the state of the art so far. However, standards sometimes emerge out of common best practice... so it is not a complete waste of time to be heading down this particular path.

When not to use text/plain

It should be clear that text/plain is not a tool for every occasion. It is often important to sample or send an atomic set of data that would require additional schema. Plain text when overused can lead to performance problems as individual values are sampled one by one instead of as a consistent and coherent document.

Perhaps the clearest indication that you are overusing text/plain is that you are experiencing an explosion in hyperlinks. When you start to need a document to provide links for consumers to find these text/plain-centric resources, you should probably consider incorporating the information directly into these documents themselves.

Used appropriately to transfer information to and from well-known and stable resources, text/plain or its variants can be an efficient way to communicate simple data without introducing unnecessary jargon. The URI of the resource and the implementation of client and server will provide sufficient context to format and process these simple data types.

The low barrier to entry to these types makes them universally applicable and easy to work with. However, the lack of standardisation around matching encodings to media types is an inhibitor to their potential uptake. Used well, especially in combination with link headers and/or text/uri-list, these types can provide an effective way to make your protocols get out of the way of communication and let clients and servers interoperate with minimal complexity for simple use cases.

Benjamin