Sound advice - blog

Tales from the homeworld


Mon, 2009-Aug-31

WADL for REST-style SOA

In the words of Mark Nottingham:

CORBA has IDL; SOAP has WSDL. Developers often ask for similar capabilities for HTTP-based, "RESTful" services. While WSDL does claim support for HTTP, it isn't well-positioned to take advantage of HTTP's features, nor to encourage good practice.

WADL is designed to fill this gap. It is HTTP-centric and attempts to provide a straightforward syntax for describing the semantics of particular methods on resources.

Resources and SOA

Resources are a key concept in WADL, and also in REST-style SOA. A service expresses its interface as a set of resources. All resources share the same Uniform Contract. However, different resources have different associated semantics.

Resources effectively replace the traditional service-specific contract in a SOA. In doing so they introduce a meta-data gap. Where the contract previously described the interface to the service in a single coherent place, the Uniform Contract of resources does little to describe the interface of a given service.

Filling this gap is an important part of applying the REST style to SOA. This occurs in two parts: One part is the application of additional hypermedia so that clients can locate the correct resource to invoke requests upon based on the information that they are likely to have at hand. This hypermedia is incorporated into the Uniform Contract in terms of defined link relationship types and dedicated hypermedia-intensive media types. The second part is the context where WADL can be applied, and is incorporated into the service description.

Objectives of WADL in SOA

The kind of machine-readable description WADL could offer is required to fulfil a number of specific needs:

The information is not fundamentally for service consumer consumption beyond a basic level of discovery. Importantly, knowledge of relationships between resources should not normally be known by service consumers ahead of time. Consumers should make use of links between resources to discover relationships. Failure to adhere to this principle undoes a number of REST features such as the ability to link freely from one service to another with confidence that service consumers will successfully discover and exploit the specified resource at runtime.

Key Resource Metadata

I tend to look on the kind of interface description needed at this level as more of a table than a tree. I think that it is generally advisable to include most of the path when describing the semantics of methods on a given resource. Nevertheless, XML is adequate to describe this structure. The features I would tend to include appear in the example below.

An example description in table format might be:

Base URL: http://invoice.example.com

Resource | Method (Uniform) | Media Types (Uniform) | Cache | Documentation
/{invoice id} | GET | application/vnd.com.example..invoice+xml, application/vnd.com.visa..invoice+xml | Must revalidate | Retrieve invoice for invoice id
/{invoice id} | PUT | application/vnd.com.example..invoice+xml | No cache | Update Invoice for invoice id to match specified state
/{invoice id}/payment/ | POST | application/ebxml.transaction+xml | No cache | Add a payment relating to this invoice
/{invoice id}/payment/{trans id} | GET | application/ebxml.transaction+xml | 5 minutes | Fetch a specific transaction for this invoice
/?[date=]&[customer=]&[paid=] | GET | application/vnd.com.example.invoice+xml | Must revalidate | Query for invoices with specified properties

Each line describes a set of resources corresponding to the URL template. The template is filled out with parameters that the server will interpret when the request is processed. A URL can be seen as a message from the service to itself. It should not generally be parsed outside of the service, nor constructed outside of the service. I tend to use the query convention in the last URL template to indicate that this rule is being broken and explicit service/consumer coupling is being introduced.
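To make that concrete, here is a minimal sketch of a service routing its own URL templates. The paths follow the table above, but the function and template names are my own hypothetical inventions, not part of any real framework:

```python
import re

# Hypothetical routing table: regex form of the URL templates above.
# Only the service itself ever runs this; consumers treat URLs as opaque.
TEMPLATES = {
    r"^/(?P<invoice_id>[^/?]+)$": "invoice",
    r"^/(?P<invoice_id>[^/?]+)/payment/$": "payments",
    r"^/(?P<invoice_id>[^/?]+)/payment/(?P<trans_id>[^/?]+)$": "payment",
}

def route(path):
    """Interpret a URL path as a message from the service to itself."""
    for pattern, name in TEMPLATES.items():
        match = re.match(pattern, path)
        if match:
            return name, match.groupdict()
    return None, {}
```

The point of keeping this logic private is exactly the rule stated above: the moment a consumer starts constructing these paths itself, the template becomes part of the coupling between the two parties.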

Methods and Media Types are referred to by their identifier only. There is no need to include them in a service description, because they should already be adequately described in uniform contract specification documents. Supporting multiple media types for a given method is important in an architecture with an evolving set of media types. It allows old services and consumers to continue to interact with new ones over the transition period without having to perform simultaneous upgrade.

Cache is also important, as this is a key REST constraint that needs to be described to support governance activities in compliance with the constraint.

While nesting exists to a point, there is no strongly-implied relationship between different nested or non-nested URLs. Each resource has its own distinct semantics. Hypermedia is incorporated into multiple resources by way of links embedded in invoice representations. For example, invoices are likely to include a link to the customer entity that the invoice was made out to. The set above also includes hypermedia in the form of query URLs that a service consumer who has a number of starting parameters can use to find the invoices they want.

Applying WADL to the problem

The WADL equivalent for the above service metadata is as follows:

<application
	xmlns="http://research.sun.com/wadl/2006/10"
	xmlns:html="http://www.w3.org/1999/xhtml"
	xmlns:xsd="http://www.w3.org/2001/XMLSchema"
	>
	<doc>
	<html:p>Meta-data for the Invoice service, corresponding to the
	<html:a href="http://example.com/interface-control/resource">
		example.com uniform interface</html:a>
	</html:p>
	</doc>
	<resources base="http://invoice.example.com">
		<resource path="/{invoice-id}">
			<param name="invoice-id" style="template" type="xsd:NMTOKEN"/>
			<method name="GET">
				<doc><html:p>Retrieve invoice for invoice-id</html:p></doc>
				<response>
					<representation mediaType="application/vnd.com.example..invoice+xml"/>
					<representation mediaType="application/vnd.com.visa..invoice+xml"/>
				</response>
			</method>
			<method name="PUT">
				<doc><html:p>Update Invoice for invoice id to match specified state</html:p></doc>
				<request>
					<representation mediaType="application/vnd.com.example..invoice+xml"/>
				</request>
			</method>
		</resource>
		<resource path="/{invoice-id}/payment/">
			<param name="invoice-id" style="template" type="xsd:NMTOKEN"/>
			<method name="POST">
				<doc><html:p>Add a payment relating to this invoice</html:p></doc>
				<request>
					<representation mediaType="application/ebxml.transaction+xml"/>
				</request>
			</method>
		</resource>
		<resource path="/{invoice-id}/payment/{trans-id}">
			<param name="invoice-id" style="template" type="xsd:NMTOKEN"/>
			<param name="trans-id" style="template" type="xsd:NMTOKEN"/>
			<method name="GET">
				<doc><html:p>Fetch a specific transaction for this invoice</html:p></doc>
				<response>
					<representation mediaType="application/ebxml.transaction+xml"/>
				</response>
			</method>
		</resource>
		<resource path="/">
			<param name="date"
				style="query" required="false" type="xsd:dateTime"
				/>
			<param name="customer"
				style="query" required="false" type="xsd:anyURI"
				/>
			<param name="paid"
				style="query" required="false" type="xsd:boolean"
				/>
			<method name="GET">
				<doc><html:p>Query for invoices with specified properties</html:p></doc>
				<response>
					<representation mediaType="application/vnd.com.example.invoice+xml"/>
				</response>
			</method>
		</resource>
	</resources>
</application>

That actually wasn't too painful. It was easy enough to mimic the table structure. I think this makes the description of a resource more readable. It handled the various parameters to these URLs in a straightforward way. The only thing really missing from this description is the caching information from our earlier table.
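As a quick plausibility check, the table structure can be recovered from a WADL document with a few lines of ElementTree. This is only a sketch against the namespace used in the listing above; a real WADL processor would also need to handle nested resources' accumulated paths and cross-references:

```python
import xml.etree.ElementTree as ET

WADL_NS = "{http://research.sun.com/wadl/2006/10}"

def flatten(wadl):
    """Recover (resource path, method, media types) rows from a WADL doc."""
    root = ET.fromstring(wadl)
    rows = []
    for resource in root.iter(WADL_NS + "resource"):
        for method in resource.findall(WADL_NS + "method"):
            types = [rep.get("mediaType")
                     for rep in method.iter(WADL_NS + "representation")]
            rows.append((resource.get("path"), method.get("name"), types))
    return rows
```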

Conclusion

I think WADL is more or less suitable as a machine-readable media type for describing the set of resources exposed by a service. It could perhaps do with some extensions (and better extensibility), but it seems like a good starting point to me.

I have written about WADL from a slightly different perspective previously.

Benjamin

Sun, 2009-Aug-16

MIME types holding REST back

With the increasing focus on REST within enterprise circles, the practice of how REST gets done is becoming more important to get right outside of the context of the Web. A big part of this is the choice of application protocol to use: the "Uniform Contract" exposed by the resources in a given architecture. Part of this problem is simple familiarisation. Existing enterprise tooling is built around conventional RPC mechanisms, or layered on top of HTTP in SOAP messages. However, another part is a basic functional problem with HTTP that has not yet been solved by the standards community.

HTTP/1.1 is a great REST protocol. GET and PUT support content negotiation and redirection. They are stateless, and easy to keep stateless. They support layering. GET fits extremely well into caching infrastructure. These methods fit into effective communication patterns that solve the majority of simple distributed computing communication problems. HTTP works well with the URI specification, which remains best practice for identifying resources. HTTP also accommodates extension methods in support of its own evolution, and in support of specialisations that may be required in specific enterprise architectures or service inventories.

A significant weakness of HTTP in my view is its dependence on the MIME standard for media type identification, and on the related IANA registry. This registry is a limited bottleneck that does not have the capacity to deal with the media type definition requirements of individual enterprises or domains. Machine-centric environments rely on a higher level of semantics than the human-centric environment of the Web. In order for machines to effectively exploit information, every unique schema of information needs to be standardised in a media type, and those media types need to be individually identified. The number of media types grows as machines become more dominant in a distributed computing environment and as the number of distinct environments increases.

Media type identification is used in messages to support content negotiation and appropriate parser or processor selection. At the scale of the Web, only a small number of very general types can be accommodated. It is difficult to obtain universal consensus around concepts unless the concepts themselves are universal and agreeable. Smaller contexts, however, are able to support a higher degree of jargon in their communication. An individual enterprise, a particular supply chain, a particular problem domain is capable of supporting an increased number of media types over and above the base set provided by the Web. The ability to experimentally define and evolve these standards outside the Web is essential to a healthy adoption of the REST style and related styles beyond the Web.

An example of the capability to perform media type negotiation with HTTP can be found in the upgrade from RSS to ATOM feeds. While human users rarely required it in practice, HTTP makes it possible for a client to state which of these types it supports. The server is then able to respond with content in a form the client understands. In a machine-centric environment this is even more important. Few content types used in the early days of most architectures will survive into maturity. Types will change and evolve, and many will be superseded. Machine-centric environments do not have the same capability to change URLs based on their upgrades, so content negotiation based on media type allows incremental upgrade of a system... one component at a time.
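A server-side sketch of that negotiation might look like the following. This is a deliberately simplified reading of the Accept header: it honours q-values but ignores wildcards and other media type parameters, which a production implementation would need to handle:

```python
def preferred(accept_header, available):
    """Pick the best available media type for a simplified Accept header."""
    prefs = []
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        mtype = fields[0].strip()
        quality = 1.0  # default per HTTP when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                quality = float(value)
        prefs.append((quality, mtype))
    # Highest quality first; ties broken arbitrarily in this sketch.
    for quality, mtype in sorted(prefs, reverse=True):
        if quality > 0 and mtype in available:
            return mtype
    return None  # a real server would respond 406 Not Acceptable
```

A client that prefers ATOM but still understands RSS can say so in one header, and the server serves whichever it has.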

A URL-based Alternative

The main alternative for media type identification would be to use URIs. These already provide a decentralised registry, and can double as the URL of the related human-readable specification document. These seem like a simple answer. Existing IANA types can be grandfathered into the scheme with a http://www.iana.org/assignments/media-types/ prefix, which would already point as a URL to an appropriate specification document for a number of types.

However, URNs suffer three problems when compared to MIME identifiers. The first is simply that HTTP does not permit their use in the appropriate headers. The second is that URNs cannot be further interpreted when they are read. The third is that URNs cannot be parameterised as MIME types are.

MIME types are capable of identifying not only their specific type, but additional types they inherit from or are interpretable as. For example, most XML media types include an extension "+xml". This allows generic processors of XML to interpret the content based on broad-brushed generic mechanisms. One could extend this concept to support specialisations of higher-level media types such as atom. Storing a specific structure of data within an atom envelope does not prevent it from being interpreted as atom. Leaving this information in place within the media type identifier gives generic processors additional visibility into the messages being exchanged.
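The "+" convention can be unpacked mechanically. Here is a small sketch that lists the progressively more generic interpretations of a suffixed type; the `vnd.example.record` type is a made-up example:

```python
def interpretable_as(media_type):
    """List a '+'-suffixed media type and its more generic fallbacks.

    e.g. a record stored in an atom envelope can also be handled by
    generic atom+xml processors, and failing that by generic XML tools.
    """
    top, _, subtype = media_type.partition("/")
    parts = subtype.split("+")
    types = [media_type]
    for i in range(1, len(parts)):
        types.append(top + "/" + "+".join(parts[i:]))
    return types
```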

The use of parameters on media types would essentially not be possible within the URL itself. These could be included in syntax around the media type if they are still required. Typically, XML and binary types should no longer require these parameters, so this may matter only for historical types and for plain text. Plain text types will often be able to use other HTTP fields to convey relevant information such as their character encoding.

The solution to these problems could be a revision of HTTP to include URI syntax support for media types, combined with a protocol whereby processors could determine whether one media type inherits from another. Whether HTTP can be revised is a difficult question to answer, but a protocol for discovering inheritance relationships is relatively easy to develop. One could either make use of HTTP headers in the GET response for the URI, or specify a new media type for media type description. The obvious approach with link headers would be to say Link: rel="inherits". However, this is a limited approach. An actual media type description media type could take the form of a short XML document or simple microformat for human-readability, and is perhaps more general and future-proof.
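The limited Link-header approach mentioned above could be consumed with something like the sketch below. This is naive parsing of a `Link` header (it would mishandle commas inside URLs, for instance); the `rel="inherits"` relation is the hypothetical one proposed here, not a registered relation:

```python
def inherits_links(link_header):
    """Extract target URIs marked rel="inherits" from a Link header.

    Naive sketch: splits on commas and semicolons without handling
    quoted commas or extended parameters.
    """
    targets = []
    for part in link_header.split(","):
        fields = part.strip().split(";")
        uri = fields[0].strip().lstrip("<").rstrip(">")
        params = [f.strip() for f in fields[1:]]
        if any(p in ('rel="inherits"', "rel=inherits") for p in params):
            targets.append(uri)
    return targets
```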

Specific changes that would have to occur in HTTP would apply to the Content-Type and Accept headers. Content-Type would be relatively easy to extend to support the syntax; however, problems may emerge in any deviation from the definition of MIME itself and the use of this header within SMTP and other contexts. Accept, as well, would be relatively easy to extend. Quote characters such as " (&quot;) would need to be included around URLs to avoid confusion when ";" (semicolon) and "," (comma) characters are reached during parsing. This may impact backwards-compatibility.

Backwards-compatibility is a prime concern for HTTP. It may be worth doing some trials with common proxies and intermediaries to see how Content-Type and Accept header changes would impact them, to see just how big this problem would be in practice.

A decentralised registry in MIME syntax

An alternative to going all the way to URNs might be to make use of the DNS system alone as part of a media type identifier. For example, a "dns" sub-tree could be registered under MIME. The sub-tree would allow an example organization to safely maintain a set of identifiers beginning with application/dns.com.example without IANA coordination. Any organization with a presence in the DNS tree could do the same.

The main upside in this approach is that consistent syntax could be maintained for new media type identifiers. HTTP could be used as-is, and individual organizations could create, develop, and market their standards as they see fit. The "+" syntax could continue to be used for inheritance, so we might still end up with hybrid types such as application/dns.com.example.record+atom+xml. If this got out of hand we could be talking about fairly long media types, but hopefully the social pressures around reasonable media type identification would work against that outcome.

Perhaps the strongest argument against this alternative is a loss of discovery of standards documentation. URLs can easily be dereferenced to locate a specification document. This hybrid of DNS and MIME would need additional help to make that so. It would be necessary to have a means of translating such a MIME identifier into a suitable URL, which quickly leads into the world of robots.txt and other URL construction. While this is not a desirable outcome, at least it doesn't leave the lookup of a URL as an integral part of the parsing process. The URL-based solution may do that.

As a strawman, one might suggest that any MIME type registered in this way would be required to have something human-readable at a /mime path under the DNS name: e.g. application/dns.com.example.record would become the http://record.example.com/mime URL. This would be quite awkward.
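The strawman mapping is easy to state in code. Everything here is hypothetical by construction: the "dns" sub-tree does not exist, and the /mime convention is the strawman itself:

```python
def spec_url(media_type):
    """Map a hypothetical 'dns' sub-tree type to its strawman /mime URL."""
    subtype = media_type.split("/", 1)[1].split("+")[0]
    labels = subtype.split(".")
    if labels[0] != "dns":
        raise ValueError("not a dns sub-tree media type")
    # Reverse the DNS labels back into a hostname.
    host = ".".join(reversed(labels[1:]))
    return "http://%s/mime" % host
```

The awkwardness shows immediately: every dotted label becomes part of the hostname, so a type with any internal structure needs a DNS name per type.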

A Hybrid Approach

A third alternative would be to define a way to encode URLs or a subset of possible URLs into media type identifiers. In this example the IANA subtree might be "uri" instead of "dns". The type name would have to be constructed so that the end of the dns part of the identifier could be found and further "." characters treated as URL path delimiters. For example, application/uri.com.example..record+atom+xml could indicate that the type can be interpreted as both xml and atom+xml. In addition, the specific specification for this variant of atom+xml can be found at <http://example.com/record>.
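Decoding the hybrid form is also mechanical. Again, the "uri" sub-tree and the ".." delimiter are assumptions of this proposal rather than any existing registry convention:

```python
def hybrid_to_url(media_type):
    """Decode a hypothetical 'uri' sub-tree identifier into a spec URL.

    The '..' separates the reversed-DNS part from the path part;
    remaining '.' characters in the path part become '/' delimiters.
    """
    subtype = media_type.split("/", 1)[1].split("+")[0]
    if not subtype.startswith("uri."):
        raise ValueError("not a uri sub-tree media type")
    dns_part, _, path_part = subtype[len("uri."):].partition("..")
    host = ".".join(reversed(dns_part.split(".")))
    path = "/" + path_part.replace(".", "/") if path_part else "/"
    return "http://%s%s" % (host, path)
```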

Specification Persistence

All decentralised options share a possible persistence problem. We can probably trust the IETF to hold onto standards documents for historical reference for a few hundred years. Can we trust a small start-up business to do the same? What happens when the domain name of an important standard is picked up by domain squatters? Most standards should not remain registered under an individual company name once they reach a certain level of importance. They should fall under a reputable standards body with a fairly certain future ahead of it.

Conclusion

I'm torn on this one. I don't want to go to the IANA every time I publish a media type within my enterprise. I like URLs, but want a straightforward way to discover important inheritance relationships. I don't want to break backwards-compatibility with HTTP, and there is no better protocol available for the business of doing REST. What's a boy to do?

My preference is to go with a hybrid approach. This would yield compatible syntax with today's HTTP, but still support a highly decentralised approach. Over time, a significant number of standards should climb the ladder from early experimental standard through to enterprise and industry standards to eventually become part of a universally-acceptable set that describe the semantic scope of the machine-readable Web. Their identifiers may change as they make the transition from rung to rung so that the most important standards are always under the namespace of the most well-respected standards bodies.

Benjamin

Wed, 2009-Jul-15

A brief introduction to Systems-style Functional Analysis

Software Engineers can easily get out of touch with the work of other engineering disciplines, and I think there are some things we tend to miss out on. Systems Engineering in particular is a discipline with significant synergies with Software Engineering on the large scale, and dovetails into a number of other more specialised disciplines.

Systems Engineering is a somewhat confused term in the trenches. To some software engineers the term is taken to mean "hardware engineer". If it isn't software, it must be hardware... so Systems Engineers are the guys who decide which servers to buy and how to wire the racks together. To others, systems engineers are simply the guys who write the requirements specification. To others again, the term refers to people who integrate generic software products into turn-key solutions, i.e. the guys who write configuration files.

When I talk about Systems Engineering I am referring to the MIL-spec concept: Someone who decomposes large complex systems into smaller systems, subsystems, and configuration items based on a functional analysis and a range of other techniques. The functional analysis approach used by Systems Engineers also has ties to Value Engineering.

Systems Basics

The first step towards understanding how to decompose systems is to understand what a system is. A system is defined in terms of its form, fit, and function:

Form
Form captures most of what we would consider non-functional requirements. For physical equipment it defines things like size, shape, and weight. More generally, it includes things like throughput, capacity, and latency. For software it includes memory consumption and demands on CPU resources.
Fit
Fit defines the interfaces by which products enter and leave the system. For physical systems this might include the form of raw and finished materials, the mechanism by which rubber bonds to the road in a tyre, the shape of the handle and the striking surface of a hammer head. For software this includes protocol specifications, service contracts, and APIs.
Function
Most important amongst a system's characteristics is its function or functions. A function defines the system's inputs in a relatively abstract way, and how those inputs are converted into outputs. A system's function defines what it is "for", why the customer wants to buy it, and what it should be scoped to "do".

Form, fit, and function are characteristics of a system that can be seen as a black box. Look inside the box, and we can see that systems can be made up of a number of things:

Other systems
A system made up of other systems is sometimes called a "System of Systems". Each system stands on its own in performing its functions more or less independently of other adjacent systems. The failure of an adjacent system may cause inputs to change their behaviour, but does not cause widespread collapse or knock-on effects. Some functions of the system of systems may fail, but the functions of individual systems continue operating.
Subsystems
A system made up of sub-systems is less robust, and typically lower-level than a system of systems. A subsystem is like a system in that it processes input into output, has a form, and has a fit. Like a system, a subsystem is also composed of lower-level components that operate to perform the subsystem's functions. However unlike a standalone system, subsystems may depend on each other in ways that can compromise each other's functions. We would ideally like to keep decomposing into systems rather than subsystems for as long as possible, but at some point our luck will run out: Software never constitutes a whole system, because it depends on the hardware it is running on.
Configuration Item
It is only necessary to decompose into subsystems if the constituent parts of a system are too numerous or complex to decompose directly into components. If the system is simple, it can be decomposed directly onto Configuration Items. A Configuration Item (CI) is the lowest level of configuration control of a system, meaning it is the lowest level component that still has its own requirements specification and design document. It is still decomposed within its own design, but onto "Components". CIs can be software or hardware, known respectively as Computer Software Configuration Items (CSCIs) and Hardware Configuration Items (HWCIs). Like subsystems, CIs are defined at a high-enough level that they still have a function relevant to the end use of the system.

Function Diagrams

When setting about a functional analysis I like to use StarUML's robustness diagrams. This is essentially an old-fashioned Structured Analysis approach. I find this approach a much more effective starting point for large systems than either use cases or object-oriented analysis techniques. I start with functions included in the context:

Context Diagram

I like to show a portion of the surrounding system of systems, and the relevant functions within adjacent systems. Stakeholders such as operators of the various systems or customers of a business are also useful to show, and create a useful bound for this diagram. If your system can impact a stakeholder through an adjacent system, even one that is a few hops away rather than directly adjacent, then that system and that stakeholder constitute relevant context.

From a requirements perspective, the analysis is essentially complete with this one diagram. You write a section in the specification for each of the identified functions. You include the list of inputs and outputs, describe the non-functional requirements, refer to documentation for system interfaces, and you're done. In practice this process will almost certainly require some rework and reconsideration of the functions as you go. It is important to characterise all flows in and out of the system and to understand what the system is doing with them and to them. Functions that do not share inputs or outputs can be safely split at this level. I generally don't have functions feeding into each other at this level, preferring each function to describe end-to-end processing.

Since we are not following a strict waterfall, the architecture specification is probably being written at the same time as the requirements specification. This too will throw up required changes to the set of functions based on its decomposition:

System Decomposition

The decomposition is onto systems, subsystems, or configuration items. Each of the constituent parts also incorporates an allocation of function. Each function from the context is either implemented directly by a standalone component as a whole, or is decomposed into several lower-level functions with interfaces between. The derived interfaces are configuration controlled along with requirements on the constituent parts.

Perhaps the key thing to understand about this decomposition is that it is not purely functional. We aren't picking bubbles willy-nilly, but allowing a functional decomposition to interplay actively with domain-specific engineering practice. That practice is Software Engineering in this case. The architect who puts the decomposition together brings all of the normal skills of their particular engineering profession to bear in defining components that are loosely coupled, appropriately abstracted, and complete. The functional model winds its way down through the components in parallel to the domain-specific engineering activity. The function allocation acts as an architectural abstraction of functional requirements that makes it easy to relate the lowest-level configuration-controlled elements of a system to their purpose in terms of meeting the needs of the end user:

Function Decomposition

This diagram follows some of the form of a value-engineering "FAST" diagram, with high-level system functions on the left and lowest-level functions on the right. Dependency arrows indicate that the high-level function is achieved by performing the low-level function. Reading in the other direction we can see that the low-level function is implemented in order to achieve the high-level function. If we were performing a value engineering workshop we would now add up all of the costs of the lower-level functions in order to come up with a total cost for the higher-level function. That cost could be compared to the worth of the function to determine if the end user need is being met most efficiently, whether gold plating is being applied to functions the end user does not value, and whether it should be applied more to functions the end user does value. Even without a formal analysis, having a diagram like this allows developers who may otherwise be a long way from the customer to understand the place of their work in the wider context and to make reasonable informed decisions about which parts have to shine.

Rules and Advice

I really try to limit the decomposition to what will fit on these three diagrams. For complex software this may not be possible. I am also prone to draw diagrams relating services to processes, processes to software packages, and software packages to configuration items as part of my broader software architectural descriptions. Nevertheless the context, configuration item, and functional decomposition diagrams should essentially be just that: Three diagrams, each fitting on a page, and each easily understood. In addition to this basic set of constraints there are a number of rules around function naming and structure that aid in ensuring the correct level of abstraction and consistency.

Once the architecture is looking neat you'll probably end up with a requirements specification per Configuration Item, plus Interface Control Documents covering all of the identified flows between components. If you are working below the level of the Configuration Item, such documentation would not be warranted.

This is an approach to decomposing relatively large systems with customers who are potentially quite distant from individual developers, so may not be applicable to everyone. It is only one small aspect of building those kinds of systems, and this has been the barest of introductions. I hope you have found it useful.

Benjamin

Sat, 2009-Jun-13

The REST Statelessness Constraint

The Statelessness constraint of REST is important, but often poorly understood. It is more restrictive than the SOA statelessness principle, and excludes a number of useful communication patterns from REST architecture.

REST-style Statelessness

Fielding's Dissertation describes the constraint thusly:

We next add a constraint to the client-server interaction: communication must be stateless in nature, as in the client-stateless-server (CSS) style of Section 3.4.3 (Figure 5-3), such that each request from client to server must contain all of the information necessary to understand the request, and cannot take advantage of any stored context on the server. Session state is therefore kept entirely on the client.

Section 5.3.1 expands on this with the following:

REST enables intermediate processing by constraining messages to be self-descriptive: interaction is stateless between requests, standard methods and media types are used to indicate semantics and exchange information, and responses explicitly indicate cacheability.

The purposes of introducing the statelessness constraint include improvements to visibility, reliability, and scalability. This means that proxies and other intermediaries are better able to participate in communication patterns that involve self-descriptive stateless messages, server death and failover does not result in session state synchronisation problems, and it is easy to add new servers to handle client load again without needing to synchronise session state.

REST achieves statelessness through a number of mechanisms:

  1. By designing methods and communication patterns so that they do not require state to be retained server-side after the request.
  2. By designing services that expose capabilities to directly sample and transition server-side state without leaving application state behind.
  3. By "deferring" or passing state back to the client as a message at the end of each request whenever session state or application state is required.

The downside of statelessness is exposed in that last point: Applications that demand that some kind of session state persist beyond the duration of a single request need to have that state sent back to the client as part of the response message. Next time the client wants to issue a request, the state is again transferred to the service and then back to the client.

As noted by Roy, this is a trade-off. More network bandwidth is used in order to achieve visibility, reliability, and scalability benefits for the server side. Other REST constraints such as caching are intended to balance this out so that an acceptable amount of bandwidth is used.

Deferred application state is often in the form of resource identifiers such as URLs. This is an ideal form, because it can be stored indefinitely and easily passed around between clients so other clients can continue the application at a later point in time.

Let's take a simple banking example:

So my client issues a GET request to <http://accounts.example.com/myaccount/transactions>. In my response I get both the current list of transactions, plus a "next" link. This hyperlink is a reference to a resource whose identity incorporates where I am currently up to, in this case it might be <http://accounts.example.com/myaccount/transactions?dtstart=2009-06-11>. This link allows me to unambiguously go back and fetch only new transactions, yet the service doesn't have to remember where I was up to in the transaction list. The dtstart identifier that marks the boundary between the transactions I have seen and those I have not has been deferred back to me as part of the returned message.

If the client is an accounting program, this will allow me to reconcile my accounts with those of my bank in a nice unambiguous way. If I have to restore my accounts from backup, the backup will have the old URL and will query from the right place. If I lose my data completely I can go back to using the original transaction URL. I can handle services being upgraded, and individual servers being added or failing over. They don't need to know where I was up to until I send my next request.
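The reconciliation pattern above can be sketched in a few lines of Python. This is a toy simulation, not a real banking API: the server is stood in for by a dict of canned responses, and the URLs and field names are illustrative assumptions. The point is that the only state the client carries between fetches is the "next" URL handed back by the service.

```python
# Sketch of the stateless transaction sync described above. The "server" is a
# dict of canned responses standing in for HTTP GET; URLs and field names are
# hypothetical, not part of any real banking API.

CANNED = {
    "http://accounts.example.com/myaccount/transactions": {
        "transactions": ["txn-1", "txn-2"],
        "next": "http://accounts.example.com/myaccount/transactions?dtstart=2009-06-11",
    },
    "http://accounts.example.com/myaccount/transactions?dtstart=2009-06-11": {
        "transactions": ["txn-3"],
        "next": "http://accounts.example.com/myaccount/transactions?dtstart=2009-06-12",
    },
}

def sync(url, ledger):
    """Fetch transactions; all session state lives in the returned URL."""
    response = CANNED[url]                  # stands in for an HTTP GET
    ledger.extend(response["transactions"])
    return response["next"]                 # deferred application state

ledger = []
bookmark = sync("http://accounts.example.com/myaccount/transactions", ledger)
bookmark = sync(bookmark, ledger)           # later: fetch only new entries
```

Because the bookmark is just a URL, it can be backed up, restored, or handed to another client, and the service never has to remember which client is up to where.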

The gold standard of stateless interaction is that application state is first reduced to its minimum possible set, then any remaining necessary application state is stored with the client and not with the server.

A SOA take on Statelessness

If you read the excellent Principles of Service Design by Thomas Erl you will find statelessness as a key SOA principle. However, the mechanisms described to support statelessness revolve around the use of state databases on the server side. This is not statelessness from a REST perspective, which would see the entire service including its state database as "the server side". Deferring state within the service does not meet the stringent REST criteria for this constraint. That is not to say these kinds of databases aren't useful from time to time, but they have no place in a strict REST architecture.

REST Statelessness in detail

Some people will tell you that the statelessness constraint can be worked around by giving the session state a resource identifier on the server side. Doing so certainly could improve the visibility of state, and its addressability through hyperlinks. However, it does not address the core reliability and scalability concerns that the REST constraint seeks to address.

There is a subtlety in the statelessness constraint that can make it difficult to grasp. That subtlety is the difference between application state (or session state) and service state. Application state is the state that is built up as part of a client and server interacting in order to fulfil application requirements. From a SOA perspective this might be seen as the state of a service composition as opposed to the state of an individual service. The difficulty is that you can't just point to state on the server side and say that it must be service state. It could be application state. Certainly taking application state and storing it on the server side does not magically transform the state into service state.

At this point it is probably worth trying to formalise the definition of these two alternative forms of state. I think the best way to do this is to say that service state is the state a service holds independently of the state of any client, while application state is the state being held in some kind of mirroring relationship with a client.

Let's take for example a straightforward PUT request. I generate some state on the client, and through the PUT request I transfer that state (ie intent) to the server side. If the server accepts my request it will update its service state to match my intent. When it returns a response back to me as the client I can forget this state. It has been transferred, and is no longer my responsibility. It is service state, and not application state.

I can do an optimistic lock completely within REST constraints. Say I get an ETag back from an initial GET request. I can now submit a conditional PUT request to update that state only if the ETag matches what I saw in my GET. This is stateless because while the server needs to remember that that particular resource had that particular ETag, and has to remember to update it when its state changes, it doesn't really care about me. How many clients hold a reference to this ETag, and therefore an optimistic lock? None? One? A thousand? The server doesn't need to keep track. It just has to honour the interface of the conditional PUT request.
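The optimistic lock can be sketched as a toy in-memory resource. This is illustrative Python, not a real HTTP library: the class and method names are my own, and the ETag is a simple counter. What matters is that the server tracks only the resource's current ETag, never the clients holding copies of it.

```python
# Minimal sketch of the optimistic lock above: a toy in-memory resource with
# an ETag, and a conditional PUT that succeeds only when If-Match matches.
# All names here are illustrative, not a real HTTP library API.

class Resource:
    def __init__(self, state):
        self.state = state
        self.etag = 1                       # changes with each update

    def get(self):
        return self.state, str(self.etag)

    def conditional_put(self, new_state, if_match):
        if if_match != str(self.etag):
            return 412                      # Precondition Failed: lock lost
        self.state = new_state
        self.etag += 1                      # tracks the resource, not clients
        return 200

account = Resource("balance=10")
_, etag = account.get()
assert account.conditional_put("balance=20", if_match=etag) == 200
assert account.conditional_put("balance=30", if_match=etag) == 412  # stale ETag
```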

One more example of a communication pattern that can be done within the RESTful statelessness constraint is the ability to grab a set of changes to a resource rather than refetch the entire representation. This may at first sound challenging, but consider a service that keeps a buffer of recent changes. It might keep the last hundred changes, or the last hour's worth, or use some other measure. Now, each time a client requests the set of changes from a defined point it can return this set. If the client requests changes that are too old, it can be directed back to the complete representation. Again this is stateless. The server will store the same number of changes and the same set of updates regardless of the number of clients. One? None? A thousand? Again, the server's consumption of limited memory resources is not affected by the number of clients that might be trying to stay synchronised with its resource state.
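The bounded change buffer can be sketched as follows. This is a hypothetical illustration of the pattern, assuming a sequence-number scheme of my own invention: the server keeps a fixed-size log of recent changes, and a client too far behind gets directed back to the full representation (signalled here by `None`).

```python
# Sketch of the bounded change buffer described above: the server keeps the
# last N changes regardless of client count; clients ask for changes since a
# sequence number and are sent back to a full fetch when too far behind.
# The class, method names, and None sentinel are illustrative assumptions.

from collections import deque

class ChangeLog:
    def __init__(self, capacity=100):
        self.changes = deque(maxlen=capacity)  # fixed cost, client-independent
        self.seq = 0

    def record(self, change):
        self.seq += 1
        self.changes.append((self.seq, change))

    def since(self, seq):
        oldest = self.changes[0][0] if self.changes else self.seq + 1
        if seq + 1 < oldest:
            return None                        # too old: refetch the full representation
        return [c for s, c in self.changes if s > seq]

log = ChangeLog(capacity=2)
for change in ("a", "b", "c"):
    log.record(change)
assert log.since(1) == ["b", "c"]
assert log.since(0) is None   # change "a" has been dropped; full fetch needed
```

The memory cost of `ChangeLog` is fixed by `capacity`, not by the number of clients polling it, which is exactly the property the statelessness constraint demands.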

Communication patterns prohibited: pub/sub and pessimistic lock

Two communication patterns that are excluded from a REST architecture are pessimistic locking and the publish/subscribe pattern. These are not stateless, and there are two immediate clues that tell us this. Firstly, there is a current client expectation and state that is reflected on the server side between requests. Secondly, the server will have a timeout in both of these cases to deal with a client silently going away. Let's dig into these scenarios a bit further.

A pessimistic lock requests that the service prevent modification of the state behind a resource until the lock is released. This will typically involve some kind of LOCK request being sent to the server, and once the lock is created the client can go about its business before eventually submitting a request with its new intent for the resource or resources. Pessimistic locking is often needed more as systems scale to avoid the race conditions that can result from an optimistic locking approach. Importantly, pessimistic locks are typically the foundation of transactions both across multiple resources exposed by a service and across multiple services. These transactions can make the difference between a particular service composition being possible or impossible.

Unfortunately, the lock maintained by the service demonstrates all of the classic features of application state. The number of locks increases with the number of concurrent clients. Each lock consumes server-side resources and in this case also restricts concurrency between clients. Server-side state and client-side state is synchronised, with both client and service having knowledge of the lock between requests. Finally, the server side will always have to maintain the lock in terms of a time-limited lease. If a client goes away without releasing its lock, the server needs to eventually clean up the lock resources or become unavailable. Giving a pessimistic lock a resource identifier and allowing a DELETE request to be issued to it does not alter these conditions.

Publish/subscribe suffers a similar fate. If I SUBSCRIBE to a resource, I create a subscription state object somewhere within the service that I as a client depend on between requests. The server calling me back is nothing to be concerned about, but the server-side subscription state is application state. Again the number of subscriptions increases with the number of concurrent clients, even between requests. Each subscription consumes server-side resources, server-side state and client-side state is synchronised between requests, and subscriptions will always need to be obtained on a time-limited lease to facilitate server-side cleanup. Roy Fielding explicitly argues against the use of pub/sub on the Web, so you can go and read about this trade-off in his own words.

Bending REST Statelessness

Of all the REST constraints, statelessness and the communication patterns it prohibits grate most strongly at the enterprise scale. This isn't the Internet, where hordes of clients can suddenly appear and knock your service off the Web. There aren't enough clients to warrant extreme attention to scalability. Existing enterprise systems exhibit the banned patterns, lending credence to arguments that statelessness between requests is overly strict in this context.

This argument has merit. REST was designed for the Web, and doesn't particularly have the concerns of service compositions as its core objectives. REST seeks to make things work on the big scale, and for that you need to make some harsh decisions. On the Web it can make sense to require clients to actively poll a server for new updates, rather than maintain subscriptions on their behalf. The cost of extra messages and extra delay is often less than the cost of maintaining per-client state on the server side. The enterprise has a different set of trade-offs that will often shift this balance the other way.

From a REST purist perspective we can tolerate stateful interactions only within the boundaries of a single service or a single client. It is perfectly OK to wrap up non-conforming interactions within a service boundary, so long as the interface it exposes does not include these features. In the same way as a WS-* interface can wrap up a legacy system for simpler and more unified access, a REST interface can wrap up a set of WS-* services for simpler access again and more scalable interactions on the larger scale.

Architects that choose instead to break the REST constraint may certainly do so, but should not claim their architecture is RESTful. It might be REST with some exceptions, and this "with exceptions" approach is probably the right one for most enterprise architectures. Some exceptions even appear on the Web itself. Bear in mind that by making exceptions reliability and scalability will be impacted, but with sufficient money thrown at the problem a moderately large architecture will not suffer too many adverse effects. As always: Understand the consequences of your decisions, and think for yourself.

Conclusion

It is my opinion that constraints will be bent and specifications that bend REST constraints will be more widely accepted as HTTP and REST thinking become more integral parts of the Enterprise. HTTP in the future will have to serve the dual masters of Web and Enterprise, and find the least perverse ways of matching the two. We have already seen one attempt to fork HTTP for the enterprise in the form of the Web Services stack. The lack of synergy between the Web and WS-* has since rebounded back on its authors, and many of us are back at this point of having to decide whether to embrace and extend HTTP or to produce yet another alternative.

I tend to think that both the Web and the Enterprise will be well served by a minimal set of extensions to HTTP that support some features currently common at the enterprise scale. In particular, I see cross-service transactions and publish/subscribe as important in some use cases. I would rather define these in a way that breaks REST and is synergistic with HTTP than to start again from scratch. I realise that by adding non-compliant features to HTTP we risk that these will leak out onto the web in a way that harms the features of Web architecture, however I see this risk as moderate compared to the potential benefit of bringing the Web and Enterprise into rough protocol alignment. Whether it is the job of the w3c, IETF, or another body to build these non-conformant synergistic HTTP extensions is not quite clear to me. If this sort of work can't be housed within the IETF I suspect a new standards body would have to be formed along the lines of OASIS.

Benjamin

Wed, 2009-May-27

Specification for a REST Asynchronous Request class

What should a client implementation of a REST interface look like in order to take full advantage of the evolutionary mechanisms built into REST directly and more broadly into the Web? More importantly, how should clients be written to best benefit the architecture as a whole and permit gradual evolution of both media types and resource identifiers? I thought I would put a few words together to specify what this should all look like.

A Single Asynchronous Request

I'm going to position this specification in terms of an asynchronous request that a client wishes to issue to a server. If you prefer, you can think of this as a single asynchronous request that a service consumer issues to a service. From a SOA perspective some base part of the URL (often the authority of the URL) will identify the service. Anything past that point is a fine-grained identifier for a resource the service owns, and each resource presents a uniform interface consisting of media types only from the centralised schema inventory and methods only from the centralised method inventory.

I am framing this in terms of a class that models a single request because I want to be clear what should generally be a responsibility of calling code versus what should be handled automatically and efficiently by the REST client framework. This single request should be essentially as easy to invoke as any capability of a SOA service, and just as easy to understand despite the magic.

Objectives

The main objectives of this class are to correctly handle retries, redirection, and content negotiation. Data folding is also a potential bonus. Retries allow for reliably issuing requests, but are only appropriate for idempotent and safe requests. Redirection allows the set of resource identifiers in the architecture to evolve over time as services are upgraded without requiring client upgrade. Content negotiation allows services and their consumers to be upgraded independently without breaking compatibility with each other as particular media types are deprecated and eventually phased out over the lifetime of the architecture.

These are all features designed to support run-time discovery of resources (known as hyperlinking) and therefore run-time discovery of services by clients. This discoverability is designed to continue working over the lifetime of the architecture without requiring mass-redeployment or mass-reconfiguration of components. Each component continues to do its job, following and directing clients to follow links as required, and stating its own capabilities to the extent necessary for components around them both old and new to continue interoperating with them.

A Simple Class

Let's assume that we have a HTTP implementation that is able to make individual requests on our behalf. I say HTTP not because it is the only RESTful protocol, but because it is a common and reasonably exemplary one. It incorporates many features from Roy's specification alongside a few Web-specific features.

The class we will look at initially covers most of our objectives. It will deal with retry, redirection and data folding. I will leave content negotiation for an advanced version of the class.

Simple AsyncRequest class

This at first looks simple: As a client object, you construct an AsyncRequest with the URL of your resource. Once constructed you invoke a request with the required method, plus an optional media type and content to send with the message. Media type and content can both be treated as strings (or byte arrays) and would be the body of a HTTP request. Method is probably a simple string, but more complex requests might require the inclusion of additional header information.

When the response is received, AsyncRequest will invoke a callback on the client. Code is probably a numeric HTTP code, however it should at least be easily convertible into an easy success/fail status and as with method may have associated headers. You might use a code class that can return this information from a function for easy synthesis while retaining the ability to log full response details for analysis in the case of failure. Media type and content are again strings or byte arrays, and have the same semantics as they do in requests.
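One possible shape for this interface is sketched below in Python. The constructor arguments, field names (taken from the state variables described later), and callback signature are my own reading of the prose, not an established API; the transport is injected as a plain function so the sketch stays self-contained.

```python
# A sketch of the AsyncRequest interface described above. Names and the
# callback signature are illustrative assumptions, not a real framework.

class AsyncRequest:
    def __init__(self, url, on_response, transport):
        self.my_url = url                # may change on a temporary redirect
        self.my_configured_url = url     # changes only on a permanent redirect
        self.my_proxy = None
        self.on_response = on_response   # callback(code, media_type, content)
        self.transport = transport       # function(method, url, media_type, content)

    def request(self, method, media_type=None, content=None):
        """Issue the request; retries and redirects would happen in here."""
        code, out_type, out_content = self.transport(
            method, self.my_url, media_type, content)
        self.on_response(code, out_type, out_content)

responses = []
req = AsyncRequest(
    "http://example.com/resource",
    on_response=lambda code, mt, body: responses.append(code),
    transport=lambda m, u, mt, c: (200, "text/plain", "hello"))
req.request("GET")
```

Injecting the transport also makes the class testable without a network, which matters once retry and redirect logic starts living inside `request`.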

Behaviour

The AsyncRequest class will attempt to make its request to the service over a TCP transport. Failures in the transport can be modelled as HTTP response codes. Failure to connect can be modelled as 503 Service Unavailable, as the request is known not to have been processed. Failure after connection can be modelled as 504 Gateway Timeout, where it cannot be discerned whether the request was processed or not.

On initial construction, myURL and myConfiguredURL are both set to the specified URL. myProxy is cleared unless an explicit proxy is configured. The myURL value normally determines which DNS name or IP address to connect to at the TCP level. However, this is overridden by myProxy if it is present. The myURL value is sent as part of each request, including retries. myURL and myProxy may be modified by temporary redirection codes, while myConfiguredURL is only modified in the case of permanent redirection. At the end of any failed or successful request myURL is set back to myConfiguredURL and myProxy is either cleared or returned to a configured value.

Various HTTP responses demand different actions: return (success or fail), retry, modify and retry, or sleep and retry.

Code | Action
100 Continue | Continue request; only returned if Expect: Continue was in the request
101 Switching Protocols | Continue request; only returned if an Upgrade header was in the request
1xx Other Informational | Return, failure
200 OK | Return, success
201 Created | Return, success (include Location header)
202 Accepted | Return, success (so far)
2xx Other Successful | Return, success
300 Multiple Choices | Return, failure (no mechanism exists to support this code)
301 Moved Permanently | Modify myURL and myConfiguredURL to match Location header and retry
302 Found | Modify myURL only to match Location header and retry
303 See Other | Modify myURL only to match Location header, set myMethod to GET, and retry
304 Not Modified | Return, success
305 Use Proxy | Modify myProxy only to match Location header and retry
307 Temporary Redirect | Modify myURL only to match Location header and retry
3xx Other Redirection | Return, failure
400 Bad Request | Return, failure
401 Unauthorised | Return, failure (you should almost certainly be using SSL if authentication is important, although a special class to handle challenges could be implemented)
404 Not Found | Return, success if request was DELETE, otherwise failure
408 Request Timeout | Retry
410 Gone | Return, success if request was DELETE, otherwise failure
416 Requested Range Not Satisfiable | Modify Range header according to Content-Range header and retry
417 Expectation Failed | Drop expectation if possible and retry, otherwise return failure
4xx Other Client Error | Return, failure
500 Internal Server Error | Return, failure
503 Service Unavailable | Sleep for indicated time and retry
504 Gateway Timeout | Retry if request is safe (GET) or idempotent (PUT or DELETE); otherwise return failure
5xx Other Server Error | Return, failure

The handling of these codes requires some rewriting and retrying of requests, and also potential sleeps. However, this can be handled transparently by the AsyncRequest class for the most part and the client does not have to be concerned. This can all happen behind the scenes, and therefore consistently across different clients and services to benefit the overall flexibility and evolution of the architecture.
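The core of that dispatch can be sketched as a classifier function. The action names here are my own shorthand for the behaviours in the table, and a few of the rarer codes are omitted; a real implementation would also rewrite myURL, myConfiguredURL, or myProxy from the Location header rather than just naming the action.

```python
# A sketch of the code-to-action dispatch in the table above. Action strings
# are shorthand for the behaviours described; some rarer codes are omitted.

def classify(code, method):
    if code == 301:
        return "redirect-permanent"      # also update myConfiguredURL
    if code in (302, 303, 307):
        return "redirect-temporary"      # update myURL only
    if code == 305:
        return "use-proxy"
    if code == 304:
        return "success"
    if code == 408:
        return "retry"
    if code == 503:
        return "sleep-and-retry"
    if code == 504:
        # retry only when no side effect can be duplicated
        return "retry" if method in ("GET", "PUT", "DELETE") else "fail"
    if code in (404, 410):
        return "success" if method == "DELETE" else "fail"
    return "success" if 200 <= code < 300 else "fail"

assert classify(503, "GET") == "sleep-and-retry"
assert classify(504, "POST") == "fail"
assert classify(410, "DELETE") == "success"
```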

Data folding is the concept that it is beneficial to skip intermediate states in favour of a correct current state. Data folding for GET requests still relies somewhat on the class using this request. That class would generally start the first request, then if it decides it needs another request it notes this fact for itself while waiting for the response. If yet another need to send a request pops up while the first is outstanding, it does not need to be noted. The client will send its queued GET request as soon as the current one returns. We generally don't want to cancel a GET request in this context, in case we get ourselves into an infinite loop of continuously cancelling requests while they are still outstanding.

Data folding also applies to PUT and DELETE requests, each of which is designed to completely replace the effect of the previous request on the same resource. As such, if we have a PUT or DELETE request in progress and another comes in we can simply queue the latest of these requests for a given URL. This allows us to convey our latest intent for the state of a resource without unnecessary delay attributable to intermediate states.

Through both of these data folding techniques we are removing unnecessary queuing within the architecture that can lead to increased system latency and eventual system meltdown. In fact, simply through the use of an object to model the request state we can easily keep track of how many of these objects and their corresponding requests we currently have outstanding and keep this queue under control as well. Data folding support can be wrapped up in its own class for the convenience of clients, or possibly even into the main request class.
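The PUT/DELETE folding described above can be sketched as a small queue class. This is an illustration under my own naming, not part of the AsyncRequest specification itself: while a request for a URL is in flight, only the most recent pending intent for that URL is retained.

```python
# Sketch of PUT/DELETE data folding as described above: while one request is
# in flight for a URL, only the most recent pending intent is kept, so
# intermediate states are skipped. Names are illustrative assumptions.

class FoldingQueue:
    def __init__(self):
        self.in_flight = set()   # URLs with an outstanding request
        self.pending = {}        # URL -> latest (method, content)

    def submit(self, url, method, content):
        if url in self.in_flight:
            self.pending[url] = (method, content)  # fold: replace older intent
            return None                            # nothing to send right now
        self.in_flight.add(url)
        return (method, content)                   # send immediately

    def completed(self, url):
        self.in_flight.discard(url)
        return self.pending.pop(url, None)         # next request to send, if any

q = FoldingQueue()
assert q.submit("/a", "PUT", "v1") == ("PUT", "v1")
q.submit("/a", "PUT", "v2")   # folded away by the next submission
q.submit("/a", "PUT", "v3")
assert q.completed("/a") == ("PUT", "v3")   # only the latest intent is sent
```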

Adding content negotiation

Advanced AsyncRequest class, supporting content negotiation

In order to correctly support content negotiation we must give over our content to the AsyncRequest class in an encoding-neutral form. I have shown this as an abstract class in the above diagram called "Encoder". What this does is allow request content such as that from a PUT to be encoded as several possible media types. A default is typically selected and the content encoded in that form for transmission. However, if this type is not acceptable to the server we will hopefully get back enough information to retry the encoding with an acceptable type. An example where this kind of thing would be useful would be for an architecture transitioning from RSS to atom news feeds. An upgraded client may try to PUT an atom article to its server, but the server only accepts RSS. The server rejects the initial PUT request (perhaps with some expect-continue going on behind the scenes) and informs us (hopefully through an Accept header in the response or similar) that it does support RSS. We ask our encoder to format the document in the legacy RSS form, and can continue our operation to a success state.

On the return side we have a Parser class to interpret responses. Its set of acceptable types is first interrogated as we send a request, and this information included for transmission to the server. A correctly-implemented server will return us a document in one of the acceptable types, and that document will be passed through the Parser on its way back to the advanced client. The client cares only about the information contained in the document, not in the encoding format. Therefore, the content is parsed into a common data structure appropriate for Client processing.

This picture is a little more complicated than the simple picture described previously, and could be simplified slightly. For example, the Client could incorporate Parser and pass its list of acceptable types directly into the AsyncRequest request method. However even in this structure it is likely that you would want to separate out the parser code so that it could be used by multiple clients that share the same required data structure. This is particularly the case for very simple types such as numbers and strings that may be able to be easily extracted from a number of different media types.
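The Encoder side of this negotiation can be sketched as follows. The class shape, media type strings, and the fallback flow are simplified illustrations of the prose, not a real framework; a full version would drive the retry from the server's rejection response rather than knowing the acceptable set up front.

```python
# A sketch of the Encoder half of the content negotiation described above.
# Class names, the stand-in encoding, and the negotiation flow are
# illustrative assumptions, not a real framework API.

class Encoder:
    """Encodes one document into whichever media type the server accepts."""

    def media_types(self):
        # Preferred (newer) type first, legacy fallback second.
        return ["application/atom+xml", "application/rss+xml"]

    def encode(self, document, media_type):
        return f"[{media_type}] {document}"        # stand-in for real encoding

def put_with_negotiation(document, encoder, server_accepts):
    for media_type in encoder.media_types():       # try preferred type first
        if media_type in server_accepts:
            return media_type, encoder.encode(document, media_type)
    return None, None                              # no common type: fail

enc = Encoder()
media_type, body = put_with_negotiation("story", enc, {"application/rss+xml"})
assert media_type == "application/rss+xml"   # fell back to the legacy type
```

The Parser side is the mirror image: its list of acceptable types populates the request's Accept header, and the response body is decoded into one common data structure before the client sees it.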

Conclusion

The important features of HTTP in support of REST architectural constraints and objectives should be implemented consistently across all HTTP clients. The architecture as a whole suffers if they are missing or difficult to use as evolution requires simultaneous upgrade of multiple components. Supporting redirection and content negotiation at the very least, plus sleeping when a service is under load and observing other responses means that we are more flexible in how we can modify and operate our systems both large and small.

A key to success in this area is to make the client implementation as simple as possible, requiring essentially zero additional effort to support these features. A good interface design in this area can lead to better architectural outcomes.

Benjamin

Tue, 2009-May-05

The text/plain Semantic Web

Perhaps the most important media type in an enterprise-scale or world-scale architecture is text/plain. The text/plain type is essentially schema free, and allows a representation to be retrieved or PUT with little to no jargon or domain-specific knowledge required by server or client. It is applicable to a wide range of problems and contexts, and is easily consumed by tools and humans alike.

Uses of text/plain

In essence, this type conveys a string. However, we can also think about embedding numbers or other simple data types. The modern dynamic language approach to looking at strings is to allow implicit conversion between the information inserted by the sender and the type expected by the consumer. These values can easily be incorporated into programming language data types, inserted into databases, spreadsheets, reports, or other structures.

To outline a few potential uses of text/plain, consider the following interactions

Standards and compatibility

While formatting of numbers and other types may seem natural enough, it is important that this be done consistently if the information is to remain legible when it is processed. To my mind the best resource in formatting and processing of simple text-compatible data types can be found in the specification for XML Schema (XSD). Part 2 contains a section on built-in datatypes that covers a range of string, numeric, URI, date and time, and other simple types. Any data that can be formatted according to the rules in this section absolutely should be.

However, this leads to a dilemma. What do we do with types that are not found in this set? Should a geo-location become a structured XML document, or should it too be coded as text/plain? rfc2426 defines a semi-colon-separated standard format for geo-location, which could certainly be coded as text/plain. However, it is not clear at this stage that this is or will be the canonical way of encoding this information as a text/plain document. Without reference to applicable and universal standards we bear a significant risk that the partially-formatted content we transfer will in fact not be understood.

Applicability of text/plain MIME type

Part of the problem that emerges is that text/plain is not specific enough. It doesn't have sub-types that are clearly tied to a specification document or standards body. This makes interoperability a potential nightmare of heuristic detection.

Unfortunately, while XSD provides an excellent catalogue of basic types it is neither comprehensive nor sufficiently connected to MIME usage. Another problem with using text/plain in its bare form is its default assumption of a US-ASCII character type. This can lead to obvious problems in a modern internationalised world.

Without being backed by some kind of standards body, the advice I give in this regard is merely that. Standards may emerge later that contradict what I have to say here. That said, my advice is this:

  1. Treat text/plain content as being formatted according to XSD conventions when you receive it. Take care to process character encoding directives correctly and support at least a utf-8 encoding.
  2. Consider using a text/xsd+plain document type when transmitting XSD-formatted simple content. This will hopefully indicate that the document can be understood as text/plain, but provide additional context if more complex processing is applied to the document.
  3. Make use of other specialised types that indicate the standard being applied when types outside of the XSD set are employed. For example, the geo coordinates above might be described as text/vcard+plain.
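The first piece of advice above can be sketched in a few lines. This is a hypothetical helper of my own, not part of any standard library for this purpose: it formats common values using XSD built-in datatype conventions (lexical booleans, ISO 8601 dates) so that a text/plain representation is unambiguous to the receiver.

```python
# Sketch of formatting simple values consistently as text/plain, following
# XSD built-in datatype conventions. The helper name is mine; it is not part
# of any standard library for this purpose.

from datetime import date

def to_text_plain(value):
    if isinstance(value, bool):            # check bool before int: bool is an int
        return "true" if value else "false"   # xsd:boolean lexical form
    if isinstance(value, date):
        return value.isoformat()              # xsd:date, e.g. 2009-05-05
    return str(value)                         # numbers and strings as-is

assert to_text_plain(True) == "true"
assert to_text_plain(date(2009, 5, 5)) == "2009-05-05"
assert to_text_plain(42) == "42"
```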

Again, ideally we would be making use of a well-defined standards body to own and maintain the media types used to communicate very basic information. Making up your own can only take the state of the art so far. However, standards sometimes emerge out of common best practice... so it is not a complete waste of time to be heading down this particular path.

When not to use text/plain

It should be clear that text/plain is not a tool for every occasion. It is often important to sample or send an atomic set of data that would require additional schema. Plain text when overused can lead to performance problems as individual values are sampled one by one instead of as a consistent and coherent document.

Perhaps the clearest indication that you are overusing text/plain is that you are experiencing an explosion in hyperlinks. When you start to need a document to provide links for consumers to find these text/plain-centric resources, you should probably consider incorporating the information directly into these documents themselves.

Used appropriately to transfer information to and from well-known and stable resources, text/plain or its variants can be an efficient way to communicate simple data without introducing unnecessary jargon. The URI of the resource and the implementation of client and server will provide sufficient context to format and process these simple data types.

The low barrier to entry to these types makes them universally applicable and easy to work with, however the lack of standardisation around matching encodings to media types is an inhibitor to their potential uptake. Used well, especially in combination with link headers and/or text/uri-list these types can provide an effective way to make your protocols get out of the way of communication and let clients and servers interoperate with minimal complexity for simple use cases.

Benjamin

Mon, 2009-Apr-27

REST Service Description

The lack of a service description language for REST services has caused some confusion in camps with a history of CORBA and SOAP. WADL has attempted to fill the gap, but has received a mixed response amongst theorists and developers.

A REST architecture by definition has a Uniform Interface. The uniform interface is made up of a common "standard" set of application protocol messages such as those defined by HTTP, as well as a common "standard" set of media types such as html and atom. The web attributes fairly low-level semantics to these media types. A more machine-centric REST architecture such as an enterprise architecture or a semantic web will have a richer set of types that incorporate sufficient semantics for the particular problem domain.

The method and media type facets of a uniform interface are generally heavy on human-understandable specifications. A REST architecture can generally tolerate this because there will be a much smaller set of specifications required than in a conventional enterprise architecture. Existing tools can be employed where this becomes unmanageable, for example XML schema.

The final facet of a REST service interface is the set of resource identifiers, and the detailed semantics of issuing specific requests such as GET and PUT upon these resources. In REST theory the structure of these identifiers must be uniform, but the set is expected to evolve over the lifetime of the architecture. The set of URLs on the Web is governed by individual service owners, and not by the IETF or the W3C.

The argument against a REST-specific interface description language is that the uniform interface of a REST architecture is already well served by existing human-readable and machine-readable technologies. Consumers of REST services should be coupled to the uniform interface at design time, then free to follow hyperlinks at run-time to interact with whatever services might be present in the architecture. The danger in promoting service description languages, and consumer-visible service description in general, is that we will start introducing service-specific dependencies into consumers at development time. The natural outcome of this is tight coupling and failure to meet various architectural objectives. In particular, the ability of the architecture to withstand incremental upgrade over a long period of time can easily be compromised by service-specific knowledge becoming encoded into clients.

The main use case for a service interface description language is at design time. This means that ideally we would not expose this language to consumers of our REST services at all. Two basic use cases exist for such a language:

  1. To provide mass-publication of URLs to clients at run-time, and
  2. To assist in developing and analysing service properties at development time

The first use case is fraught with problems both theoretical and practical. Firstly we have the risk that such a document will be used at design-time within clients, limiting the potential longevity of the architecture as a whole. Secondly, a language expressive enough for a machine to understand at run-time is either going to be extremely complicated or have fairly narrow applicability.

The second use case is more compelling. Here we are really looking into the computer-aided development of REST services without contravening architectural constraints. This is not to be discouraged. Here it is useful to consider what a REST service architect would likely do in developing a service.

We can safely assume in general that the uniform interface is designed, and any changes required of it to introduce our service are being negotiated in the appropriate forum. The service description language does not need to specify the generic meaning of methods or of media types. The service description can thus be reduced to a matrix:

URI Template                                Method  Media Types                      Semantics
http://example.com/invoice/{invoice}        GET     application/invoice+xml          Retrieves the invoice information set of the specified invoice
http://example.com/invoice/{invoice}/paid   PUT     text/plain (xsd:boolean format)  Sets an invoice to paid or unpaid
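The same matrix could be rendered in WADL roughly as follows. This is a sketch rather than a complete document, using element names from the WADL submission; the semantics column survives only as comments:

```xml
<application xmlns="http://wadl.dev.java.net/2009/02">
  <resources base="http://example.com/">
    <resource path="invoice/{invoice}">
      <param name="invoice" style="template"/>
      <!-- Retrieves the invoice information set of the specified invoice -->
      <method name="GET">
        <response>
          <representation mediaType="application/invoice+xml"/>
        </response>
      </method>
      <resource path="paid">
        <!-- Sets an invoice to paid or unpaid (xsd:boolean format) -->
        <method name="PUT">
          <request>
            <representation mediaType="text/plain"/>
          </request>
        </method>
      </resource>
    </resource>
  </resources>
</application>
```

Note how little of the description is service-specific: the methods and media types reference uniform-interface specifications, and only the URI templates and semantics belong to this service.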

Method and media type identifiers should be sufficient to reference a full specification of these facets of the uniform interface. The URI Template should describe one or more URIs that will be valid for the service or the set of services. Sufficient guidance on percent-encoding of characters should be available either implicitly or explicitly in order to guide developers and analysts. Semantics may require more detail, and should support similar descriptions to those that might be found in a javadoc or doxygen function description.
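The percent-encoding guidance matters in practice. A naive template expansion that encodes reserved characters might look like the following sketch (the URI Template drafts define the full expansion rules; the invoice value here is a hypothetical example):

```python
from urllib.parse import quote

def expand(template: str, **params: str) -> str:
    """Expand a URI Template, percent-encoding all reserved characters
    in each substituted value. A sketch only; the URI Template drafts
    define many more expansion forms."""
    uri = template
    for name, value in params.items():
        uri = uri.replace("{" + name + "}", quote(value, safe=""))
    return uri

url = expand("http://example.com/invoice/{invoice}", invoice="2009/042")
# the embedded slash is encoded so it cannot alter the path structure
```

Without this rule stated somewhere, two implementations can disagree on whether "2009/042" names one invoice or a nested path.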

It is important that space exists to specify alternate document types to allow for evolution of the architecture's media types. For example, a service that once supported rss may need to evolve to additionally support atom in order to support both old and new service consumers. Different options in this column indicate a requirement to support content negotiation to return data in the specified format.

The addition of new service-specific methods and media types should be looked upon as part of the gradual evolution of the REST architecture. Their specifications should not appear as part of the service specification, but instead be referenced by it. On the Web this means going through the standards bodies that govern the Web. In an enterprise this means going through the appropriate enterprise architecture team.

In some cases your service will be the first on the Web to make use of a particular method or media type. In these cases you will need to make a decision about whether you intend to blend your extensions into the fabric of the Web architecture or whether you intend to co-host your architecture with the Web. In the former case we are again talking about standardisation through bodies such as the IETF. In the latter case we are potentially talking about setting up a new governing body that meets the needs of your particular problem domain.

Importantly, a REST service description never sets out to define service-specific methods or media types. One service does not a REST architecture make. A REST architecture always consists of a set of services and their consumers operating through a uniform interface that does not expose service-specific or consumer-specific details at design time.

A service description language in support of computer-aided development and analyst-friendly REST service description will consist of the fields indicated above. Semantics are indicated for a specific method, on a specific set of URIs, with a specific set of content types. The service-specific interface of a REST service is defined by these fields.

For developers of software that implements services, this service-specific interface specification can be an invaluable reference and a sound source of information for user manuals and other integrator-visible documentation.

Benjamin