Uniform Interface is the poster boy of the REST constraints. It attracts
much of the interest, and much of the controversy. Less often discussed is the
equally important and far more controversial
Stateless Constraint.
Specifically, the constraint is that services in a REST-style architecture are
stateless between requests. They don't carry any session state on behalf of
clients that aren't currently in the process of making requests.
Services are typically near the core of a network. Network cores often have
great storage, bandwidth, and compute resources but also great demands on
these resources. Services are responsible for handling requests on behalf of
their entire base of consumers, which on any large network will be a
significant set. Nearer to the edge of the network are the consumers.
Paradoxically, the cheap desktop hardware and networking equipment at this
network edge typically offer more aggregate spare capacity than is available
near the network core, where big-iron scaling solutions are being employed.
This is due to the sheer number of consumers out there, typically orders of
magnitude more than exist in a data centre. Spare resources are relatively
fast, large, and responsive nearer to the ultimate user of the system.
On the down-side, nodes near the edge of a network tend to be less reliable
and more prone to unwanted manipulation than those near the network core.
While big-iron scaling solutions near the core are important, any architecture
that really scales will be one that seeks to make use of the resources available
near the network edge. Roy envisages a
RESTful
architecture, where most
consumers are in a "REST" state most of the time. This is a concept
intrinsically linked to statelessness, as well as, more obliquely, to notions of code
on demand, cache, and the Web's principle of least power.
Stateless
The first step towards a REST scalability utopia is to move as much storage
space from services to the edge of the network as possible. This is a balancing
act. You don't normally want to move security-sensitive storage to the edge of
the network, nor store any information that you have promised to keep in the
less-reliable edge nodes of the network. There is also some state associated
simply with underlying transport protocols such as TCP that cannot be
eliminated. However, the less information the service stores, the better it
will be able to cope with the demands
of its consumers. REST sets the bar for statelessness at the request level:
No session state needs to be retained by the service between requests in a REST
architecture for normal and correct processing to occur. The service can forget
any such state and will still understand the consumer's request within the
session.
The scalability effect of this constraint is that session state is moved
back to the service consumer at the end of each request. Any session state
required to process subsequent requests is included in those requests.
The session state flows tidally to and from the service rather than being
retained within
the service. Normal service state (information the service has promised to
retain) still resides within the service and can be read, modified, or added
to as part of normal request processing.
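That tidal flow can be sketched in code. The following is a minimal hypothetical example in Python (the pagination scenario, the cursor, and all names are mine, not part of any real service): session state travels in every request and response, while the service retains only service state.

```python
# Hypothetical sketch: a stateless pagination handler. The session
# state (the cursor) travels in each request and flows back in each
# response; the service retains only service state.

SERVICE_STATE = [f"record-{i}" for i in range(100)]  # data the service promised to retain

def handle_page_request(cursor: int, page_size: int = 10) -> dict:
    """Process one request using only the state carried in that request."""
    page = SERVICE_STATE[cursor:cursor + page_size]
    # The updated session state flows back to the consumer, who will
    # include it in the next request. Nothing is remembered here.
    return {"items": page, "next_cursor": cursor + len(page)}

# The consumer drives the session and stores the cursor itself:
state = {"next_cursor": 0}
first = handle_page_request(state["next_cursor"])
state["next_cursor"] = first["next_cursor"]
second = handle_page_request(state["next_cursor"])
```

The service can forget everything between the two calls, yet each request is still fully understood, because the request itself carries the session.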
I have written before
about the difference between session state and service state, so I won't go
over that ground again today.
Applying this constraint has positive and negative effects. On the plus side,
the service need only provision storage capacity sufficient to deal with its own
service state. It no longer has to
deal with a unit of session state for each currently-active service consumer.
The service can control the rate at which it processes requests and only has
to cope with the session storage requirements of those it is currently
processing in parallel. It may have a million currently-active consumers, but
if it is only processing ten requests at a time then its session storage
requirements are bounded to ten concurrent sessions. The other 999,990
sessions are either
stored within the related service consumer or are currently in transit between
the service and related consumer. Sessions are expensive for services to store,
but cheap for consumers. Session state also typically becomes invalid when the
consumer terminates, so a session lost at that point usually causes no harm.
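The bounded-storage arithmetic above can be sketched as well. This is a hypothetical Python illustration (the semaphore approach and the numbers are my own choice, not anything prescribed by REST): by capping in-flight requests, the service caps how much session state it ever holds at once.

```python
import threading

# Hypothetical sketch: a stateless service bounds its session storage
# by bounding concurrency. Session state exists inside the service only
# while a request is actually being processed.

MAX_IN_FLIGHT = 10  # the service's chosen processing rate
in_flight = threading.BoundedSemaphore(MAX_IN_FLIGHT)

def process(session_state: dict) -> dict:
    # Block until one of the MAX_IN_FLIGHT slots is free. At most ten
    # copies of session state are held here at once, regardless of how
    # many consumers exist in total.
    with in_flight:
        result = {"processed": session_state}
    # The session state flows back to the consumer; the service forgets it.
    return result
```

A million consumers can exist, but the service's session storage never exceeds ten units.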
The negative impacts of statelessness include the extra bandwidth usage for
that tidal flow of state, as well as the prohibition of really useful patterns
such as publish/subscribe and pessimistic locking. If the service is able to
forget a subscription or forget a lock, then these patterns really don't work
any more. These patterns are stateful and force a centralisation of state back
to services near the network core.
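To see why pessimistic locking is stateful, consider a minimal hypothetical sketch in Python (the resource names and functions are invented for illustration): the lock table is session state that the service must hold between requests, on behalf of consumers that are not currently making one.

```python
# Hypothetical sketch: why pessimistic locking violates statelessness.
# The locks dict is session state retained between requests on behalf
# of a consumer that may not currently be making a request.

locks: dict = {}  # resource -> consumer currently holding the lock

def acquire(resource: str, consumer: str) -> bool:
    if resource in locks:
        return False
    locks[resource] = consumer  # state retained after the request ends
    return True

def release(resource: str, consumer: str) -> None:
    if locks.get(resource) == consumer:
        del locks[resource]

ok = acquire("orders/1", "alice")       # alice takes the lock
blocked = acquire("orders/1", "bob")    # bob is refused

# If the service forgets `locks` (a crash, a restart, a failover),
# alice still believes she holds a lock that no longer exists.
```

A subscription list for publish/subscribe has exactly the same shape: state the service must not forget, centralised near the network core.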
Cache
Caching
is often talked about as a scalability feature of REST. However, it
exists primarily to counter the negative effects of statelessness on the
architecture. Statelessness introduces additional bandwidth requirements between
services and consumers as session state is transferred more frequently, and
we may have additional processing overhead on the service to deal with
consumers polling for updates when they previously could have made use of
a stateful event-based message exchange pattern. Caching seeks to eliminate
both problems by eliminating redundant message exchanges from the architecture.
This reduces bandwidth usage as well as service compute resources down to the
minimum possible set, ensuring that the stateless architecture is a feasible
one.
A cache positioned within a service consumer reduces latency for the client
as it makes a series of network requests, some of which will be redundant.
The cache detects redundant requests and reuses earlier responses to respond
quickly in place of the service. A cache positioned at a data centre or
network boundary is principally concerned with reducing bandwidth consumption
due to redundant requests. A cache positioned within the service itself is
primarily concerned with reducing processing overhead due to redundant requests.
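A consumer-side cache of the kind described can be sketched as follows. This is a hypothetical Python illustration (the class, the TTL policy, and the fetch function are my own simplifications, not real HTTP cache semantics): earlier responses are reused for redundant requests so the service is never contacted.

```python
import time

# Hypothetical sketch of a consumer-side cache: reuse earlier
# responses for redundant requests instead of calling the service.

class CachingClient:
    def __init__(self, fetch, ttl_seconds: float = 60.0):
        self._fetch = fetch   # the real (expensive) request function
        self._ttl = ttl_seconds
        self._cache = {}      # url -> (expiry time, response)

    def get(self, url: str):
        entry = self._cache.get(url)
        if entry and entry[0] > time.monotonic():
            return entry[1]          # redundant request: answer locally
        response = self._fetch(url)  # cache miss: go to the service
        self._cache[url] = (time.monotonic() + self._ttl, response)
        return response

calls = []
client = CachingClient(lambda url: calls.append(url) or f"body of {url}")
client.get("/report")
client.get("/report")  # served from cache; the service is not contacted
```

The same structure, positioned at a network boundary or inside the service, trades latency savings for bandwidth or compute savings as described above.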
The Web's principle of least power and code on demand
Now that we have moved unnecessary storage requirements to the edge of the
network and reduced network bandwidth to a minimum, the obvious next step is
to try and reduce our service-side compute requirements. The Web offers its
standard approach of the
principle of least power.
This principle essentially says that if you provide information instead of a
program to run, consumers of your service will understand the content and
be able to process it in useful and novel ways. The compute implication of this
is that you will often be able to serve a static or pre-cached document to your
consumers with practically zero compute overhead. The service consumer can
accept the document, understand it, and perform whatever processing it
requires.
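A tiny hypothetical sketch of the principle in Python (the document contents and consumer functions are invented for illustration): the service serves a plain data document at near-zero compute cost, and different consumers process the same document in ways the service never anticipated.

```python
import json

# Hypothetical sketch of the principle of least power: the service
# serves a plain data document (cheap, even cacheable as a static
# file), and consumers apply their own compute to it.

DOCUMENT = json.dumps({"prices": [{"item": "apple", "cents": 50},
                                  {"item": "pear", "cents": 80}]})

def consumer_total(doc: str) -> int:
    # One consumer sums the prices...
    return sum(p["cents"] for p in json.loads(doc)["prices"])

def consumer_cheapest(doc: str) -> str:
    # ...another finds the cheapest item. The service anticipated
    # neither use; serving data rather than a program made both possible.
    return min(json.loads(doc)["prices"], key=lambda p: p["cents"])["item"]
```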
REST adds the concept of code-on-demand. While something of an
anti-principle-of-least-power, it serves more or less the same purpose as far
as scalability is concerned: It allows the service to push compute power
requirements out to the edge of the network. Instead of actually executing
the service composition, a BPEL engine could simply return the BPEL and let
the consumer execute it. Hell, it could happily drop the BPEL processor itself
into a virtual machine space offered by the service consumer and run it from
there. So long as there is nothing security-sensitive or consistency-sensitive
in the execution you have
just saved yourself significant compute resources over the total set of
capability invocations on the service. If you are lucky, the files will already
be cached when the consumer attempts to invoke your capability and the request
won't touch the service at all.
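The code-on-demand idea can be sketched in miniature. This is a hypothetical Python illustration (a real system would serve JavaScript or similar to a sandboxed runtime; the served code and function names here are invented): the service returns a small program instead of executing it, and the consumer spends its own compute running it at the edge.

```python
# Hypothetical code-on-demand sketch: the service returns code rather
# than a result; the consumer executes it locally with its own data.

SERVED_CODE = """
def summarise(records):
    return {"count": len(records), "total": sum(records)}
"""

def consumer_run(code: str, records):
    # The consumer executes the served code in its own process.
    # (A real system would sandbox this; none of that is shown here.)
    namespace = {}
    exec(code, namespace)
    return namespace["summarise"](records)

result = consumer_run(SERVED_CODE, [1, 2, 3])  # computed at the edge
```

If the served code is cacheable, repeat invocations need not touch the service at all, as the paragraph above notes.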
The Web's use of applets, JavaScript, HTML, and pretty much everything
else it can or does serve up demonstrates how compute resources can be delegated
out to browsers and other service consumers in order to keep services doing
what they should be doing: Ensuring that the right information and the right
processing is going on without necessarily doing the hard work themselves.
Conclusion
Between the offload of storage space offered by the REST stateless
constraint and the offload of compute resources offered by code on demand
and the principle of least power, REST significantly alters the balance of
resource usage between services near the core of the network and service
consumers nearer to the edge of the network. Service consumers place no
demands on the service's bandwidth, CPU, or storage except when they have requests
outstanding. Services are able to control the rate at which they process
requests, and the network itself controls the bandwidth that can be consumed
by requests and responses. Caching ensures this approach is
feasible in most circumstances for most applications.
If you are considering investing in additional
hardware scaling mechanisms, make sure you also consider whether applying these
architectural constraints would make a difference to the scalability of
your services.
Benjamin