Uniform Interface is the poster boy of the REST constraints. It attracts
much of the interest, and much of the controversy. Less often discussed is the
equally important and far more controversial
Stateless Constraint.
Specifically, the constraint is that services in a REST-style architecture are
stateless between requests. They don't carry any session state on behalf of
clients that aren't currently in the process of making requests.
Services are typically near the core of a network. Network cores often have
great storage, bandwidth, and compute resources but also great demands on
these resources. Services are responsible for handling requests on behalf of
their entire base of consumers, which on any large network will be a
significant set. Nearer to the edge of the network are the consumers.
Paradoxically, the cheap desktop hardware and networking equipment at this
network edge typically offer more aggregate spare capacity than is available
near the network core, where big-iron scaling solutions are being employed.
This is due to the sheer number of consumers out there, typically orders of
magnitude more than exist in a data centre. Spare resources are relatively
fast, large, and responsive nearer to the ultimate user of the system.
On the down-side, nodes near the edge of a network tend to be less reliable
and more prone to unwanted manipulation than those near the network core.
While big-iron scaling solutions near the core are important, any architecture
that really scales will be one that seeks to make use of the resources available
near the network edge. Roy envisages a
RESTful
architecture, where most
consumers are in a "REST" state most of the time. This is a concept
intrinsically linked to statelessness, as well as, more obliquely, to notions of code
on demand, cache, and the Web's principle of least power.
Stateless
The first step towards a REST scalability utopia is to move as much storage
space from services to the edge of the network as possible. This is a balancing
act. You don't normally want to move security-sensitive storage to the edge of
the network, nor store any information that you have promised to keep in the
less-reliable edge nodes of the network. There is also some state associated
simply with underlying transport protocols such as TCP that cannot be
eliminated. However, the less information the service stores, the better it
will be able to cope with the demands
of its consumers. REST sets the bar for statelessness at the request level:
No session state needs to be retained by the service between requests in a REST
architecture for normal and correct processing to occur. The service can forget
any such state and will still understand the consumer's request within the
session.
The scalability effect of this constraint is that session state is moved
back to the service consumer at the end of each request. Any session state
required to process subsequent requests is included in those requests.
The session state flows tidally to and from the service rather than being
retained within
the service. Normal service state (information the service has promised to
retain) still resides within the service and can be read, modified, or added
to as part of normal request processing.
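That tidal flow can be sketched in code. The following is a minimal hypothetical example in Python (the pagination scenario, the cursor, and all names are mine, not part of any real service): session state travels in every request and response, while the service retains only service state.

```python
# Hypothetical sketch: a stateless pagination handler. The session
# state (the cursor) travels in each request and flows back in each
# response; the service retains only service state.

SERVICE_STATE = [f"record-{i}" for i in range(100)]  # data the service promised to retain

def handle_page_request(cursor: int, page_size: int = 10) -> dict:
    """Process one request using only the state carried in that request."""
    page = SERVICE_STATE[cursor:cursor + page_size]
    # The updated session state flows back to the consumer, who will
    # include it in the next request. Nothing is remembered here.
    return {"items": page, "next_cursor": cursor + len(page)}

# The consumer drives the session and stores the cursor itself:
state = {"next_cursor": 0}
first = handle_page_request(state["next_cursor"])
state["next_cursor"] = first["next_cursor"]
second = handle_page_request(state["next_cursor"])
```

The service can forget everything between the two calls, yet each request is still fully understood, because the request itself carries the session.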
I have written before
about the difference between session state and service state, so I won't go
over that ground again today.
Applying this constraint has positive and negative effects. On the plus side,
the service need only provision storage capacity sufficient to deal with its own
service state. It no longer has to
deal with a unit of session state for each currently-active service consumer.
The service can control the rate at which it processes requests and only has
to cope with the session storage requirements of those it is currently
processing in parallel. It may have a million currently-active consumers, but
if it is only processing ten requests at a time then its session storage
requirements are bounded to ten concurrent sessions. The other 999,990
sessions are either
stored within the related service consumer or are currently in transit between
the service and related consumer. Sessions are expensive for services to store,
but cheap for consumers. Session state also typically becomes invalid when the
consumer terminates, so a session lost at that point usually causes no harm.
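The bounded-storage arithmetic above can be sketched as well. This is a hypothetical Python illustration (the semaphore approach and the numbers are my own choice, not anything prescribed by REST): by capping in-flight requests, the service caps how much session state it ever holds at once.

```python
import threading

# Hypothetical sketch: a stateless service bounds its session storage
# by bounding concurrency. Session state exists inside the service only
# while a request is actually being processed.

MAX_IN_FLIGHT = 10  # the service's chosen processing rate
in_flight = threading.BoundedSemaphore(MAX_IN_FLIGHT)

def process(session_state: dict) -> dict:
    # Block until one of the MAX_IN_FLIGHT slots is free. At most ten
    # copies of session state are held here at once, regardless of how
    # many consumers exist in total.
    with in_flight:
        result = {"processed": session_state}
    # The session state flows back to the consumer; the service forgets it.
    return result
```

A million consumers can exist, but the service's session storage never exceeds ten units.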
The negative impacts of statelessness include the extra bandwidth usage for
that tidal flow of state, as well as the prohibition of really useful patterns
such as publish/subscribe and pessimistic locking. If the service is able to
forget a subscription or forget a lock, then these patterns really don't work
any more. These patterns are stateful and force a centralisation of state back
to services near the network core.
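To see why pessimistic locking is stateful, consider a minimal hypothetical sketch in Python (the resource names and functions are invented for illustration): the lock table is session state that the service must hold between requests, on behalf of consumers that are not currently making one.

```python
# Hypothetical sketch: why pessimistic locking violates statelessness.
# The locks dict is session state retained between requests on behalf
# of a consumer that may not currently be making a request.

locks: dict = {}  # resource -> consumer currently holding the lock

def acquire(resource: str, consumer: str) -> bool:
    if resource in locks:
        return False
    locks[resource] = consumer  # state retained after the request ends
    return True

def release(resource: str, consumer: str) -> None:
    if locks.get(resource) == consumer:
        del locks[resource]

ok = acquire("orders/1", "alice")       # alice takes the lock
blocked = acquire("orders/1", "bob")    # bob is refused

# If the service forgets `locks` (a crash, a restart, a failover),
# alice still believes she holds a lock that no longer exists.
```

A subscription list for publish/subscribe has exactly the same shape: state the service must not forget, centralised near the network core.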
Cache
Caching
is often talked about as a scalability feature of REST. However, it
exists primarily to counter the negative effects of statelessness on the
architecture. Statelessness introduces additional bandwidth requirements between
services and consumers as session state is transferred more frequently, and
we may have additional processing overhead on the service to deal with
consumers polling for updates when they previously could have made use of
a stateful event-based message exchange pattern. Caching seeks to eliminate
both problems by eliminating redundant message exchanges from the architecture.
This reduces bandwidth usage as well as service compute resources down to the
minimum possible set, ensuring that the stateless architecture is a feasible
one.
A cache positioned within a service consumer reduces latency for the client
as it makes a series of network requests, some of which will be redundant.
The cache detects redundant requests and reuses earlier responses to respond
quickly in place of the service. A cache positioned at a data centre or
network boundary is principally concerned with reducing bandwidth consumption
due to redundant requests. A cache positioned within the service itself is
primarily concerned with reducing processing overhead due to redundant requests.
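A consumer-side cache of the kind described can be sketched as follows. This is a hypothetical Python illustration (the class, the TTL policy, and the fetch function are my own simplifications, not real HTTP cache semantics): earlier responses are reused for redundant requests so the service is never contacted.

```python
import time

# Hypothetical sketch of a consumer-side cache: reuse earlier
# responses for redundant requests instead of calling the service.

class CachingClient:
    def __init__(self, fetch, ttl_seconds: float = 60.0):
        self._fetch = fetch   # the real (expensive) request function
        self._ttl = ttl_seconds
        self._cache = {}      # url -> (expiry time, response)

    def get(self, url: str):
        entry = self._cache.get(url)
        if entry and entry[0] > time.monotonic():
            return entry[1]          # redundant request: answer locally
        response = self._fetch(url)  # cache miss: go to the service
        self._cache[url] = (time.monotonic() + self._ttl, response)
        return response

calls = []
client = CachingClient(lambda url: calls.append(url) or f"body of {url}")
client.get("/report")
client.get("/report")  # served from cache; the service is not contacted
```

The same structure, positioned at a network boundary or inside the service, trades latency savings for bandwidth or compute savings as described above.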
The Web's principle of least power and code on demand
Now that we have moved unnecessary storage requirements to the edge of the
network and reduced network bandwidth to a minimum, the obvious next step is
to try and reduce our service-side compute requirements. The Web offers its
standard approach of the
principle of least power.
This principle essentially says that if you provide information instead of a
program to run, consumers of your service will understand the content and
be able to process it in useful and novel ways. The compute implication of this
is that you will often be able to serve a static or pre-cached document to your
consumers with practically zero compute overhead. The service consumer can
accept the document, understand it, and perform whatever processing it
requires.
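A tiny hypothetical sketch of the principle in Python (the document contents and consumer functions are invented for illustration): the service serves a plain data document at near-zero compute cost, and different consumers process the same document in ways the service never anticipated.

```python
import json

# Hypothetical sketch of the principle of least power: the service
# serves a plain data document (cheap, even cacheable as a static
# file), and consumers apply their own compute to it.

DOCUMENT = json.dumps({"prices": [{"item": "apple", "cents": 50},
                                  {"item": "pear", "cents": 80}]})

def consumer_total(doc: str) -> int:
    # One consumer sums the prices...
    return sum(p["cents"] for p in json.loads(doc)["prices"])

def consumer_cheapest(doc: str) -> str:
    # ...another finds the cheapest item. The service anticipated
    # neither use; serving data rather than a program made both possible.
    return min(json.loads(doc)["prices"], key=lambda p: p["cents"])["item"]
```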
REST adds the concept of code-on-demand. While something of an
anti-principle-of-least-power, it serves more or less the same purpose as far
as scalability is concerned: It allows the service to push compute power
requirements out to the edge of the network. Instead of actually executing
the service composition, a BPEL engine could simply return the BPEL and let
the consumer execute it. Hell, it could happily drop the BPEL processor itself
into a virtual machine space offered by the service consumer and run it from
there. So long as there is nothing security-sensitive or consistency-sensitive
in the execution you have
just saved yourself significant compute resources over the total set of
capability invocations on the service. If you are lucky, the files will already
be cached when the consumer attempts to invoke your capability and the request
won't touch the service at all.
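The code-on-demand idea can be sketched in miniature. This is a hypothetical Python illustration (a real system would serve JavaScript or similar to a sandboxed runtime; the served code and function names here are invented): the service returns a small program instead of executing it, and the consumer spends its own compute running it at the edge.

```python
# Hypothetical code-on-demand sketch: the service returns code rather
# than a result; the consumer executes it locally with its own data.

SERVED_CODE = """
def summarise(records):
    return {"count": len(records), "total": sum(records)}
"""

def consumer_run(code: str, records):
    # The consumer executes the served code in its own process.
    # (A real system would sandbox this; none of that is shown here.)
    namespace = {}
    exec(code, namespace)
    return namespace["summarise"](records)

result = consumer_run(SERVED_CODE, [1, 2, 3])  # computed at the edge
```

If the served code is cacheable, repeat invocations need not touch the service at all, as the paragraph above notes.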
The Web's use of applets, JavaScript, HTML, and pretty much everything
else it can or does serve up demonstrates how compute resources can be delegated
out to browsers and other service consumers in order to keep services doing
what they should be doing: Ensuring that the right information and the right
processing is going on without necessarily doing the hard work themselves.
Conclusion
Between the offload of storage space offered by the REST stateless
constraint and the offload of compute resources offered by code on demand
and the principle of least power, REST significantly alters the balance of
resource usage between services near the core of the network and service
consumers nearer to the edge of the network. Service consumers place no
demands on the service's bandwidth, CPU, or storage except when they have requests
outstanding. Services are able to control the rate at which they process
requests, and the network itself controls the bandwidth that can be consumed
by requests and responses. Caching ensures this approach is
feasible in most circumstances for most applications.
If you are considering investing in additional
hardware scaling mechanisms, make sure you also consider whether applying these
architectural constraints would make a difference to the scalability of
your services.
Benjamin