One of the most touted benefits of the Web and its underlying
REST architectural style is the ability to optimise bandwidth and
response times through caching. This primarily affects GET
requests to resources, which can be intercepted by intermediaries
that understand the interactions they see passing through them.
A common transfer protocol that conforms to REST's constraints
is essential to making this performance optimisation easy to
implement.
REST's official constraints are as follows:
- Client-Server,
- Statelessness,
- Cache,
- Uniform Interface,
- Layered System,
- Code-on-demand, and less explicitly:
- Hyperlinking
Many of these constraints are second nature to users of HTTP, leading
to strange conversations about how to make a particular interaction
"more RESTful" when the constraints of REST are already met.
Client/Server
The client/server constraint requires that each component is either
a client or server, but not both.... [[expand]]
Roy Fielding has since talked publicly about how relaxing the client/server
constraint is not particularly harmful to a REST architecture.
Stateless
Stateless communication requires that the meaning of any request made
by a client is not dependent on the sequence of prior requests. This is
a difficult concept with a number of facets, but one that is central to
highly scalable systems. The principle is that each request comes through
to a server as an independent entity that can be processed by any server
in the cluster or clusters that might handle the request. HTTP achieves
this by placing all authentication and other contextual information into
each individual request, much of it being communicated in the request's
resource identifier.
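The idea that any server in a cluster can process any request can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Request and Server classes, the token value, and the example.com URL are all hypothetical names invented for the sketch, not part of HTTP or of any library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Request:
    """A self-describing request: everything needed to process it travels with it."""
    method: str
    url: str
    headers: dict = field(default_factory=dict)


class Server:
    """Hypothetical cluster node that keeps no per-client session state."""

    def __init__(self, name, valid_tokens):
        self.name = name
        self.valid_tokens = valid_tokens  # shared credentials, not session data

    def handle(self, request):
        # Authentication is checked afresh from the request itself.
        token = request.headers.get("Authorization", "")
        if token not in self.valid_tokens:
            return 401, "authentication required"
        # The resource identifier says what is wanted; no prior request is consulted.
        return 200, f"representation of {request.url}"


tokens = {"Bearer abc123"}  # made-up credential for the sketch
node_a = Server("a", tokens)
node_b = Server("b", tokens)
request = Request("GET", "http://example.com/orders/42",
                  {"Authorization": "Bearer abc123"})

# Either node gives the same answer, whichever one the request lands on.
same_answer = node_a.handle(request) == node_b.handle(request)
```

Because neither node remembers anything about the client, a load balancer is free to send each request wherever capacity exists.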
To push the scalability concept beyond what REST requires, we might
conceive of two servers on different sides of the globe that answer
subsequent requests from the same client. The ideal is that any
communication required between these two servers to answer the second
request is either minimised or completely eliminated. This certainly
suggests minimising or eliminating the use of sessions to track
particular clients as they navigate through a series of resources.
Where state is required to be stored on the server side, REST suggests
that it be addressable as a specific resource. For example, a particular
user's shopping cart should be something the client can navigate to.
A strictly stateless design would keep the entire shopping cart on the
client side and resubmit it with every request that makes use of the
shopping cart information. However, there are benefits to maintaining
the cart on the server side. If a server-side cart is used, clearly
some communication needs to occur between the server that handles a
request to update the cart and the server that handles a subsequent
request to view the cart.
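A server-side cart that is still addressable as a resource might look something like the following sketch. The CartStore class stands in for whatever mechanism the servers actually use to communicate (a database, a replicated cache, and so on); it, the CartServer class, and the cart URL are assumptions made for illustration only.

```python
class CartStore:
    """Hypothetical shared store standing in for inter-server communication."""

    def __init__(self):
        self._carts = {}

    def put(self, url, items):
        self._carts[url] = list(items)

    def get(self, url):
        return self._carts.get(url)


class CartServer:
    """Hypothetical node; the cart lives at a URL, not in a per-client session."""

    def __init__(self, store):
        self.store = store

    def handle(self, method, url, body=None):
        if method == "PUT":
            self.store.put(url, body or [])
            return 204, None
        if method == "GET":
            items = self.store.get(url)
            if items is None:
                return 404, None
            return 200, items
        return 405, None


store = CartStore()
update_node = CartServer(store)
view_node = CartServer(store)

cart_url = "http://example.com/carts/alice"  # made-up identifier
update_node.handle("PUT", cart_url, ["book", "pen"])
status, items = view_node.handle("GET", cart_url)
```

The state lives on the server side, but the client reaches it by navigating to a URL, so the node that serves the view does not need to be the node that handled the update.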
Explicit Caching
The next constraint of REST is that of explicit caching. A large
client-intensive system such as the Web involves a great number of
data fetching operations. These operations can be optimised by
providing explicit cache guidance. Many Internet Service Providers
(ISPs) are no longer providing caches for data retrieved from
resources. As the cost of data falls, running such caches is often
becoming uneconomic.
However, caching is on the rise in other areas. Client caches are
particularly important in improving interactivity as users navigate
through web sites. Likewise, caching is becoming more important in
the clusters of large web sites. Edge networks are also springing
up to efficiently move data from a web site out closer to geographically
distributed clients. Despite the apparent inefficiencies of a text-based
protocol like HTTP, its caching model achieves significant
improvements in both bandwidth and latency.
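To make "explicit cache guidance" concrete, here is a deliberately simplified sketch of a cache that honours the max-age directive of the Cache-Control response header. Real HTTP caching involves far more (validators, Vary, private and no-cache directives, the Age header), none of which is modelled here. The SimpleCache class, the fetch callable, and the injected clock are hypothetical names chosen for the example.

```python
import re


def max_age_seconds(cache_control):
    """Return max-age from a Cache-Control value, or None if absent or no-store is set."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives:
        return None
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    return None


class SimpleCache:
    """Hypothetical cache: reuses a stored body until its max-age has elapsed."""

    def __init__(self, fetch, clock):
        self.fetch = fetch    # callable: url -> (headers dict, body)
        self.clock = clock    # callable: () -> current time in seconds
        self.entries = {}     # url -> (expiry time, body)

    def get(self, url):
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and now < entry[0]:
            return entry[1]   # still fresh: no trip to the origin
        headers, body = self.fetch(url)
        age = max_age_seconds(headers.get("Cache-Control", ""))
        if age:               # only store when the origin allows it
            self.entries[url] = (now + age, body)
        return body
```

The origin server stays in control: it states how long a response may be reused, and any cache along the path can apply that rule without knowing anything else about the resource.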
Uniform Interface
The Uniform Interface constraint of REST is both central to its
success, and somewhat overstated. The principle is that every message
sent throughout the overall system should be understandable to
components that may handle the message along the way. The first of
these components is typically an HTTP library within the client program.
Subsequently, the request can flow through a series of proxies and
other intermediaries before reaching the origin server. The origin
server may itself be constructed as an HTTP parsing or handling library
plus forwarding rules until the request finally arrives at a piece of
code that fulfils the request. A similar path is taken for the returned
response.
I say the Uniform Interface constraint is central because it supports
the use of caching proxies, authentication servers, and a wide range of
other components along the way. Having a basic level of understanding
of the message is what makes all of this possible. This contrasts
with the use of bespoke interfaces across a CORBA or Web-Services
software stack. There, intermediaries don't typically know what a
particular message means, and therefore can't do anything very clever
with it.
Applied to its fullest extent, the constraint gives us a system that has both
standard methods for moving data around and a standard set of data formats
that all components understand. Consider the human Web: almost any browser
can access almost any site without considering the version of the interface
it is trying to access, which method it should invoke, or what type
of response it will get back. This is achieved by using a data format that has
a fairly low value for machine consumers, but can be understood by human
users.
At the same time, the concept that every message is understood in
its entirety by every component in the system is certainly not correct
for a machine-to-machine or "semantic" web. It isn't always going to be
appropriate for a banking client to access a search engine resource,
nor is the client likely to understand what to do with any response
it might get back.
In enterprise practice, each resource will understand a subset of the
total set of data formats in use across the system and a subset of the
methods. At the same time, the duck typing approach used by HTTP means
that error responses can be generated for requests that are not understood.
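A rough sketch of how a resource that understands only a subset of methods and data formats can still answer every request in a uniform way is shown below. The status codes are the standard HTTP ones (405 Method Not Allowed, 415 Unsupported Media Type); the Resource class and the account example are hypothetical.

```python
class Resource:
    """Hypothetical resource supporting a subset of methods and media types."""

    def __init__(self, methods, media_types):
        self.methods = set(methods)
        self.media_types = set(media_types)

    def check(self, method, content_type=None):
        """Return (status, headers) saying whether the request is understood."""
        if method not in self.methods:
            # Tell the client which methods this resource does support.
            return 405, {"Allow": ", ".join(sorted(self.methods))}
        if content_type is not None and content_type not in self.media_types:
            return 415, {}
        return 200, {}


account = Resource(methods={"GET", "PUT"}, media_types={"application/json"})
```

For example, account.check("DELETE") yields a 405 with an Allow header listing GET and PUT, and account.check("PUT", "text/csv") yields a 415. The client learns that its request was not understood through the same uniform vocabulary it would use with any other resource.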
Layering
Another of REST's constraints is layering. Again, this is pretty obvious
to anyone who has set up a proxy server. However, it is often forgotten
in bespoke interfaces. The central requirement of layering is that the
whole request is capable of being passed from one component to another
without changing its meaning or having its identifiers truncated. For
example, the domain name is not removed from a URL just because the
message has already been sent to a server responsible for the domain.
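The layering requirement can be illustrated with a toy chain of proxies that forward the whole request, full URL included. Everything here is a hypothetical sketch: requests are plain dictionaries, and the "via" key only loosely mirrors the role of HTTP's Via header.

```python
from urllib.parse import urlsplit


def origin(request):
    """Hypothetical origin server: it still sees the complete URL."""
    parts = urlsplit(request["url"])
    return f"{parts.hostname} served {parts.path}"


def make_proxy(name, next_hop):
    """Build a layer that records itself and forwards the request otherwise untouched."""
    def proxy(request):
        forwarded = dict(request)                      # copy; the URL is not rewritten
        forwarded["via"] = request.get("via", []) + [name]
        return next_hop(forwarded)
    return proxy


request = {"method": "GET", "url": "http://example.com/reports/2024"}
chain = make_proxy("edge", make_proxy("cluster", origin))

direct = origin(request)
layered = chain(request)
```

Since no layer truncates the identifier, the response is the same whether the request arrives directly or through any number of intermediaries, and layers can be added or removed without the client or origin changing.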
Code on Demand
Code on demand is described as an optional constraint. Essentially,
it means that servers should be able to deploy additional capabilities
into clients on the fly. For example, the server could deploy some
JavaScript on a web page that describes how to include parts of the
page that might be missing. A newer client that supports the inclusion
mechanism natively may safely ignore the JavaScript.
Hyperlinking
The final and somewhat implicit constraint of REST is hyperlinking.
It is contained in Roy Fielding's esoteric statement, "Hypermedia as the
engine of application state". Once a uniform interface is established
it stops making sense to differentiate between different types of
objects. It doesn't make sense to have separate identifier schemes.
Instead, the single scheme of the Uniform Resource Locator (URL) is
used. The URL can be used to manage and contain some of the complexity
associated with other identification schemes. Parsers typically do not
interpret URLs as data, or construct them from parts. A URL is typically
treated as a definite whole.
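One way to picture a client that treats URLs as whole values is the sketch below. It follows named links found in each representation and never assembles a URL from knowledge of the server's layout. The in-memory WEB dictionary, the "links" structure, and the relation names are assumptions invented for the example; only urljoin, which performs standard relative-reference resolution, comes from Python's standard library.

```python
from urllib.parse import urljoin

# Hypothetical in-memory "web": URL -> representation carrying named links.
WEB = {
    "http://example.com/": {"links": {"orders": "/orders/"}},
    "http://example.com/orders/": {"links": {"latest": "42"}},
    "http://example.com/orders/42": {"total": 99, "links": {}},
}


def navigate(entry_url, rels, fetch=WEB.__getitem__):
    """Start at entry_url and follow the named link relations in order."""
    url = entry_url
    for rel in rels:
        representation = fetch(url)
        # The link is taken as given; only standard resolution against the
        # current URL is applied, never bespoke string construction.
        url = urljoin(url, representation["links"][rel])
    return url, fetch(url)


final_url, order = navigate("http://example.com/", ["orders", "latest"])
```

The client knows one entry point and the link relation names. If the server later moves its orders elsewhere, only the links in its representations need to change.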
Conclusion
You can read more about these constraints in Roy Fielding's doctoral
dissertation, the document that defines the REST architectural
style. Some of these constraints have also made their way across the wall
into Web-Services-based SOA systems.
In the end it is difficult to tease performance optimisation, network
effects, and evolvability apart to prioritise them in your architecture.
They are all crucial as the scale of your architecture increases.
These issues have all been faced before by the Web, and in REST we have a
valuable roadmap for applying the features of the Web to our own
architectures.