The Web Application Description Language (WADL) has been touted as a possible
description language for REST web services. So what is a REST Description
Language, and does this one hit the mark?
The Uniform Interface Perspective
I have long been a fan, user, and producer of code generation tools. When
I started with my current employer some seven or eight years ago, one of my
first creations was a simple language for defining serializable objects that
was easier to process than C++. I'm not sure I would do the same thing now, but
I have gone on to use code generation and little languages to define parsers
and all manner of convenient structures. It can be a great way to reduce the
amount of error-prone rote coding that developers do and replace it with
a simplified declarative model.
I say that I wouldn't do the serialisation of an object the same way any
more. That's because I think there is a tension between model-driven approaches
such as code generation and a "less code" approach. Less code approaches use
clever design or architecture to reduce the amount of rote code a developer
writes. Instead of developing a little language and generating code, we can
often replace the code that would have been generated with simpler code. In
some cases we can eliminate a concept entirely. In general, I prefer a
"less code" approach over a model-driven approach. In practice, both are used
and both are useful.
One of the neat things about REST architecture is that a whole class of
generated code disappears. SOA assumes that we will keep inventing new
protocols instead of reusing the ones we have. To this end, it introduces a
little language in the form of an IDL file definition and includes tools to
generate both client and server code from IDL instances. In contrast, REST
fixes the set of methods in its protocols. By using clever architecture, the
code we would have generated for a client or server stub can be eliminated.
In a true REST architecture, both the set of methods associated with the
protocol (eg GET, PUT, DELETE) and the set of content types transferred
(eg HTML, Atom, JPEG) are slow-moving targets compared to the rate at which
code is written to exploit these protocols. Instead of being generated, the
code written to handle both content and content transfer interactions could be
written by hand. Content types are the most likely targets to be fast-moving
and are probably best handled using tools that map between native objects and
well-defined schemas. Data mapping tools are an area of interest for the W3C.
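The data-mapping role can be sketched in a few lines. This is a minimal illustration only, assuming JSON as the agreed content type and an invented StockQuote schema; a real system would map against well-defined, independently versioned schemas:

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical native object. The schema (field names and types) is the
# slow-moving, well-defined part of the uniform interface.
@dataclass
class StockQuote:
    symbol: str
    price: float

def to_wire(quote: StockQuote) -> str:
    """Map the native object onto the agreed content type (JSON here)."""
    return json.dumps(asdict(quote))

def from_wire(doc: str) -> StockQuote:
    """Map a received representation back into a native object."""
    data = json.loads(doc)
    return StockQuote(symbol=data["symbol"], price=float(data["price"]))
```

The point is that this mapping code is the part worth tooling: the transfer protocol around it stays fixed and can be written once by hand.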
So does this leave the WADL little language out in the cold? Is there simply
no point to it?
I think the answer depends on a number of sensitive variables, chiefly where
you are on the curve from traditional SOA to REST
nirvana. It is likely that within a single organisation you will have projects
at various points. In particular, it is difficult to reach any kind of nirvana
where facets of the uniform interface are in motion. This could be for a number
of reasons, the most common of which is likely to be requirements changes.
It is clear that the more change you are going through, the more tooling you
will need and the more you will benefit from it in dealing with those changes.
The main requirement on a description language that suits the uniform
interface as a whole is that it be good at data mapping. However, the language
that suits the interface as a whole may not be the same one that suits each
specific perspective within the architecture.
The Server Perspective
Even if you are right at the top of the nirvana curve with a well-defined
uniform interface, you will need some kind of service description document.
Interface control in the REST world does not end with the Uniform Interface.
It is important to be able to concisely describe the set of URLs a service
provides, the kinds of interactions that it is valid to have with them, and
the types of content that are viable to transfer in these interactions. It is
essential that this set be well-understood by developers and agreed at all
appropriate levels in the development group management hierarchy.
Such a document doesn't work unless it is closely knit to code. It should
be trivial from a configuration management perspective to argue that the agreed
interface has been implemented as specified. This is simplest when code
generated from the interface is incorporated into the software to be built.
The argument should run that the agreed version of the interface generates a
class or set of classes that the developer wrote code against. The compiler
checks that the developer implemented all of the functions, so the interface
must be fully implemented.
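The shape of that argument can be sketched as follows. This is a hypothetical example of what generated server-side code might look like, with invented class and method names; the language itself then enforces that the agreed interface is fully implemented:

```python
from abc import ABC, abstractmethod

# Hypothetical output of a generator run over the agreed interface
# document: one abstract method per agreed interaction with the resource.
class FanSpeedResource(ABC):
    @abstractmethod
    def get(self) -> str:
        """Return the current fan speed as text/plain."""

    @abstractmethod
    def put(self, body: str) -> None:
        """Set a new target fan speed."""

# Hand-written implementation against the generated class.
class FanSpeedImpl(FanSpeedResource):
    def __init__(self) -> None:
        self._speed = "0"

    def get(self) -> str:
        return self._speed

    def put(self, body: str) -> None:
        self._speed = body
```

An implementation that omits one of the agreed methods cannot even be instantiated, which is the compiler-style check the configuration management argument relies on.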
The tests on the specification should be:
- Does it capture the complete set and meaning of resources, including those
that are constructed from dynamic or configuration data and including any query
parts of URLs?
- Does it capture the set of interactions that can be had with those
resources, eg GET, PUT and DELETE?
- Does it capture the high-level semantic meaning of each interaction,
eg PUT to the fan speed sector resource sets the new target fan speed?
- Does it capture the set of content types that can be exchanged in
interactions with the resource, eg text/plain and application/calendar+xml?
- Does it defer the actual definition of interactions and content types out
to other documents, or does it try to take on the problem of defining the whole
uniform interface in one file? The former is a valid and useful approach. The
latter could easily mislead us into anti-architectural practice.
I admit this point is a struggle for me. If we make use of REST's inherent
less-code capability we don't need to generate any code. We could just define
a uniform interface class for each resource to implement, and allow it to
register in a namespace map so that requests are routed correctly to each
resource object. This would result in less code overall, but could also
disperse the responsibility for implementing the specification. If we use
generated code, the responsibility could be centralised at the cost of more
code overall.
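The less-code alternative can be sketched in a few lines, with invented names throughout. Each resource object implements the fixed uniform interface and registers in a namespace map; a single hand-written dispatcher replaces all of the generated stubs:

```python
# A resource object implementing the fixed method set by hand.
class FanSpeed:
    def __init__(self) -> None:
        self.speed = "0"

    def get(self):
        return self.speed

    def put(self, body):
        self.speed = body

# The namespace map: URL path -> resource object.
namespace = {}

def register(path, resource):
    namespace[path] = resource

def dispatch(method, path, body=None):
    """Route a request to the registered resource. Because the method set
    is fixed (GET, PUT, ...), this dispatcher is written once by hand."""
    resource = namespace[path]
    handler = getattr(resource, method.lower())
    return handler(body) if body is not None else handler()
```

As the text notes, the trade-off is that responsibility for honouring the specification is now dispersed across the individual resource classes rather than centralised in generated code.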
The Client Code Perspective
To me, the client that only knows how to interact with one service is not
a very interesting one. If the code in the client is tied solely to Google, or
to Yahoo, or to eBay, or to Amazon... well... there is nothing wrong with that.
It just isn't a very interesting client. It doesn't leverage what REST's
uniform interface provides for interoperability.
The interoperable client is much more interesting. It doesn't rely on the
interface control document of a particular service, and certainly doesn't
include code that might be generated from such a document. Instead, it is
written to interact with a resource or a set of resources in particular ways.
Exactly which resources it interacts with is a matter for configuration and
online discovery.
An interoperable client might provide a trend graph for stock quotes. In
this case it would expect to be given the URL of a resource that contains its
source data in the form of a standard content type. Any resource that serves
data in the standard way can be interacted with. If the graph is able to deal
with a user-specified stock, that stock could either be specified as the URL
to the underlying data or as a simple ticker code. In the former case the
graph simply needs to fetch data from the specified URL and render it for
display. In the latter case it needs to construct the query part of a URL and
append it to the base URL it has been configured with. I have mentioned before
that I think it is necessary to standardise the query parts of URLs if we are
to support real automated clients, so that no matter which web site the client
is configured to point to, it can construct and interpret the URL correctly.
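The two configuration styles can be sketched as follows. The base URL and the "symbol" parameter name are invented stand-ins for whatever query-part convention ends up being standardised:

```python
from urllib.parse import urlencode

def data_url(configured, ticker=None):
    """Return the URL the trend-graph client should fetch.

    `configured` is either the full URL of the data resource, or a base
    URL to which a standard query part is appended for `ticker`.
    """
    if ticker is None:
        return configured  # direct URL to the underlying data
    query = urlencode({"symbol": ticker})  # assumed standard query part
    return configured + "?" + query
```

Either way, the client's code knows nothing about any particular service; which resource it talks to is purely a matter of configuration.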
Again we could look at this from an interface control perspective. It would
be nice if we could catalogue the types of resources out there in terms of
the interactions they support and with which content types. If we could
likewise catalogue clients in terms of the interactions they support with
which content types we might be able to interrogate which clients will work
with which resources in the system. This might allow us to predict whether a
particular client and server will work together, or whether further work is
required to align their communication.
Everywhere it is possible to configure a URL into a client we might attempt
to classify this slot in terms of the interactions the client expects to be
able to have with the resource. A configuration tool could validate that the
slot is configured against a compatible resource.
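The check itself is simple set arithmetic. A sketch, assuming each side is described as a map from interaction to the content types it supports (all names here are invented):

```python
def compatible(slot, resource):
    """Return True if every interaction the client slot expects is
    supported by the resource with at least one content type that both
    sides understand.

    `slot` and `resource` are dicts: method -> set of content types,
    e.g. {"GET": {"text/csv"}}.
    """
    for method, slot_types in slot.items():
        resource_types = resource.get(method, set())
        if not (slot_types & resource_types):
            return False
    return True
```

A configuration tool could run this check over every URL slot before accepting a configuration, which is the validation step described above.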
I have no doubt that something of this nature will be required in some
environments. However, it is also clear that above this basic level of
interoperability there are more important high-level questions about which
clients should be directed to interact with which resources. It doesn't make
sense and could be harmful to connect the output of a calculation of mortgage
amortization to a resource that sets the defence condition of a country's
military. Semantics must match at both the high level, and at the uniform
interface level.
Whether or not this kind of detailed ball-and-socket resource and client
cataloguing makes sense for your environment will likely depend on the number
of content types you have that mean essentially the same thing. If the number
for each is "one", then the chances that both client and resource can engage
in a compatible interaction are high whenever it is meaningful for such an
interaction to take place. If you have five or ten different ways to say the
same thing and most clients and resources implement only a small subset of
these ways... well then you are more likely to need a catalogue. If you are
considering a catalogue approach it may still be better to put your effort into
rationalising your content types and interaction types instead.
The non-catalogue interoperable client doesn't impose any new requirements
on a description language. It simply requires that it is possible to interact
in standard ways with resources and map the retrieved data back into its own
information space. A good data mapping language is all it needs.
The Client Configuration Perspective
While it should be possible to write an interoperable client without
reference to a specific service's interface control document, the same cannot
be said for its configuration. The configuration requires publication of
relevant resources in a convenient form. This form at least needs to identify
the set of resources offered by the service and the high-level semantics of
interactions with the resource. If we head down the catalogue path, it may also
be useful to know precisely what interactions and content types are supported
by the resource.
The requirements of a mass-publication format differ from those of interface
control. In particular, a mass-publication of resources is unable to refer to
placeholder fields that might be supplied by server-side configuration. Only
placeholders that refer to knowledge shared by the configuration tool and the
server can be included in the publication.
WADL
Of all these different perspectives, WADL is targeted at the interface
control of a service. I'm still thinking about whether or not I like it. I
have had a couple of half-hearted stabs at seeing whether I could use it or
not. If I were to use it, it would be to generate server-side code.
I have some specific problems with WADL. In particular, I think that it
tries to take on too much. I think that the definition of a query part of
a URL should be external to the main specification, as should the definition
of any content type. These should be standard across the architecture, rather
than be bound up solely in the WADL file. I note that content type definitions
can at least be held in external files.
I'm still thinking about if and how I would do things differently. I guess
I would try to start from the bottom:
- Define the interactions of the architecture. Version this file independently, or each interaction's file independently.
- Define the content types of the architecture. Version each file independently.
- Define the set of url query parts that can be filled out by clients
independently of a server-provided form. Version each file independently.
- Specify a service Interface Control Document (ICD) that identifies each of
the resources provided by the service. It should refer to the various aspects of
the uniform interface that the resource implements, including their versions.
I wouldn't try to specify request and response sections in the kind of freeform
way that WADL currently allows.
Version this file independently of other ICDs.
- Specify a mass-publication format. It should fill a similar role to the
ICD, but be more focused on communicating high-level semantics to
configuration tools. For example, it might have tags attached to each resource
for easy classification and filtering.
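Purely as an illustration of that layered shape, not a proposed format: an ICD entry could refer out to independently versioned interaction, content type, and query-part definitions rather than restating them inline. Every name and version number here is invented:

```python
# Illustrative shape of a layered Interface Control Document. Each
# resource refers to external, independently versioned definitions of
# the uniform-interface facets it implements.
icd = {
    "service": "hvac",
    "version": "1.2",
    "resources": [
        {
            "path": "/fan/speed",
            "interactions": [("GET", "1.0"), ("PUT", "1.0")],
            "content_types": [("text/plain", "1.0")],
        },
        {
            "path": "/fan/history",
            "interactions": [("GET", "1.0")],
            "content_types": [("text/csv", "1.1")],
            "query_parts": [("date-range", "1.0")],
        },
    ],
}
```

Because each referenced definition carries its own version, the ICD stays small and the uniform-interface facets can evolve on their own schedules.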
Conclusion
I think that discussion in the REST description language area is useful, and
could be heading in the right direction. However, I think that as with any
content type it needs to be very clearly focused to be successful. We have to
be clear as to what we want to do with a description language, and ensure that
it isn't used in ways that are anti-architectural. I'm sure we have quite a way
to go with this, and that there are benefits in having a good language in play.
Benjamin