The Web Application Description Language (WADL) has been touted as a possible description language for REST web services. So what is a REST Description Language, and does this one hit the mark?
The Uniform Interface Perspective
I have long been a fan, user, and producer of code generation tools. When I started with my current employer some seven or eight years ago, one of my first creations was a simple language, easier to process than C++, for defining serializable objects. I'm not sure I would do the same thing now, but I have gone on to use code generation and little languages to define parsers and all manner of convenient structures. It can be a great way to reduce the amount of error-prone rote coding that developers do and replace it with a simplified declarative model.
I say that I wouldn't do the serialisation of an object the same way any more. That's because I think there is a tension between model-driven approaches such as code generation and a "less code" approach. Less-code approaches use clever design or architecture to reduce the amount of rote code a developer writes. Instead of developing a little language and generating code from it, we can often replace the code that would have been generated with simpler code. In some cases we can eliminate a concept entirely. In general, I prefer a "less code" approach over a model-driven approach. In practice, both are used and both are useful.
One of the neat things about REST architecture is that a whole class of generated code disappears. SOA assumes that we will keep inventing new protocols instead of reusing the ones we have. To this end, it introduces a little language in the form of an IDL file definition and includes tools to generate both client and server code from IDL instances. In contrast, REST fixes the set of methods in its protocols. By using clever architecture, the code we would have generated for a client or server stub can be eliminated.
In a true REST architecture, both the set of methods associated with the protocol (eg GET, PUT, DELETE) and the set of content types transferred (eg HTML, Atom, JPEG) are slow-moving targets compared to the rate at which code is written to exploit these protocols. Instead of being generated, the code that handles both content and content transfer interactions can be written by hand. Content types are the facet most likely to move quickly, and are probably best handled using tools that map between native objects and well-defined schemas. Data mapping tools are an area of interest for the W3C.
So does this leave the WADL little language out in the cold? Is there simply no point to it?
I think the answer depends on where you are on the curve from traditional SOA to REST nirvana, and it is likely that within a single organisation different projects will sit at different points. In particular, it is difficult to reach any kind of nirvana while facets of the uniform interface are in motion. This can happen for a number of reasons, the most common of which is likely to be requirements change. Clearly, the more change you are going through, the more you will need and benefit from tooling to deal with it.
The main requirement on a description language that serves the uniform interface as a whole is that it be good at data mapping. However, the language that suits the architecture as a whole may not be the one that suits specific perspectives within it.
The Server Perspective
Even if you are right at the top of the nirvana curve with a well-defined uniform interface, you will need some kind of service description document. Interface control in the REST world does not end with the Uniform Interface. It is important to be able to concisely describe the set of URLs a service provides, the kinds of interactions that it is valid to have with them, and the types of content that are viable to transfer in these interactions. It is essential that this set be well-understood by developers and agreed at all appropriate levels in the development group management hierarchy.
Such a document doesn't work unless it is closely knit to the code. It should be trivial, from a configuration management perspective, to demonstrate that the agreed interface has been implemented as specified. This is simplest when code generated from the interface is incorporated into the software to be built. The argument runs: the agreed version of the interface generates a class or set of classes, the developer wrote code against those classes, and the compiler checks that the developer implemented all of the functions, so the interface must be fully implemented.
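To make the argument concrete, here is a minimal sketch of what that generated artefact might look like. Everything here is hypothetical — the resource, the method names, and the idea of generating a Python abstract base class are my own illustration. Instantiation of an incomplete implementation fails, playing the role the compiler check plays in a statically-typed language:

```python
from abc import ABC, abstractmethod

# Hypothetical output of an interface generator: one abstract method per
# agreed interaction with a /fan/speed resource.
class FanSpeedResource(ABC):
    @abstractmethod
    def get(self) -> str:
        """Return the current fan speed as text/plain."""

    @abstractmethod
    def put(self, body: str) -> None:
        """Set a new target fan speed."""

# The developer's hand-written implementation must supply every method,
# or instantiation raises TypeError -- the analogue of the compiler check.
class FanSpeedImpl(FanSpeedResource):
    def __init__(self) -> None:
        self.speed = "1200"

    def get(self) -> str:
        return self.speed

    def put(self, body: str) -> None:
        self.speed = body
```

If a developer omits `put`, the interface is demonstrably not fully implemented, and the failure is immediate rather than discovered at integration time.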
The tests on the specification should be:
- Does it capture the complete set and meaning of resources, including those that are constructed from dynamic or configuration data, and including any query parts of URLs?
- Does it capture the set of interactions that can be had with those resources, eg GET, PUT and DELETE?
- Does it capture the high-level semantic meaning of each interaction, eg PUT to the fan speed sector resource sets the new target fan speed?
- Does it capture the set of content types that can be exchanged in interactions with the resource, eg text/plain and application/calendar+xml?
- Does it defer the actual definition of interactions and content types out to other documents, or does it try to take on the problem of defining the whole uniform interface in one file? The former is a valid and useful approach. The latter could easily mislead us into anti-architectural practice.
I admit this point is a struggle for me. If we make use of REST's inherent less-code capability we don't need to generate any code. We could just define a uniform interface class for each resource to implement, and allow it to register in a namespace map so that requests are routed correctly to each resource object. This would result in less code overall, but could also disperse the responsibility for implementing the specification. If we use generated code, the responsibility could be centralised at the cost of more code overall.
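The "less code" alternative can itself be sketched in a few lines. This is only an illustration of the shape of the thing, not a real server — the class names and the dispatch convention are assumptions of mine. Each resource object implements the uniform interface directly and registers in a namespace map, and a tiny dispatcher routes requests by path and method:

```python
# Namespace map from URL path to the resource object registered there.
namespace_map = {}

class Resource:
    def register(self, path):
        namespace_map[path] = self

# A resource implements only the uniform-interface methods it supports.
class FanSpeed(Resource):
    def __init__(self):
        self.speed = "1200"
    def GET(self):
        return self.speed
    def PUT(self, body):
        self.speed = body

def dispatch(method, path, body=None):
    """Route a request to the registered resource; no generated stubs."""
    resource = namespace_map.get(path)
    if resource is None:
        return 404, None
    handler = getattr(resource, method, None)
    if handler is None:
        return 405, None  # method not part of this resource's interface
    result = handler(body) if body is not None else handler()
    return 200, result

FanSpeed().register("/fan/speed")
```

Note how the dispatcher is written once for all resources: this is the code that would otherwise have been generated, replaced by a single simpler piece of architecture. The trade-off I describe above remains — nothing here demonstrates that the registered set of resources matches the agreed specification.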
The Client Code Perspective
To me, the client that only knows how to interact with one service is not a very interesting one. If the code in the client is tied solely to Google, or to Yahoo, or to eBay, or to Amazon... well... there is nothing wrong with that. It just isn't a very interesting client. It doesn't leverage what REST's uniform interface provides for interoperability.
The interoperable client is much more interesting. It doesn't rely on the interface control document of a particular service, and certainly doesn't include code that might be generated from such a document. Instead, it is written to interact with a resource or a set of resources in particular ways. Exactly which resources it interacts with is a matter for configuration and online discovery.
An interoperable client might provide a trend graph for stock quotes. In this case it would expect to be given the URL of a resource that serves its source data in a standard content type. Any resource that serves data in the standard way can be interacted with. If the graph is able to deal with a user-specified stock, that stock could be specified either as the URL of the underlying data or as a simple ticker code. In the former case the graph simply needs to fetch data from the specified URL and render it for display. In the latter case it needs to construct the query part of a URL and append it to the base URL it has been configured with. I have mentioned before that I think it is necessary to standardise the query parts of URLs if we are to support real automated clients: no matter which web site the client is configured to point at, it should be able to construct and interpret the URL correctly.
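The two configuration styles come down to a few lines of client code. The parameter name and format value below are hypothetical — the real point is that whatever convention is chosen must be standardised, so the same client code works against any conforming site:

```python
from urllib.parse import urlencode

def data_url(config, stock=None):
    """Resolve the URL the trend graph should fetch its source data from.

    Either the full data URL was configured directly, or only a base URL
    was configured and the client constructs a standardised query part.
    """
    if stock is None:
        return config["data_url"]   # a complete URL was configured
    base = config["base_url"]       # only a base URL was configured
    return base + "?" + urlencode({"symbol": stock, "format": "csv"})
```

For example, `data_url({"base_url": "http://quotes.example.com/history"}, "ACME")` yields `http://quotes.example.com/history?symbol=ACME&format=csv` — and would do so against any site that honours the same query convention.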
Again we could look at this from an interface control perspective. It would be nice if we could catalogue the types of resources out there in terms of the interactions they support and with which content types. If we could likewise catalogue clients in terms of the interactions they support with which content types we might be able to interrogate which clients will work with which resources in the system. This might allow us to predict whether a particular client and server will work together, or whether further work is required to align their communication.
Everywhere it is possible to configure a URL into a client we might attempt to classify this slot in terms of the interactions the client expects to be able to have with the resource. A configuration tool could validate that the slot is configured against a compatible resource.
I have no doubt that something of this nature will be required in some environments. However, it is also clear that above this basic level of interoperability there are more important high-level questions about which clients should be directed to interact with which resources. It doesn't make sense and could be harmful to connect the output of a calculation of mortgage amortization to a resource that sets the defence condition of a country's military. Semantics must match at both the high level, and at the uniform interface level.
Whether or not this kind of detailed ball-and-socket cataloguing of resources and clients makes sense for your environment will likely depend on how many of your content types mean essentially the same thing. If the number for each is "one", then the chances that client and resource can engage in a compatible interaction are high whenever it is meaningful for such an interaction to take place. If you have five or ten different ways to say the same thing, and most clients and resources implement only a small subset of those ways... well, then you are more likely to need a catalogue. Even if you are considering a catalogue approach, it may still be better to put your effort into rationalising your content types and interaction types instead.
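The ball-and-socket check itself is simple once catalogue entries exist. In this sketch (the catalogue entries and content type names are invented for illustration), a client slot and a resource are compatible when they share at least one interaction and, for some shared interaction, at least one content type:

```python
def compatible(client_slot, resource):
    """Catalogue check: do the slot and resource share an interaction
    and, within a shared interaction, at least one content type?"""
    shared = set(client_slot) & set(resource)
    return any(set(client_slot[i]) & set(resource[i]) for i in shared)

# Hypothetical catalogue entries: interaction -> supported content types.
trend_graph_slot = {"GET": ["text/csv", "application/quotes+xml"]}
quote_history = {"GET": ["text/csv"], "PUT": ["text/csv"]}
pdf_archive = {"GET": ["application/pdf"]}
```

Here the trend graph slot is compatible with the quote history resource (both speak `GET` with `text/csv`) but not with the PDF archive — exactly the kind of answer a configuration tool would want before wiring a slot to a resource. As noted above, this says nothing about whether the high-level semantics match.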
The non-catalogue interoperable client doesn't impose any new requirements on a description language. It simply requires that it is possible to interact in standard ways with resources and map the retrieved data back into its own information space. A good data mapping language is all it needs.
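That data-mapping step is the whole of the client's service-specific surface. A minimal sketch, assuming a hypothetical `text/csv` representation with `date` and `close` columns — nothing about any particular service appears in the mapping:

```python
import csv
import io
from dataclasses import dataclass

@dataclass
class Quote:
    """The client's native representation of one data point."""
    date: str
    close: float

def map_quotes(csv_body: str):
    """Map a retrieved standard-content-type representation into the
    client's own information space."""
    reader = csv.DictReader(io.StringIO(csv_body))
    return [Quote(row["date"], float(row["close"])) for row in reader]
```

Any resource that serves the standard representation can feed this client; swapping services is a configuration change, not a code change.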
The Client Configuration Perspective
While it should be possible to write an interoperable client without reference to a specific service's interface control document, the same cannot be said for its configuration. The configuration requires publication of relevant resources in a convenient form. This form at least needs to identify the set of resources offered by the service and the high-level semantics of interactions with the resource. If we head down the catalogue path, it may also be useful to know precisely what interactions and content types are supported by the resource.
The requirements of a mass-publication format differ from those of interface control. In particular, a mass-publication of resources cannot refer to placeholder fields that would be supplied by server-side configuration. Only placeholders that refer to knowledge shared by the configuration tool and the server can be included in the publication.
WADL
Of all these different perspectives, WADL is targeted at the interface control of a service. I'm still deciding whether or not I like it. I have had a couple of half-hearted stabs at seeing whether I could use it. If I were to use it, it would be to generate server-side code.
I have some specific problems with WADL. In particular, I think that it tries to take on too much. I think that the definition of a query part of a URL should be external to the main specification, as should the definition of any content type. These should be standard across the architecture, rather than be bound up solely in the WADL file. I note that content type definitions can be held in external files at least.
I'm still thinking about if and how I would do things differently. I guess I would try to start from the bottom:
- Define the interactions of the architecture. Version this file independently, or each interaction's file independently.
- Define the content types of the architecture. Version each file independently.
- Define the set of URL query parts that can be filled out by clients independently of a server-provided form. Version each file independently.
- Specify a service Interface Control Document (ICD) that identifies each of the resources provided by the service. It should refer to the various aspects of the uniform interface that the resource implements, including their versions. I wouldn't try to specify request and response sections in the kind of freeform way that WADL currently allows. Version this file independently of other ICDs.
- Specify a mass-publication format. It should fill a similar role to the ICD, but be more focused on communicating high-level semantics to configuration tools. For example, it might have tags attached to each resource for easy classification and filtering.
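Under that bottom-up approach, a service ICD might reduce to little more than a set of versioned references. The fragment below is entirely hypothetical — every element and attribute name is invented for illustration, and none of it comes from WADL or any standard — but it shows the shape I have in mind: interactions, content types, and query conventions live in separately versioned files that the ICD only points at:

```xml
<!-- Hypothetical ICD sketch; all names invented for illustration. -->
<service name="hvac" icd-version="1.2">
  <resource path="/fan/speed">
    <interaction ref="interactions/get.xml" version="1.0"/>
    <interaction ref="interactions/put.xml" version="1.0"/>
    <content-type ref="content-types/text-plain.xml" version="1.0"/>
  </resource>
</service>
```

Nothing about the meaning of GET, PUT, or text/plain is defined here, which is the point: the uniform interface stays standard across the architecture rather than being bound up in any one service's file.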
Conclusion
I think that discussion in the REST description language area is useful, and could be heading in the right direction. However, I think that as with any content type it needs to be very clearly focused to be successful. We have to be clear as to what we want to do with a description language, and ensure that it isn't used in ways that are anti-architectural. I'm sure we have quite a way to go with this, and that there are benefits in having a good language in play.
Benjamin