Sound advice - blog

Tales from the homeworld

Sun, 2007-Mar-04

You are already doing REST

REST is often strongly correlated with HTTP and its verbs. It is often contrasted with SOAP or WS-* services, as if the two were opposing technologies or approaches. I take more of a middle ground. I think that you are already doing REST. The fundamental question in the development of your network architecture is not necessarily whether or not you should be doing REST, but specifically what benefits you intend to extract from the development. Let me run you through how I see this thing playing out.

Your messages are already uniform

Working from first principles with the constraints, REST says that you should exchange a constrained set of message types using standard methods and content types. So you have your IDL file or your WSDL, and you have a number of methods listed in that file. If you are using the document-oriented style of SOA your WSDL will no doubt include or refer to the definition of a set of documents. In other words, your WSDL defines the scope of your web. Everything in that web... everything that implements the WSDL either as a client or as a server... can meaningfully exchange messages. These components of your architecture can be configured together. They don't need to be coded. A human can decide to plug one to the other arbitrarily without the technology getting in the way.

But the technology is getting in the way.

Your uniform methods aren't achieving network effects

You have defined this web, this WSDL, this architecture... but it is too specific. You can only connect the two components together that you designed the interface for, or you can only connect the client apps to the server that you designed the interface for. It isn't a general mechanism for letting a client and server talk to each other, because the problems of that particular interaction are built into your web in a fundamental way that makes solving other problems difficult.

That's ok, isn't it? If I want to solve other problems I can create another WSDL. I can create another web. Right?

You can, and sometimes that is the right approach. However, you impose a cost whenever you do that. You can only plug components together if they are on the same web. You can only plug them together if they share a WSDL. Otherwise you have to code them together. Most of us have been writing code whenever we want two network components to talk to each other for so long that we assume there is no alternative. However, I come from the SCADA world and an increasing number of competent people come from the Web world. Experience in both of these worlds suggests we can do better. But how much better, exactly?

In an ideal world...

The ultimate ideal would be that, so long as two machines have the same basic data schema and any particular interaction makes sense, they can be configured to engage in that interaction rather than requiring us to write special code to make that interaction happen. However, is this practical? What is achievable in practice?

The Web sets the benchmark by defining separately how participants are identified, the set of interactions machines participate in, and the set of document types they can exchange. These three components of what make up our messaging change and evolve at different rates, so separating them is an important part of solving each of these important problems.

  1. How we identify participants in an interaction, especially
    • request and response targets
  2. What interactions are required, including
    • Request Methods
    • Response Codes
    • Headers
    • Transport Protocol
    • TCP/IP Connection direction
  3. How information is represented in these interactions, including
    • Semantics
    • Vocabulary
    • Document structure
    • Encoding (eg XML), or Serialisation (eg RDF/XML)

Whether or not you can actually achieve consensus on all of these points is a difficult question, and the answer is usually limited by non-technical issues. You really need to hit an economic sweet spot to achieve wide-scale consensus on any part of the trio. Luckily, consensus on identification and interactions is widely achieved for general-purpose problem sets. Special uses may need special technology, but URLs and HTTP go a very long way to closing out this triangle of message definition. The remaining facet is perhaps the hardest because it requires that we enumerate all of the different kinds of information that machines might need to send to each other and have everyone in the world agree to that enumeration.
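
To make the triangle concrete, here is a minimal sketch in Python of how a single request might be pulled apart into the three facets. The `Message` class, the `facets` function and the example URL are all made up for illustration; nothing here is a real library API beyond the standard library.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Message:
    """Hypothetical model of one request, split into the three facets."""
    target: str          # 1. identification: the URL of the request target
    method: str          # 2. interaction: the request method
    content_type: str    # 3. representation: the media type of the body
    body: bytes = b""


def facets(msg: Message) -> dict:
    """Group the fields of a message by the facet they belong to."""
    url = urlsplit(msg.target)
    return {
        "identification": {"scheme": url.scheme, "host": url.netloc, "path": url.path},
        "interaction": {"method": msg.method.upper()},
        "representation": {"content_type": msg.content_type, "length": len(msg.body)},
    }


# Illustrative values only; the URL is made up.
example = Message("http://example.com/feeds/1", "get", "application/atom+xml")
```

The first two groups are the ones URLs and HTTP already settle for us. Only the third, the content type, still needs an agreement of its own.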

So this is the limiting factor of the development of the Semantic Web, a web in which rich semantics are conveyed in messages that are understood by arbitrary components without having to write special code. The limiting factor is the number of kinds of information you can achieve global consensus on. However, we don't really need to have global consensus for our special problems. We only need consensus within our little web. We just need to understand the messages being exchanged in a particular enterprise, or a particular department, or a particular piece of IT infrastructure. We just need the components of our web to understand.

Closing the triangle: Content Types

So if the web relies on special understanding of this final facet, what is the point of agreeing on the first two? The answer to that, my friends, is evolution. What is special today might not be tomorrow. I can develop a format like atom to solve a specific set of problems within my web, and then promote my format to a wider audience. The more components that implement the document type, the wider my web and the bigger the network effects. The other two facets already have widespread consensus, so I can target my efforts. I can avoid reinventing solutions to how objects are named or how we interact with them. I can just focus on the format of data exchanged in those GET, PUT, and POST requests. The rest is already understood and known to work.

Now that's all well and good. The Semantic Web will evolve through solutions to specific problems being promoted until individual webs that solve these problems are joined by thousands of components operated by thousands of agencies. But... what about me? What about today's problems? Most of my document types will never leave the corporate firewall, so is there still an advantage in considering the Web's decomposition of message type?

I suggest, "yes". Whenever you integrate a new component into your network, do you need to write code? When new protocols are defined, are they easy to come to consensus on? As an integrator of components myself I find it useful to be able to fall back on the facets of message type that are widely agreed upon when new protocols are being defined. We don't have to go over all of that old ground. You and I both know what we mean when we say "HTTP GET". Now we just have to agree on specific URLs and on content types. Chances are that I have a content type in my back pocket from similar previous integrations or that I can adapt something that is in use on the wider Web. Any message exchange that could use pure Web messaging does so, and anything that needs special treatment gets as little special treatment as possible.
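
As a rough sketch of what "agreeing only on content types" can look like in code, consider a client whose only integration-specific part is a table of parsers keyed by media type. The names `PARSERS`, `media_type` and `handle_response` are hypothetical, and the two registered types are just examples of formats that might already be in your back pocket.

```python
import json
from typing import Callable, Dict

# Hypothetical registry: the only thing a new integration has to extend.
PARSERS: Dict[str, Callable[[bytes], object]] = {
    "application/json": lambda body: json.loads(body.decode("utf-8")),
    "text/plain": lambda body: body.decode("utf-8"),
}


def media_type(content_type_header: str) -> str:
    """Strip parameters such as charset and normalise case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def handle_response(content_type_header: str, body: bytes) -> object:
    """Dispatch on the declared content type; no per-service code."""
    name = media_type(content_type_header)
    parser = PARSERS.get(name)
    if parser is None:
        raise ValueError(f"no parser configured for {name}")
    return parser(body)
```

Supporting a new kind of document then means adding one entry to the table, rather than writing a new client for each service that happens to serve it.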

Certainly, after a few years of doing this kind of work it gets easier to build bridges between software components.

Vendors and Standards

Unfortunately, this sort of gradual evolution and interactions between the wider webs and your special case are not well supported by WS-*. Hrrm... this is where I find it hard to continue the essay. I really don't know WS-* well enough to make definitive statements about this. What I can do with HTTP is easily add new methods within the namespace of the original methods. I can then promote my new methods for wider use, so for example I can promote a subscription mechanism. In XML I could add extensions to atom, and if I used the atom namespace for my extensions they could eventually be adopted into atom proper without breaking legacy implementations of the extensions. Can the same be said for WS-*? Does it allow me to separate the definitions of my methods and my document types effectively for separate versioning? Do the tools support or fight these uses of WS-*?

For that matter, do the tools associated with XML encourage must-ignore semantics that allow for gradual evolution? Do they encourage the use of a single namespace of extensions, with alternative namespaces only used for separately-versioned sub-document types such as xhtml within atom? My tools do, but they are all written with the architectural properties I require in mind. Does the world and do vendors really understand this communication problem? Do they understand the importance of limiting the number of kinds of messages that exist in the world? Are they taking practical steps to make it easier to reuse existing messages and message structure than to create incompatible infrastructure? Do programmers understand the social responsibility that defining new message types places on them?
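
For readers who have not met must-ignore semantics, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The reader picks out the elements it knows and silently skips everything else, so an older reader keeps working when a document grows an extension. The `priority` element is a made-up extension placed in the atom namespace purely for illustration, and `read_entry` and `KNOWN` are hypothetical names, not part of any atom library.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# The elements this (hypothetical) reader understands today.
KNOWN = {f"{{{ATOM}}}title", f"{{{ATOM}}}id", f"{{{ATOM}}}updated"}


def read_entry(xml_text: str) -> dict:
    """Extract known child elements; unknown ones are ignored, not errors."""
    entry = ET.fromstring(xml_text)
    result = {}
    for child in entry:
        if child.tag in KNOWN:
            local_name = child.tag.split("}", 1)[1]
            result[local_name] = (child.text or "").strip()
        # Must-ignore: any other element is skipped without complaint.
    return result


# <priority> is an invented extension, not part of the atom specification.
SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>You are already doing REST</title>
  <id>urn:example:entry-1</id>
  <updated>2007-03-04T00:00:00Z</updated>
  <priority>high</priority>
</entry>"""
```

A reader that later learns about the extension only has to add its name to `KNOWN`. Documents and readers can then be upgraded independently of each other.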

Simplicity: Architecture or toolchains?

REST is sometimes bandied about as a simpler approach than WS-*, and certainly REST architectures are simpler. They have less variation in message types, and promote configuration of components rather than coding to make new interactions work. However REST only achieves this by shifting the social problem out of the software. Instead of solving the problem of how two components interact with one-off software, we have a long multi-party standardisation effort that ensures thousands can interact in this way. REST encourages the reuse of existing messages and structure, but in truth it is often easier to add a new message type simply by throwing an extra method into your WSDL. REST results in less code and simpler architectures. SOA results in more code and more complex architectures... but the difference isn't between HTTP and SOAP. It is between investment in an architecture and investment in a toolchain.

Perhaps that is the take-home, ladies and gentlemen: You can achieve better and simpler and higher-value architectures... but even when you leverage standard identification schemes and interaction models there is no silver bullet. You still need to choose or evolve or invent your document types wisely. That costs time and money, as does promoting anything you do invent to sufficient scale to achieve significant value. That effort has to be judged against the value of the network to you and your company. On the other hand, I think we are lacking the tools that make some of these problems easy to identify. I think we can make it easier to build new infrastructure based on the consensus we have already achieved. I think we can do better.

Conclusion

You are already doing REST, but are you getting the network effects that you see on the wider Web? Can a million different software components carry on perfectly sensible conversations with others that they have never met before and had no special code written for? Can you do REST better? Is it worth the extra effort, and what tooling do we need to put the evolvable Semantic Web within the reach of mere mortals?

Benjamin