Sound advice

Tales from the homeworld

Mon, 2006-May-29

Moving Towards REST - a case study

I mentioned in my last entry that I developed my own object-oriented IPC system some years ago, and have been paying my penance since. The system had some important properties for my industry, and a great deal of code was developed around it. It isn't something you can just switch off, and isn't something you can easily replace. So how is that going? What am I doing with it?

I am lucky that in my system I am working with an HMI that is quite decoupled from the server-side processes. The HMI is defined in terms of a set of paths that refer to back-end data, and that data is delivered to the HMI and updated as it changes. To service this HMI I developed two main interfaces: a query/subscribe interface and a command interface. Both work in terms of the path structure, so in a way I was already half-way to REST before I started to understand the importance of the approach. Now, I can't just introduce HTTP as a means of getting data around. HTTP is something the organisation has not yet had a lot of experience with, and concerns over how it will perform are a major factor. The main concern, though, is integration with our communications management system. This system informs clients of who they should be communicating with, and when. It tells them which of their redundant networks to use, and it tells them how long to keep trying.
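
To make that concrete, here is a rough Python sketch of the shape of those two interfaces. The names (QueryInterface, CommandInterface and so on) are mine, invented for this post, not the real system's API:

    class QueryInterface:
        """Query/subscribe side: the HMI asks for data by path and is
        notified as that data changes."""
        def __init__(self):
            self._data = {}         # path -> latest value
            self._subscribers = {}  # path -> list of callbacks

        def query(self, path):
            return self._data.get(path)

        def subscribe(self, path, callback):
            self._subscribers.setdefault(path, []).append(callback)

        def publish(self, path, value):
            # Server-side update: store the value and push it to
            # every client subscribed to this path.
            self._data[path] = value
            for callback in self._subscribers.get(path, []):
                callback(value)

    class CommandInterface:
        """Command side: the HMI directs a change at a path."""
        def command(self, path, payload):
            raise NotImplementedError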

A factor we consider very carefully in the communications architecture of our systems is how they will behave under stressful conditions. We need clients to stop communicating with dead equipment within a short period of time; however, we also expect that a horrendously loaded system will continue to perform basic functions. If you have been following the theoretical discussions I have had on this blog over the last few years you'll understand that these requirements are in conflict. If A sends a message to B, and B has not responded within time t, is B dead or just loaded? Should A fail over to B's backup, or should it keep waiting?

We solve this problem by periodically testing the state of B via a management port. If the management port fails to respond, failover is initiated. If the port continues to operate, A keeps waiting. We make sure that throughout the network no more pings are sent than are absolutely required, and we ensure that the management port always responds quickly irrespective of loading. Overall this leads to a simple picture, at least until you want to try to extend your service guarantees to some other system.
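
In code, the decision rule looks something like the following sketch. It is a simplification under assumed names, with a plain TCP connect standing in for whatever the management protocol actually does:

    import socket

    def peer_is_alive(mgmt_host, mgmt_port, timeout_s=0.5):
        # The management port answers quickly if the peer is alive,
        # even when the service itself is heavily loaded, so a short
        # timeout here is safe.
        try:
            with socket.create_connection((mgmt_host, mgmt_port),
                                          timeout=timeout_s):
                return True
        except OSError:
            return False

    def on_request_timeout(mgmt_host, mgmt_port):
        # A request timeout on its own is ambiguous: B may be dead,
        # or B may simply be loaded. The management port settles it.
        if peer_is_alive(mgmt_host, mgmt_port):
            return "keep waiting"        # loaded, not dead
        return "fail over to backup"     # dead, or unreachable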

So, for starters, other systems don't understand your protocols. If they did understand them (say you offered an HTTP interface) you would also have to add support for accessing your management interfaces. Their HTTP libraries probably won't support that. So you pretty much have to live with request timeouts. Loaded systems could lead to those timeouts expiring, and to failovers increasing system load further. Oh well.

So the first step is definitely not to jump to HTTP. Step number one is to create a model of HTTP within the type system we have drawn up. We define an interface with a call "request". It accepts a "method", "headers", and "body" parameter list with identical semantics to those of HTTP. Thus, we can achieve the first decoupling benefit of actual HTTP: we can decouple protocol and document type, and begin to define new document types without changing protocol.
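
A minimal sketch of what I mean, again in illustrative Python rather than the real type definitions:

    class Request:
        def __init__(self, method, headers=None, body=b""):
            self.method = method            # GET, PUT, POST, DELETE...
            self.headers = headers or {}    # same semantics as HTTP headers
            self.body = body                # the document being transferred

    class Response:
        def __init__(self, status, headers=None, body=b""):
            self.status = status            # HTTP status code semantics
            self.headers = headers or {}
            self.body = body

    class Resource:
        # Anything addressable by a path implements this one call.
        # The transport underneath can be our IPC today and real HTTP
        # tomorrow, without the resource noticing.
        def request(self, req):
            raise NotImplementedError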

I changed requests over our command interface to go over our mock HTTP. This means it will be straightforward in the future to plug actual HTTP into our applications, either directly or as a front-end process for other systems to access. I added an extended interface to objects that receive commands, so that they can have full access to the underlying mock or actual HTTP request if they so choose. They will be able to handle multiple content types by checking the content-type header. Since the change, our objects are not tied to our protocol. Their main concern is document type and content, as well as the request method that indicates what a client wants done with the document. We can change protocol as needed to support our ongoing requirements.
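
Building on the Resource sketch above, a command-receiving object might dispatch on content type like this. The resource and its behaviour are made up for the example:

    class SetpointResource(Resource):
        # A hypothetical command target: a client PUTs a new setpoint
        # document to this resource's path.
        def __init__(self):
            self.value = 0.0

        def request(self, req):
            if req.method != "PUT":
                return Response(405)    # method not allowed
            content_type = req.headers.get("Content-Type", "")
            if content_type == "text/plain":
                self.value = float(req.body.decode("ascii"))
                return Response(200)
            # New document types can be added here without touching
            # the protocol or the client plumbing.
            return Response(415)        # unsupported media type

    resource = SetpointResource()
    resource.request(Request("PUT", {"Content-Type": "text/plain"}, b"42.5"))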

Step two is to decouple name resolution from protocol. We had already done that effectively in our system. Messages are not routed through a central process. Connections are made from point to point. Any routing is done at the IP level only. Easy. So we connect our name system to DNS and other standard name resolution mechanisms. We start providing information through our management system not only about services under our management, but also about services under DNS management only. The intention is that over time the two systems are brought closer and closer together. One day we will have only one domain name system, and we have a little while between then and now to think about how that unified system will relate to our current communications management models.
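
The resolution step might look like the sketch below: consult the management system's directory first, and fall back to plain DNS for everything else. An ordinary dict stands in for the directory here:

    import socket

    def resolve(authority, managed_services):
        # managed_services stands in for the communications management
        # system's directory of services it is responsible for.
        if authority in managed_services:
            return managed_services[authority]
        # Anything else resolves through ordinary DNS.
        host, _, port = authority.partition(":")
        return (socket.gethostbyname(host), int(port or 80))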

Alongside these changes we begin bringing in URL capabilities, and mapping our paths onto the URL space. We look up the authority through our management system, and pass the path on to whomever we are directed to connect to. Great! We can even put DNS names in, which is especially useful when we want to direct a client to speak to localhost. Localhost does not require a management system, which is what makes IPC simpler than distributed communications: there is no hardware that can fail for one of us without affecting us both. We can direct a client to look at a service called "foo.bar", or use the same configuration file to direct it to "localhost:1234". The extended support comes for free on the client side.
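
Combined with the resolve sketch above, the client-side plumbing is short. Both of the configurations just mentioned go through exactly the same code path (the URL scheme here is invented for the example):

    from urllib.parse import urlsplit

    def dispatch(url, managed_services):
        parts = urlsplit(url)
        host, port = resolve(parts.netloc, managed_services)
        # Connect to host:port and hand the path to whatever answers.
        return (host, port, parts.path)

    # A managed service, looked up through the management system...
    dispatch("mock-http://foo.bar/plant/pump1/state",
             {"foo.bar": ("10.0.0.5", 8080)})
    # ...and a local endpoint, resolved through DNS alone.
    dispatch("mock-http://localhost:1234/plant/pump1/state", {})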

As the cumulative benefits of working within a RESTful model start to pile up, we are moving the functionality of other interfaces onto the command interface. As more functionality is exposed, more processes can get at that functionality easily without having to write extra code to do so. That is lesscode at its finest. Instead of building complex type-strict libraries for dealing with network communications, we just agree on something simple everywhere. We don't need to define a lot of types and interfaces. We just need the one. Based on an architectural decision, we have been able to get more cross-domain functionality for less work.

So, what is next? I am not a believer in making architectural changes for the sake of making them. I do not think that polishing the bed knobs is a valuable way for a software developer to spend his or her time. We must deliver functionality, and where the price of doing things right is the same as or cheaper than the price of doing things easily, we take the opportunity to make things better. We take the opportunity to make the next piece of work cheaper and easier too. Over time I hope to move more and more functionality onto the command interface. I hope to add that HTTP front-end, and perhaps integrate it into the core, in the near to medium term. I especially hope to provide simple mechanisms for our system to communicate with other systems using an architecture based on document transfer. Subscription will be in there somewhere, too.

The challenge going forward will be striking a balance between maintaining our service obligations and keeping things simple and standard to work with. The obligations offered in my industry are quite different to those offered on the web, so hard decisions will need to be made. Overall, the change from proprietary to open and from object-oriented to RESTful will make those challenges worth overcoming.

Benjamin