Sound advice - blog

Tales from the homeworld


Sun, 2007-Dec-09

Revising HTTP

Mark Nottingham is talking about an effort to revise HTTP. I have been thinking about and working with HTTP for machine-to-machine communications for a while, now. This group's charter includes the phrase "must not introduce a new version of HTTP and should not add new functionality to HTTP". If changes to HTTP were possible, my wishlist would include improvements to reliable messaging and working in a high availability environment.

Reliable Messaging

Reliable messaging is about getting your message through intact over an unreliable network. WS-* takes the approach of implementing TCP over TCP. Sequence numbers are added to messages such that they can arrive out of order or multiple times without being processed out of order or multiple times.

HTTP takes a better approach. It uses idempotent messages, so any that arrive multiple times have the same effect as having arrived only once. This is cheaper and more scalable than the WS-* approach. WS-* requires different servers to communicate their place in the sequence to each other as the stream moves from server to server. Any server-to-server communication at this level is an impediment to scalability, especially in a high-availability environment.

The thing HTTP misses out on is reliable ordering of messages. If I make requests down separate TCP/IP connections there is no way to achieve a reliable ordering. Unfortunately, if I choose to make requests down a persistent connection my guarantees are no better. In an ideal world I would be able to send a PUT to https://example.com/ with a body of "1", then a fraction of a second later decide to send a PUT with a body of "2". Without an ordering guarantee I have to wait for the first request to come back with a response.

The first problem comes with proxies. They are free to take pipelined requests from a single TCP connection and forward them on across different connections. The second problem comes with a multi-threaded server that, under extreme conditions, could acquire locks for the second request before it acquires locks for the first. If so, I could be left with a value of "1" at my URL, rather than the "2" I intended.

The whole thing gets complicated when caching is included, and PUT or DELETE requests are mixed with GET requests. A GET request might not even make it past any given proxy, so talking about reliable ordering of that request and others to the origin server is not a sensible conversation. If we talk about GET requests which we know will make it to the origin server (eg no-cache requests) then things are probably ok.
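So the practical work-around is the one hinted at above: serialise writes to the same URL by waiting for each response before sending the next. A minimal sketch in Python, assuming the requests library and the example URL and values from earlier:

  import requests

  def put_in_order(url, values):
      """Send each value as a PUT body, strictly one at a time."""
      for value in values:
          response = requests.put(url, data=value, timeout=10)
          response.raise_for_status()  # only move on once this write is acknowledged

  put_in_order("https://example.com/", ["1", "2"])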

High availability clients

I wrote a little while ago about using TCP keepalives to allow a client to check for a failed server and fail over to a non-failed server in a bounded time. This hack can be fairly effective, but is only required because HTTP must return responses in the order requests were made. A special keep-alive message that can be sent and responded to while other requests are outstanding would allow a fast client failover.

This could be achieved in a number of ways: The TCP-keepalive solution could be recognised officially, a special keepalive for HTTP with special response characteristics could be introduced, or a general mechanism for out-of-order responses could be used.

Conclusion

I have only covered a couple of pain points for me at the moment. Both have work-arounds. The issue with reliable messaging means that responses have to come back before sending another request. It probably isn't the worst thing in the world that this work-around needs to be invoked. Likewise, the TCP keepalive solution to high availability clients is an effective solution in controlled environments.

I haven't covered areas like security and caching, as there are a lot of areas that I haven't thought through sufficiently. Publish/subscribe is also a pain point of HTTP, but is a separate protocol development in its own right.

There is probably little that this group can effectively achieve. HTTP itself is difficult to do anything with, as we don't want to break backwards-compatibility with existing implementations. Perhaps the best that can be done is to describe problems like these and their work-arounds as clearly as possible.

Benjamin

Wed, 2007-Nov-28

High Availability HTTP

HTTP and its underlying protocols have some nice High Availability features built in. Many of these features come for free when load balancing or scalability improvements are made. In the enterprise our scalability needs are often smaller than a large web site, but our high availability requirements are more stringent. HTTP can be combined with part-time TCP keepalives to work effectively in this environment.

Web vs Enterprise

A large web site can depend on its users to hit reload under some failure conditions. An enterprise system with a significant number of automated clients needs more explicit guidance. The systems I work with tend to require no single point of failure, and an explicit bounded failover time. Say we have four seconds to work with: A request/response protocol needs to be able to determine within 4s whether or not it will receive a response. If a client determines that it will not receive a response, it will repeat its idempotent request to a backup server.
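As a rough sketch of that client behaviour (the URLs are made up, Python's requests library is assumed, and the request timeout stands in for the four-second budget):

  import requests

  # Hypothetical primary and backup endpoints for the same idempotent resource.
  SERVERS = ["https://primary.example.com/resource",
             "https://backup.example.com/resource"]

  def put_with_failover(payload, budget=4.0):
      """Try each server in turn, allowing roughly `budget` seconds per attempt."""
      last_error = None
      for url in SERVERS:
          try:
              response = requests.put(url, data=payload, timeout=budget)
              response.raise_for_status()
              return response
          except requests.RequestException as error:
              last_error = error  # no usable response in time: fail over to the next server
      raise last_error

The rest of this post is about doing better than that blunt request timeout.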

The mechanism HTTP provides to detect a failed server is the request timeout. Say we have a time-out of forty seconds. If 40s passes without a response the client knows that either:

  1. the server has failed, or
  2. the server is taking an unusually long time to respond.

If we tighten the failure detection window to only 4s this choice becomes more stark. A better approach would be to send heartbeats to the server while our request is outstanding. However, the server is not permitted to respond to HTTP requests out of order. The requests we might send cannot be replied to unless we drop below the HTTP protocol layer. Luckily, TCP gives us an out in this situation.

Using TCP Keepalive

The TCP Keepalive mechanism essentially sends zero bytes of traffic to the server, requiring an acknowledgement at the TCP level. This isn't enough to detect all forms of failure, but will detect anything that causes the TCP connection to terminate.

The interesting thing is that these keepalive probes don't need to add a lot of overhead. A traditional heartbeating system would be active all the time. The TCP keepalive need only be enabled while one or more requests are outstanding on the connection. It should be disabled while the connection is idle. Even when enabled, heartbeats will only be sent when:

  1. A request takes longer than the heartbeat time, and
  2. No other requests are being transmitted down the connection

This system of heartbeats really needs to be augmented by local detection on the server side of failures that the client can't detect. For this reason it may still be useful for the client to time requests out eventually. However, this then becomes a back-stop that doesn't need to have such stringent requirements on it.
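Mechanically, switching the keepalive on and off per connection is just a couple of socket options. A sketch, assuming a Linux-flavoured sockets API (the TCP_KEEP* option names and the timing values are platform-specific assumptions):

  import socket

  def set_keepalive(sock, enabled, idle=1, interval=1, probes=3):
      """Enable keepalive probes only while requests are outstanding on sock."""
      sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1 if enabled else 0)
      if enabled:
          sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)       # seconds before the first probe
          sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)  # seconds between probes
          sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)      # missed probes before failure

The idea would be to call set_keepalive(sock, True) as a request goes out and set_keepalive(sock, False) once all outstanding responses have arrived.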

Connecting quickly is still important both after a failure and while a particular server is failed. The HTTP implementation should create simultaneous TCP/IP connections whenever the DNS name associated with a URL resolves to multiple IP addresses. The first to respond should typically be the connection used to make requests.
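A sketch of that connection strategy, assuming Python's standard library; the port, timeout, and clean-up of the losing attempts are all left as assumptions:

  import socket
  from concurrent.futures import ThreadPoolExecutor, as_completed

  def connect_first(host, port=443, timeout=4.0):
      """Race TCP connections to every address the name resolves to; keep the winner."""
      addresses = {info[4][0] for info in
                   socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)}
      pool = ThreadPoolExecutor(max_workers=len(addresses))
      attempts = [pool.submit(socket.create_connection, (address, port), timeout)
                  for address in addresses]
      try:
          for attempt in as_completed(attempts):
              try:
                  return attempt.result()  # first successful connection wins
              except OSError:
                  continue  # that address failed; keep waiting on the others
          raise OSError("no address for %s accepted a connection" % host)
      finally:
          pool.shutdown(wait=False)  # losing attempts are abandoned; real code should close them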

Conclusion

It is important to note that this kind of failure detection is required even when an HA cluster is used. TCP/IP connections typically don't fail over as part of the cluster. Adding TCP keepalives that are enabled only while requests are outstanding and reconnecting quickly adds minimal overhead to achieve a 90% HA solution. This solution can be augmented on the server side with local health monitoring to complete the picture.

Benjamin

Mon, 2007-Nov-19

"Four Verbs Should be Enough for Anyone"

The classic four verbs of REST are GET, PUT, POST and DELETE. Whenever the verbs come up it is inevitable that someone will say that it is "obvious" that four verbs aren't enough. I use four basic request verbs, but I don't use POST. It isn't idempotent, so it makes efficient reliable messaging difficult or non-scalable. I'm actually not even that big a fan of DELETE, which I see as simply an idiomatic "PUT null".

The point of any messaging in a distributed environment is to transfer information from one place to another. In a high-level design document we can draw flows of data from one place to another without worrying about the details. When it comes time for the detailed implementation a few additional questions need answering:

  1. Which side should contain the configuration information relating to the data flow? This information is held on the client side, and it is the responsibility of the client to ensure the transfer is successful.
  2. Which side is the data being transferred to (Where)?
  3. Which side knows when the data needs to be transferred (When)?
  4. Is the data null?

The correct method to use can be extracted from the following table; a small code sketch after the table expresses the same decision:

Where    When     Is null   Method
Client   Client   *         GET
Server   Client   No        PUT
Server   Client   Yes       DELETE
Client   Server   *         SUBSCRIBE
Server   Server   *         None - Swap Client/Server
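The same decision, sketched as code (the parameter names are mine, and SUBSCRIBE is included even though, as noted below, nothing mainstream specifies it yet):

  def choose_method(data_to_client, client_knows_when, data_is_null):
      """Pick the request method given where the data goes and which side knows when."""
      if client_knows_when:
          if data_to_client:
              return "GET"
          return "DELETE" if data_is_null else "PUT"
      if data_to_client:
          return "SUBSCRIBE"  # no mainstream specification yet
      return None  # data flows to the server, but the server knows when: swap the roles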

SUBSCRIBE is sadly missing from popular specifications and implementations at this stage. It is a hard problem with some delicate balancing acts, and a real-time focus. Getting it to work where events are being generated at a rate faster than the network can process them is unsolved in public implementations and specifications.

Other request methods and variations on these methods exist for a number of reasons. Some are for greater efficiency (HEAD, GET if-not-modified, etc). Others are there to deal with non-REST legacy requests or requests that take information not in the request into account (MOVE, COPY, etc).

Two other things are needed after you establish which method you want to use. You need to pick the document type you are transferring, and the URL the client interacts with. The media type should be the simplest, most standard type that conveys the necessary semantics. The fewer semantics you transfer the better, as coupling is reduced. Shared semantics are both the fundamental requirement of machine to machine communication, and its downfall. The less any particular machine knows about the information passing through it, the better. If you can get away with plain text, or a bit of HTML you are laughing.

Picking the right URL is still something of an art, but bear in mind the basic principle: Whichever method you use should make sense for the URL you define. The URL should "name" some state on the server side that can be sampled or updated as an atomic transaction.

Benjamin

Sat, 2007-Nov-17

REST vs (T)SOA

I was just watching a video of Stefan Tilkov at the BeJUG SOA conference. I have seen most of this material before, but this time I wanted to comment on slide 31.

The original slide compares REST to "Technical" SOA ((T)SOA) by placing two SOA-style interface definitions beside five URLs conforming to a uniform interface. One implication that could be drawn from this diagram is that REST fundamentally changes the structure of the architecture. My view is that the change isn't fundamental. I see REST as simply tweaking the interface to achieve a specific set of properties.

Following is my diagram. Apologies for its crudeness, I don't have my regular tools at hand:

REST vs (T)SOA

Some differences to Stefan's model:

Separate domain names

In the business I am in we might use the word "subsystem" instead of "service", taking a military-style systems engineering approach. The client would also be or be part of a subsystem. It is useful to be able to define and control the interface between subsystems separately to the definition and control of interfaces within each subsystem. Stefan puts the URLs for the two services under one authority, but I use a separate authority for each service/subsystem (orders.example.com and customers.example.com). The definition of these URL-spaces would be controlled and evolve separately over time.

Safe and Idempotent methods

I use only safe and idempotent methods, meaning that I have reliable messaging built in: A client retries requests that time out. Reliable messaging is critical to automated software agents. Idempotency provides the simplest, most reliable, and most scalable approach. Note that for automated clients this may mean IDs have to be chosen on the client side. This has some obvious and non-obvious "cons".

HTTP introduces some special difficulties when it comes to reliable ordering of messages, so automated HTTP clients should ensure they don't have different PUT requests outstanding to the same URL at the same time.
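The retry loop itself is trivial precisely because the methods are idempotent. A minimal sketch, assuming Python's requests library and illustrative values for the attempt count and timeout:

  import requests

  def reliable_put(url, body, attempts=3, timeout=4.0):
      """Repeat an idempotent PUT until a response arrives or the attempts run out."""
      last_error = None
      for _ in range(attempts):
          try:
              response = requests.put(url, data=body, timeout=timeout)
              response.raise_for_status()
              return response
          except requests.Timeout as error:
              last_error = error  # no response seen: repeating an idempotent PUT is safe
      raise last_error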

Query part of the URL

I use the query part of a URL whenever I expect an automated client to insert parameters as part of a URL. I know that there is a move to do this with URI templates, but I personally view the query part of the URL and its use as a good feature. It helps highlight the part of the URL that needs special knowledge somewhere in client code. Opaque URLs can be passed around without special knowledge, but where a client constructs a URL it first needs to know how. This is especially important for automated clients who don't have a user to help them supply data to a form.

Don't supply every method

I don't provide all valid methods on every URL. Obviously, every method still gets a response in practice: if the client requests a DELETE on a URL that doesn't allow it, the request will be rejected with an appropriate error. However, I don't want to complicate the architectural description with these additional no-op methods. Nor do I want developers or architects to feel that they have to provide functions that are not required. It should always be easy to describe what you would expect a GET to the /allorders url to mean, but that doesn't mean we actually need to provide it when we don't expect any client to issue the request.

Conclusion

REST doesn't have to redraw the boundaries of your services or your subsystems. It is a technology that improves interoperability and evolvability over time. It is worth doing because of the short term and long term cost savings and synergies. It provides a shorter path to translating your high-level data-flow diagrams into working code, and should ultimately reduce your time to market and improve your business agility. That said: It needn't erode your existing investments, and from the high level isn't really a big change. In the end, the same business logic will be invoked within the same clients and services.

Benjamin

Tue, 2007-Oct-30

XML Semantic Web

Quick link to the main RESTwiki page

I'm having a go at defining some conventions suitable for developing an XML Semantic Web. The existing "non-semantic" web is already a first stage Semantic Web. We are able to exchange semantics about document structure, and enough other basic things for humans to be able to drive it. The next stage of a Semantic Web is to produce a small set of widely-accepted schemas for conveying generally-applicable information, along the lines of what Microformats are doing in HTML. Perhaps the microformats level is a genuine next step in and of itself: People taking machines along for the ride on the Web. After that, we want to start to see machines operating more autonomously over time. I see RDF as having failed to deliver in this area, with no prospect of succeeding.

I see the failure of RDF as two-fold:

  1. The number of xml namespaces in a typical RDF document adds complexity disproportionate to its benefits, and limits independent evolution and extension of schemas
  2. The fluid document structure (especially in the XML representation) makes understanding, transformation, copy-and-paste, and a number of other beneficial activities significantly more difficult than with plain XML document types.

For some time I have felt that well-defined XML document types are superior to similarly-defined RDF schemas. I have started writing up a set of conventions for well-structured XML documents. I think these conventions yield many of the benefits that RDF is designed to bring about, but also respect the lessons learned from existing Web document types.

Have at it, let me know what you think. I'll try to get back to it soon to endorse a (very) few types that I find useful in day-to-day operations.

Benjamin

Thu, 2007-Oct-18

4+1 View Software Architecture Description

A key finding of IEEE-Std-1471-2000 "Recommended Practice for Architectural Description of Software-Intensive Systems" is that architectural descriptions should be broken into views. One approach to defining which views a typical project should use is Philippe Kruchten's 4+1 view model of architecture.

For an approach to architectural description that has been around for 12 years and has been promoted by Rational, there is surprisingly little material available on the public Internet. Most is vendor-specific, and attempts to bend the description to what can be achieved in a particular tool. I am using StarUML and attempting to apply the approach with UML diagrams to a largely RESTful architecture. This article documents my developing approach and understanding.

The example architecture I'll be describing includes a browser client that sends requests through a proxy to a server. This server itself accesses data from another server.

The Logical View

This view is the main one impacted by the REST style. The diagram would be accurate but somewhat unhelpful if we were literally to describe the uniform interface as a single interface class. Instead, I have been preferring to name specific URL templates as separate interfaces:

Logical View

In this view the browser attempts to access one of the https://reports.example.com/{user}/{portfolio} URLs. These URLs all support a GET operation that returns xhtml content, which the browser understands. Ultimately, the source of data is accounts.example.com. The format at this end of the architecture is application/accounts+xml, a format that the browser doesn't necessarily understand.

Moving down from the browser through the URLs it accesses we see the proxy server that initially handles the request. In order to do so it accesses the same URL, but this time directs its requests to the resource's origin server: reports.example.com.

Reports.example.com isn't the ultimate source of data in answering this request. It issues its own request for https://accounts.example.com/{user}/{portfolio} to accounts.example.com and translates the result.

I have drawn reports.example.com and accounts.example.com separately to the web server software that supports them. The reason for this appears in the trace to the development view.

Development View

This view shows Software Configuration Items as they appear in the factory. In other words, as they appear in the development environment's Configuration Management system. Each Software Module identified in the logical view appears as a resident of one of these Configuration Items.

Configuration Items are separately-versioned entities that identify specific versions of a source tree. They also map into the deployment view: Each CI should map directly to a single installation package:

Development View

There are no dependencies shown in this example of a development view. I have been drawing dependencies in this view when the dependency is a build-time one. However, different kinds of dependencies could be added to show package dependencies. Half the fun of using tools to model these different views is to allow those tools to automatically validate consistency and tracing between the different views.

Each Software Configuration Item is deployed as part of a Hardware Configuration Item in the final system.

Physical or Deployment View

In this view we can see the Software Configuration Items deployed as part of Hardware Configuration Items. Physical connections are shown to an appropriate level, which will differ depending on whether or not the detailed hardware architecture is being maintained elsewhere.

A full trace to this view can be useful in identifying missing Software Configuration Items and Software Modules, especially those relating to configuration of specific network components.

Deployment View

An interesting feature of this physical view is that the accounts server is duplicated as a cluster. This kind of duplication is theoretically captured in the process view, but this is trickier than it may appear.

Process View

The trace through different Configuration Item views is fairly easy to capture. Configuration Items are pretty close to physical. The process view is more logical. In Philippe's original paper he shows software modules mapping onto processing threads. These threads trace to the deployment view, just as the development view does.

UML doesn't really have a way of capturing this kind of mapping. Most UML tool vendors will tell you to use sequence diagrams and the like:

View Accounts

While this helps describe a sequence of events through the logical view, it doesn't describe the redundant nature of accounts.example.com. It also fails to capture answers to a number of other questions: How do you handle flow control? How do you handle blocking, threading, connection pooling, and any number of other issues? Sometimes these spaces will be constrained by a software package you are using, or a uniform interface you are dealing with. Other times it will be important to specify these details to developers.

Scenarios or Use Case View

Scenarios capture the motivation for the architecture. In this case our motivation is pretty simple:

Scenarios View

This is the +1 view, redundant once other design decisions have been made. This might be a full use case specification, or just a bunch of bubbles. Again, this depends on whether you are maintaining a separate documentation set to cover these details or not. I haven't settled on a good way to relate this view with the other views as yet, though connecting to sequence diagrams in the process view may be a reasonable approach.

Conclusion

I think the 4+1 approach has merit, especially through the logical->development->deployment trace. However, this trace isn't unique to 4+1. It may carry its weight better if we had a better way of dealing with the process view than those provided by current tooling and theory. Including the scenarios view is an interesting approach, but normally we would want to version requirements and architecture documents separately. It might be better to leave them out of this UML model.

Benjamin

Sun, 2007-Sep-16

REST vs RPC

REST's approach to capturing semantics is different from the Remote Procedure Call (RPC) model common in SOAP-based services today. RPC places almost all semantics in the method of an operation. It can have preconditions and postconditions, service level agreements, and so forth attached to it. REST splits the semantics of each interaction into three components: the method, the document being transferred, and the context of the interaction. Most of REST theory can be boiled down to using URLs for identification, using the right transport protocol, and transporting the right data formats over it.

REST's interaction semantics can still be seen in its methods. We know from seeing a HTTP PUT request that information is being transferred from client to server. We can infer information such as cacheability from a completed HTTP GET operation. However, this is clearly not the whole interaction.

The next level of semantics is held in the document being transferred. Perhaps it is a purchase order, which can be interpreted in the same way regardless of the interaction context. Was the purchase order PUT to my service to indicate that the client wants to make a purchase, or did I GET the purchase order from a storage area? Either way, I can process the purchase order and make use of it in my software.

This leads to the final place where semantics are captured in REST's uniform interface: The context of the interaction, in particular the URL used. I could PUT (text/plain, "1") to my mortgage calculator program, and it might adjust my repayments based on a one year honeymoon rate. I could issue the same PUT (text/plain "1") to the defence condition URL at NORAD and trigger world war three.

This variability in possible impact is a good thing. It is a sign that by choosing standard interactions and content types we can make the protocol get out of the way of communication. Humans can make the decisions about which URL a client will access with its operations. Humans can make the decision about how to implement both the client and server in the interaction. Some shared understanding of the interaction must exist between the implementors of client and server in order for the communication to make sense, but the technology does not get in the way.

When you build your greenhouse abatement scheduling application, it won't just be able to turn the lights off at night. It will be able to turn off the air conditioning as well. When you build your stock market trending application it will be able to obtain information from multiple markets, and also handle commodity prices. Chances are, you'll be able to overlay seasonal weather forecast reports that use the same content types as well.

Moving the semantics out of the method might feel to some like jumping out of a plane without a parachute, but it is more like using a bungee rope. The loose coupling of REST means that applications work together without you having to plan for it quite as much, and that the overall architecture is able to evolve as changes are required.

SOA advocates: You like to tell us that you don't do RPC anymore, but what is the difference between RPC and your approach? It is interesting that the best google hit I found for this sort of information is this 2003 piece, by Hao He. The strange thing is, it sounds like REST's approach to me.

Here is another xml.com article, this time by Rich Salz. In his view the difference is that SOA separates "contract" from "validation". Is this the same point of view, or a completely different one? It can be hard to tell.

Another piece is by Yaron Goland. Again, I think it is making the same sort of point. This time that SOAP has failed to result in true SOA. Perhaps the only true SOA is one built on REST principles?

Whichever way you cut it, I think the important thing from the REST point of view is not to put too much context into the actual communications protocol. That should be agreed out of band, and be implicitly communicated through the chosen URL. Actual message structures consisting of an identifier, information about the requested interaction, and a payload should be as consistent as possible across as wide a range of problem domains as possible.

Benjamin

Thu, 2007-Aug-30

URI vs Resource

The concept of a resource is central to REST theory, but when talking about REST I rarely mention it. This may look like an oversight or a mistake, but it is deliberate. Let's walk through a sample conversation with a new developer about REST:

Developer
So tell me about this "REST" thing
REST Guy
Oh, it's great. It's so simple. It stands for representational state transfer. It's an architectural style with a bunch of constraints: Client/Server, Stateless between requests, Explicit cache control, Uniform Interface, Layering, Identification of resources, Manipulation via representations, Self-descriptive messages, Hypermedia as the engine of application state, and optional code on demand.
Developer
Wha wha wha?
REST Guy
Oh, the central tenet is resources. They're whatever you want them to be. They're only accessible via their URI and exchanged representations...
Developer
What on earth is a representation? Hang, on... I think I know what a URL is.. but...
REST Guy
But there's this whole question about whether resources with different URLs can be the "same" resource. Then there is the issue of whether you can have different representations of the same resource at different URLs, say the RSS and ATOM versions of a news feed.
Developer
OK. I'm going off to write some WSDL, now.

That conversation and the jargon around REST are a high barrier to entry. The jargon may be useful to computer scientists, but it generally doesn't help architects or developers very much. In fact, I think the whole Resource/URL/Representation issue is an incredible waste of time and bandwidth for all concerned. The vague notion of a resource independent of identification and representation frankly doesn't add anything to our understanding of architecture.

Here is the conversation I try to have with developers or junior architects:

Developer
So tell me about this "REST" thing
REST Guy
It's great. You do what you would do in your existing SOA world, except you modify the interface a bit. Each time you define a new WSDL you make your app incompatible with other apps, but if you break up that WSDL interface into smaller objects with a preexisting interface other apps can more easily interact with yours.
Developer
Wha wha wha?
REST Guy
Just like multiple methods can modify the same data, operations on different URLs can retrieve and modify overlapping data. These operations should normally be either GET or PUT, but the important thing is that there is an architecture-wide catalogue of these operations and the data formats they transfer. We call REST "REST" for "Representational State Transfer". It's just jargon for "transferring information as well-known data types using retrieval and update operations". There are a few more constraints, but for the most part they are encapsulated by restricting yourself to GET and PUT.
Developer
What on earth is a representation? Hang, on... I think I know what a URL is.. but...
REST Guy
So you get awesome interoperability. The methods and data formats are all controlled centrally, so two applications that share the same data model will very likely end up using the same data formats and methods. The result? Applications work together unexpectedly. Not only that, but you get awesome evolvability. You can deploy an app today, and without ever upgrading it you can keep it working with other applications. You can dynamically modify which URLs it interacts with using redirection. You can deal with its data formats becoming superseded through content negotiation. Method evolution is much harder, but methods are more stable and changes to the set of methods usually take backwards-compatibility into account. You get awesome visibility and control over the communications on the wire, awesome performance optimisation through caching, a great common vocabulary for talking about interfaces with other developers and architects... man... it's all upside.
Developer
OK. I'm going off to write some WSDL, now.

While you may not convince your co-workers on the first day, you need to at least meet them where they are today. You have to be able to talk about how REST affects the way they do things, now. One thing your co-workers don't care about and will likely never care about is the abstract notion of what a resource is. If they understand what a URL is, and they understand the interaction through standard data formats that REST affords there is really nothing more to understand.

Can the same resource be available at two URLs? No. There is no point distinguishing between a resource and the URL it is provided at. If you have two URLs you have two resources. These resources may be equivalent for some specific purposes, but that is not to say they are the same. If their identifiers are different, they are not the same. Can representations of the same resource be available under multiple URLs? No. They are different URLs, so they are different resources that relate to the same underlying application data.

Benjamin

Sat, 2007-Aug-25

A Services View for REST Architectures

In my last article I wrote about the importance of views in communicating architectural information to specific stakeholders. I have struggled in trying to literally apply the 4+1 model to REST architecture, but I don't see REST as the problem. I think the same issues come up when talking about Service-Oriented Architecture. I would phrase the issue as follows:

I want to draw a object-style diagram that lists the services running in my architecture, and their clients. I want to identify all of the URLs provided by these services (in URL template form). I want to know who is providing them, and who is using them. I want to know what methods are available on each URL (GET, and/or PUT, and/or DELETE). I want to know which content types are supported by each URL.

This architectural view is the distributed software architecture's equivalent of a wiring diagram. It allows me to quickly analyse whether a particular service is getting enough information to meet its requirements. It allows me to put off thinking about exactly how the services will be deployed or laid out in my source-code repositories. It lets me concentrate on the bigger picture.

So, this isn't the deployment (physical) view. I am not laying out the services on physical machines. It isn't the development view. I'm not thinking about library dependency structure. So, is it the logical view or the process view? Philippe says that the logical view is like a class diagram, so that might be right. However, the process view is supposed to show which parts of the architecture work in parallel to each other, and how they interact. That also sounds familiar. I am not necessarily thinking about how many levels of redundancy I'll provide when I cluster a particular service. Philippe says I should be showing that in the process view. However, the logical view is supposed to be customer-oriented: A functional breakdown. I'm not sure the services view is always going to meet that goal.

My approach at the moment is to treat the logical view as part of or as an extension of the requirements specification. It groups functions in a way that makes sense to the customer. The services view most closely matches up to the process view, so while I hesitate to actually call it a process view it occupies that spot on the classic 4+1 diagram. Don't get me started as to what should appear in the "+1" scenarios view.

The services view consists of an object per service, client, URL Template, and content type. Each URL Template has exactly one aggregation relationship, linking to a particular service. Clients and services may have dependency relationships on URL templates, and we would expect each URL to have at least one dependent object. Each URL template has relevant GET, PUT, and DELETE methods as explicit UML operations with one specific content type parameter for PUT and a specific return content type for each GET. GET and PUT appear as many times as necessary to cover content negotiation supported by the URLs. Normally this means at most one PUT and at most one GET. Other supported content negotiation (eg language) could be incorporated in the same way. The Uniform Interface does not appear explicitly in the model, but can be inferred from the total collection of content types and URL methods.

Building this into a UML model allows me to run various validation checks to make sure architectural constraints I care about are enforced. It also allows easy modification as requirements change or problems are discovered. Non-REST services can be incorporated in a similar way, with URLs that have less standard methods. Ad hoc protocols can also be incorporated into the diagram.
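As a rough illustration of the shape of this model, here is the same idea in plain data structures rather than UML, with one of the consistency checks I might run (all names and templates are hypothetical):

  from dataclasses import dataclass, field

  @dataclass
  class UrlTemplate:
      template: str                                  # e.g. "https://orders.example.com/order/{id}"
      owner: str                                     # the service that aggregates this template
      methods: dict = field(default_factory=dict)    # e.g. {"GET": "application/orders+xml"}
      dependents: set = field(default_factory=set)   # clients and services that depend on it

  def unused_templates(templates):
      """Consistency check: every URL template should have at least one dependent."""
      return [t.template for t in templates if not t.dependents]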

I find this view to be a useful tool in my arsenal, and explaining REST to my developers is somewhat of a non-issue. I am the architect, so obviously I need to know what I am doing. After that, all I have to do is win enough arguments and review enough documents to ensure my specification evolves sensibly and is followed.

Benjamin

Sat, 2007-Aug-25

Understanding REST

Charles Savage and Alex Bunardzic are both talking about how REST (or ROA) seems hard to understand. Charles has been talking about how we are not looking at the big picture. Alex talks about the importance and subtlety in distinguishing between resources and URLs.

I think the problem is rooted firmly in the domain of software architecture. Software architecture is something that most software developers fail at, and something that, as an industry, we are only just starting to pick apart in useful ways. The principal finding of IEEE 1471 "Recommended Practice for Architectural Description of Software-Intensive Systems" is that architecture has different stakeholders, and that each stakeholder needs to see a limited subset of the overall set of information to make appropriate decisions. What IEEE 1471 doesn't dictate is who those stakeholders will be, and what they need to know.

Roy's thesis is REST explained to computer scientists. Its principal function is to distinguish the REST style from alternative styles. Its secondary function is to derive properties of REST architecture. Its stakeholders are academics, not architects or software developers. Roy calls it an architectural style. It may be more appropriate to call it an architectural view; one that can be shared between numerous actual architectures.

Roy's view hides information about RESTful architectures not relevant to his stakeholders. It doesn't cover a 4+1-style process view that would describe the exact interactions between services and their clients. It doesn't contain a deployment view that demonstrates how services are distributed across physical machines. It doesn't contain a development view that describes how written software is laid out in directory structures under various source-code repositories. It doesn't document the APIs that developers should write their services against.

In short, there are a lot of gaps to fill in. There are a lot more views to populate before you can hand over a specification document and ask your team to go write some code. There are a lot of ways of writing those views, both conforming to and conflicting with Roy's base view... and a lot of communities and thought leaders to turn in the same direction before you have any sort of wide-spread acceptance and understanding of a common way forward.

The strategy of individuals and groups who have an interest in turning those ships should be to meet them where they are now, to solve real problems, and to fill out those additional architectural views as appropriate for those communities.

Benjamin

Sat, 2007-Aug-04

Reliable POST (reliable remote state creation)

Stefan Tilkov points to a recent work by Joe Gregorio: RESTify Day Trader

He highlights Joe's suggestion to use PUT to reliably create server-side state. In particular, to lodge a purchase order. I have been suggesting this kind of approach for a while, but I have issues with Joe's specifics. I am particularly focused on how automated REST clients and services interact, so I take things a little beyond whatever works in the browser. I have issues with Joe's POST precursor and 303 response.

I'm not sure the POST used by Joe is entirely necessary. PUT is allowed to create the resource at the request URI, so creating it using a POST in an earlier request is not strictly required. The only thing necessary is for the client to discover or construct the url of the order that it is about to submit. I suggest that either a non-cacheable GET that returns a different URL each time, or construction of the url with a client-supplied guid, will be appropriate. Which is the most appropriate will depend on the exact situation.

Preferred approach:

>> GET https://example.com/orderForm
(repeat if necessary)
<< 200 OK, form with https://example.com/order/1000 as submit element
   (non-cacheable, with a different submit url for each request)
>> PUT https://example.com/order/1000 (data)
(repeat if necessary)

Fallback approach:

>> PUT https://example.com/orders?guid=client-supplied-guid (data)
(repeat if necessary)
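As a sketch of the fallback approach in code (Python's requests and uuid modules assumed, and the URL scheme taken from the outline above): the guid fixes the order's identity, so repeating the PUT after a timeout cannot lodge a second order.

  import uuid
  import requests

  def submit_order(order_document):
      """Construct an order URL from a client-supplied guid, then PUT it with retries."""
      order_url = "https://example.com/orders?guid=%s" % uuid.uuid4()
      for _ in range(3):
          try:
              response = requests.put(order_url, data=order_document, timeout=4)
              response.raise_for_status()
              return response
          except requests.Timeout:
              continue  # same guid, same order: the repeat is safe
      raise RuntimeError("order submission did not complete")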

I also wonder about the use of 303. rfc2616 is clear:

"The new URI is not a substitute reference for the originally requested resource"

It isn't appropriate for a client to take this url and use it as the url of its order. The most it should do is retrieve the response to its PUT request using an additional GET request.

There is no need to "move" the resource to a new URL. The URL is its identity. This identity should not include the state of the order, and the identity of this resource should not change. The URL that the client issued the PUT to should be the one it continues to interact with for GET or PUT transactions with the order.

The move also puts stress on PUT's idempotency: If the client issues the request again after the move has occurred, is it going to issue an additional order? Will it get a 410 Gone response? The former is a bug, while the latter is difficult to interpret. Did the server reject the request, or was it successful? 410 is unavoidable if the resource is DELETED before the client finishes repeating its requests, however in my view it should be an exceptional rather than a common case.

Benjamin.

Sat, 2007-Jul-14

The War between REST and WS-*

David Chappell posits that the "war" between REST and WS-* is over. The evidence for this is that platforms such as .NET and Java that have traditionally had strong support for WS-* are now providing better tools for working with REST architectures.

The War

Is the war over? I think we are getting towards the end of the first battle. WS-* proponents who see no value in REST are now a minority, even if some of those who accept REST's value are only beginning to understand it (both REST, and its value). Moreover, I think that this will in the long run be seen as a battle between Object-Orientation and REST. That battle will be fought along similar lines to the battle between Structured Programming and Object-Orientation.

In the end, both Object-Orientation and Structured Programming were right for their particular use case. Object-Orientation came to encapsulate structured programming, allowing bigger and better programs to be written. We still see structured programming in the methods of our objects. The loops and conditionals are still there. However, objects carve up the state of an application into manageable and well-controlled parts.

My view is that the REST vs Object-Orientation battle will end in the same way. I believe that REST architecture will be the successful style in large information systems consisting of separately-upgradable parts. I take the existing Web as evidence of this. It is already the way large-scale software architecture works. REST accommodates the fact that different parts of this architecture are controlled by different people and agencies. It deals with very old software and very new software existing in the same architecture. It codifies a combination of human and technical factors that make large-scale machine cooperation possible.

The place for Object-Orientation

We will still use Object-Orientation at the small scale, specifically to build components of REST architecture. Behind the REST facade we will build up components that can be upgraded as a consistent whole. Object-Orientation has been an excellent tool for building up complex applications when a whole-application upgrade is possible. Like traditional relational database technology, it is a near perfect solution where the problem domain can be mapped out as a well-understood whole.

Hard-edged Object-Orientation with its crinkly domain-specific methods finds it hard to work between domains, and nearly impossible to work across heterogeneous environments where interacting with unexpected versions of unexpected applications is the norm. Like Structured Programming before it, the Object-Orientated abstraction can only take us so far. An architectural style like REST is required to build much larger systems with better information hiding.

Conclusion

To me, the "war" is over. REST wins on the big scale. Object-Orientation and RDBMS win at the small scale. The remaining battlefield is the area between these extremes. Do we create distributed technology based on Object-Oriented principles using technology like Corba or WS-*, or do we construct it along REST lines?

Like David, I see the case for both. Small well-controlled systems may benefit from taking the Object-Oriented abstraction across language, thread, process, and host boundaries. However, I see the ultimate value of this approach as limited. I think the reasons for moving to a distributed environment often relate to the benefits which REST delivers at this scale, but Object-Orientation does not.

Addendum

Mark Baker picks up on a specific point in David's article. David says that REST is fine for CRUD-like applications. Mark essentially counters with "incorporate POST into your repertoire". This is where I have to disagree. I think that any operation that is sensible to do across a network can be mapped to PUT, DELETE, GET, or SUBSCRIBE on an appropriate resource. I see the argument that POST can be used to do interesting things as flawed. It is an unsafe method with undesirable characteristics for communication over an unreliable network.

My CRUD mapping is:

CRUD     Method
C or U   PUT
R        GET
D        DELETE

I then add trigger support as SUBSCRIBE.

Mark's example is the placement of a purchase order. I would frame this request as PUT https://example.com/purchase-order/1000 my-purchase-order. This request is idempotent. My order will be placed once, no matter how many times I execute it. All that is needed is for the server to tell me which URL to use before I make my request, or for us to share a convention on how to select new URLs for particular kinds of resources. Using POST for this kind of thing also has the unfortunate effect of creating multiple ways to say the same thing, something that should be avoided in any architecture based on agreement between many individuals and groups.

In my view, the main problem with the CRUD analogy is that it implies some kind of dumb server. REST isn't like this. While your PUT interaction may do something as simple as create a file, it will be much more common for it to initiate a business process. This doesn't require a huge horizontal shift for the average developer. They end up with the same number of url+method combinations as they would have if they implemented a non-standard interface. All they have to do is replace their "do this" functions with "make your state this" PUT requests to an appropriate url. REST doesn't change what happens behind the scenes in your RDBMS or Object-Orientated implementation domain.

Benjamin

Sun, 2007-Jul-08

HTTP Response Codes

There are too many HTTP response codes.

One of the first questions asked when someone tries to develop an enterprise system around HTTP is which of these codes is important to support in client applications. The short answer is "all of them", but it comes down to which codes mean something to your application. The following table summarises my defaults for automated clients; a short code sketch of the same policy follows the key below. Drop me a line if you have comments or suggestions, <benjamincarlyle at soundadvice.id.au>. Codes are from rfc2616, HTTP/1.1:

Code  Reason Phrase  Treatment
100 Continue Indeterminate
101 Switching Protocols Indeterminate
1xx Any other 1xx code Failed
200 OK Success
201 Created Success
202 Accepted Success
203 Non-Authoritative Information Success
204 No Content Success
205 Reset Content Success
206 Partial Content Success
2xx Any other 2xx code Success
300 Multiple Choices Failed
301 Moved Permanently Indeterminate
302 Found Indeterminate
303 See Other Indeterminate
304 Not Modified Success
305 Use Proxy Indeterminate
307 Temporary Redirect Indeterminate
3xx Any other 3xx code Failed
400 Bad Request Failed
401 Unauthorized Indeterminate
402 Payment Required Failed
403 Forbidden Failed
404 Not Found Success if request was DELETE, else Failed
405 Method Not Allowed Failed
406 Not Acceptable Failed
407 Proxy Authentication Required Indeterminate
408 Request Timeout Failed
409 Conflict Failed
410 Gone Success if request was DELETE, else Failed
411 Length Required Failed
412 Precondition Failed Failed
413 Request Entity Too Large Failed
414 Request-URI Too Long Failed
415 Unsupported Media Type Failed
416 Requested Range Not Satisfiable Failed
417 Expectation Failed Failed
4xx Any other 4xx code Failed
500 Internal Server Error Failed
501 Not Implemented Failed
502 Bad Gateway Failed
503 Service Unavailable Repeat
504 Gateway Timeout Repeat if request is idempotent, else Failed
505 HTTP Version Not Supported Failed
5xx Any other 5xx code Failed

Key:

Repeat
The client SHOULD repeat this request, taking into account any delay specified in the response. If the client chooses not to repeat the request, it MUST treat the transaction as failed.
Success
Do not repeat. The client SHOULD treat this as a successful transaction, however specific codes may require more subtle interpretation in unusual environments.
Failed
Do not repeat unchanged. A new request that takes the response into account MAY be generated. The client SHOULD treat this as a failed transaction if a new request cannot be generated, however specific codes may require more subtle interpretation in unusual environments.
Indeterminate
Do not repeat unchanged, but this code MUST be understood. A new request that takes the response into account SHOULD be generated. If the client is unable to generate a new request this code MUST be treated as Failed.
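Encoded as a default policy for an automated client, the table and key boil down to something like this sketch (the structure is mine, not anything from the RFC):

  REPEAT, SUCCESS, FAILED, INDETERMINATE = "Repeat", "Success", "Failed", "Indeterminate"

  # Codes whose treatment differs from their class default.
  SPECIFIC = {
      100: INDETERMINATE, 101: INDETERMINATE,
      300: FAILED, 301: INDETERMINATE, 302: INDETERMINATE, 303: INDETERMINATE,
      304: SUCCESS, 305: INDETERMINATE, 307: INDETERMINATE,
      401: INDETERMINATE, 407: INDETERMINATE,
      503: REPEAT,
  }
  CLASS_DEFAULT = {1: FAILED, 2: SUCCESS, 3: FAILED, 4: FAILED, 5: FAILED}

  def treatment(code, method="GET", idempotent=True):
      """Map a response code to Repeat, Success, Failed, or Indeterminate."""
      if code in (404, 410):
          return SUCCESS if method == "DELETE" else FAILED
      if code == 504:
          return REPEAT if idempotent else FAILED
      return SPECIFIC.get(code, CLASS_DEFAULT.get(code // 100, FAILED))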

Redirection

Redirection is essential to the identifier evolution mechanism in the uniform interface. Consider a typical client: It consists of configuration and code that enables it to interact with resources elsewhere in the network. However, the configuration may become out of date if it is installed in a client that runs over a very long period of time. The url spaces the client is configured to interact with may be restructured. Redirection response codes allow the servers that provide a particular url space to reconfigure their clients and allow them to keep operating.

Note that the HTTP specification requires reasonable limits to be placed on redirection, and demands human confirmation in cases where redirection might cause the request to mean something different to what the client intended. The right way to deal with this is usually to allow the user to define limits to the way redirection is handled. For example, a redirection profile may be supplied that only allows redirection to urls that have the same domain name as the original url. Redirection loops also need to be explicitly defeated.
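A sketch of one such redirection profile; the hop limit and the same-host rule are assumptions standing in for whatever policy a given deployment actually needs:

  from urllib.parse import urlparse

  MAX_REDIRECTS = 5  # an assumed "reasonable limit" on chained redirections

  def may_follow(original_url, location, hops_so_far):
      """Follow a redirect only within the original host and below the hop limit."""
      if hops_so_far >= MAX_REDIRECTS:
          return False  # defeats redirection loops
      return urlparse(location).hostname == urlparse(original_url).hostname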

Some commentary:

Code  Reason Phrase  Commentary
100 Continue I'm not a fan of this code. It adds latency to valid transactions, optimising for the uncommon failure case. It adds a "multi-callback" mechanism. It returns multiple responses for a single request. I have put this kind of mechanism into place in proprietary systems I have developed. Unfortunately, this multi-callback mechanism isn't implemented in any kind of general way. My experience from building systems with multi-callbacks also suggests that they don't really work in fast-failover high availability environments without an excessively-expensive keep-alive mechanism.
101 Switching Protocols An upgrade path to WAKA or HTTP/2.0, if it ever arrives. Personally, I think we'll still have devices speaking HTTP/1.x (possibly still HTTP/1.0) in a fifty to a hundred years. Any protocol that hopes to displace HTTP/1.x has to do so on the Web to be successful. That's a huge undertaking.
200 OK Yay, success
201 Created Equivalent to 200 OK for a PUT. When returned from POST includes a useful Location field. However, POST is non-idempotent and unsafe. My preference is not to use it, especially in pure machine-to-machine architectures.
202 Accepted It is hard to know what to do with this code. It is equivalent to an SMTP-style store-and-forward. The best that an automated client can do is assume success. Perhaps a way to retrieve the real response asynchronously will be devised some day.
203 Non-Authoritative Information Only useful for GETs, and returned by a proxy.
204 No Content This is really a terrible code. It tells a HTTP response parser not to expect a body, meaning that there is really no other way to say this. It would be nice to separate out the content/no-content decision from the response code. It has the side-effect (or main purpose, depending on your perspective) of telling a web browser not to replace the current page.
205 Reset Content This is another browser-directed code. It doesn't really make sense for automated clients, except as yet another success code.
206 Partial Content Success for a partial GET, an artifact of not knowing in advance whether the server supports such a thing. This sort of thing reveals the evolution that HTTP has gone through over the years, and is a bit creaky. However, that evolution also demonstrates the essential method evolution mechanisms available in HTTP.
300 Multiple Choices Content negotiation by different URLs. Possibly not the best approach, but content negotiation is essential for an architecture that has an evolving content-type pool. 300 Multiple Choices, or the Accept header? So far Accept seems to be out in front. The main weakness of this code is probably the lack of a defined automatic negotiation mechanism. Automated clients can't really make use of this code.
301 Moved Permanently Essential identifier evolution code
302 Found More identifier tweaking
303 See Other A request in two parts: A second request is required to retrieve the result of the first. Is this a better "202 Accepted"? Perhaps the second request can return 503 Service Unavailable until the response is ready.
304 Not Modified This response code is specific to GET, and indicates that the client already has an up-to-date representation of the resource state. It isn't relevant to other methods, and is essentially another success code.
305 Use Proxy This is a potentially neat way to introduce an intermediary. Do you need to introduce a specialised authenticating proxy? Perhaps you want to introduce a proxy to validate schemas, and otherwise police service-level agreements. If clients understand this code, all of this is possible.
307 Temporary Redirect Another useful code. Not sure you are ready to commit to a permanent move? Do you need to set up a temporary service during transitional arrangements? Ensuring clients support this code allows freedom in how services are managed during these times.
400 Bad Request This is where things get boring for automated clients. A typical automated client knows what request it wants to make to which URL. If the server is unable to process the request for any but the simplest of reasons, there is nothing a typical client can do. Log the failure, and do your best to continue working. Only a human can really fix many of these problems, so distinguishing between them in a machine-readable way doesn't carry as much value as you might think. In those cases the reason phrase or enclosed entity will be of more use than the code itself.
401 Unauthorized This code is used when an automated client doesn't want to supply credentials by default. It gets the code, and retries with credentials. If that doesn't work the client can try any alternate credentials it might have, but this set is typically limited.
402 Payment Required Don't use, doesn't work. Maybe one day the whole Web will work like the Web we see on mobile phone networks, where we have a pay-per-view controlled content world. I'm sure there are interests out there who want this. From an economic perspective it seems likely that there will eventually be some level of consolidation and reorganisation of the supply relationship on the Web. However, for now different channels are in use when payment is required.
403 Forbidden "Please don't call again". An automated client can't do anything with this code, except give up.
404 Not Found Give up, client... unless you were trying to DELETE. If you were, you just might have succeeded.
405 Method Not Allowed Please try a different method... for "GET"? If the client wants to participate in a GET interaction, another interaction won't do as a substitute. This code may allow for evolution in the method-space of HTTP, so probably isn't a complete write-off. However, the set of methods in HTTP is fairly stable. It needs to be.
406 Not Acceptable I can't interact with you. Give up.
407 Proxy Authentication Required Similar to 401 Unauthorised: useful.
408 Request Timeout This code looks like 503 Service Unavailable, but indicates that the client is at fault. Should this be a "Repeat" or "Failed" response? It's hard to tell. The spec certainly allows for a repeat, but if the client continues to fail to deliver its request on time the response will continue to be returned. I'm 50/50 about this. If you couldn't get your request through the first time, the chances of ever getting it right seem less than certain.
409 Conflict "You can't send this request now, but maybe if something changes you can try again. It's not you, it's me. Can we still be friends?". Don't put up with it. Dump that server. Log it, and move on.
410 Gone Your configuration is out of date. What's a client to do, but give up? Again, log it and leave it.
411 Length Required "You aren't giving me enough information to proceed". Well? That's how I was written. Nothing you can do.
412 Precondition Failed "You asked me to do this only if that was true. Well, it wasn't.". Is it a success code because the conditional request was fully processed, or a failure because nothing changed? This code can be used to resolve conflicting updates in your environment, but your client has to know how to resolve those conflicts. That typically involves consulting a user. If you didn't ask for this code you won't get it. If you did ask for it, you'll know what to do with it.
413 Request Entity Too Large "Ha, stopped at the firewall". Log it and give up.
414 Request-URI Too Long Ditto
415 Unsupported Media Type Is this a code for content negotiation of a PUT request, or simply a refusal to accept a particular Content-Encoding header? The lack of a list of acceptable content types suggests it is only the latter. You might be able to resubmit in a way the server might accept, but your chances are slim. You and the server don't seem to be on the same page, and submitting request after request to figure out how to communicate isn't necessarily going to help.
416 Requested Range Not Satisfiable Specific to Range request header. If you didn't ask for it, you won't get it. You need to forget about partial GETs for the moment and figure out where you are at again with the server.
417 Expectation Failed The client requested a particular extension or set of extensions to the protocol that the server doesn't support. What do you do? Why did you need the extension? Are these kinds of extension valuable enough to offset the complexity involved in dealing with older servers? Besides: Must-understand is so passé.
500 Internal Server Error HTTP tries to be nice in separating out things that are the server's fault and things that are the client's fault. Unfortunately, we have already seen cases where the classification is ambiguous. Worse, whether the client or server is at fault isn't all that useful a classification for automated clients. It is more important for the client to know whether it can retry the request safely or not. Internal Server Error is no good for this. It will often have come from back-stop exception handling code in the web server, or something equally generic. Something might have happened. The whole operation might have been wholly successful, or not at all. The only thing we can figure out as a client is that reissuing the request isn't a solution. If we are extremely lucky it might actually help, but trying is just going to mask a problem and increase network traffic. This is one to give up on.
501 Not Implemented How this differs from 405 Method Not Allowed is a bit of a fine point, and the two responses should be treated the same way. This one indicates that no resource provided by this server supports the method. In the modern days of servlets this is a hard statement to make. It seems like 405 Method Not Allowed is the right answer when a method is not implemented by a particular resource. Unfortunately, 405 requires that we also list the methods that are valid to use... not that the client is likely to be able to make use of this list. Generating this list and keeping it in sync with reality is harder than you might assume in various web frameworks: Rock, hard place.
502 Bad Gateway The proxy or gateway you are talking to thinks that the next server up the line doesn't speak valid HTTP. The request was sent, but did it work? Unknown. We know that the same thing is likely to happen if we resubmit the request, so we need to give up.
503 Service Unavailable The service is temporarily down, and may provide instruction about when to retry. This can be a useful code that all clients should support.
504 Gateway Timeout This code is very similar to 502 Bad Gateway. In the 502 case our proxy didn't understand the response from the upstream server. In 504, it didn't receive a response at all. Either way, the request was sent. Either way, we don't know what happened. However, 504 models a reasonably common case of a server failing during a request. For this reason it is useful for libraries that speak HTTP on behalf of a server to generate it whenever that server times out. Retrying makes sense in this environment (a retry sketch follows this list), though care should be taken to avoid retrying forever. After all, it may be a real timeout. Perhaps the server simply can't process your request and return a response in the time allocated to it. This highlights a weakness in HTTP whereby the behaviour of a failed server can be aliased with the behaviour of a slow server. A better solution to detecting server death would be to send keep-alive messages down the connection at regular intervals ahead of the pending response. The exact solution could be a client-initiated ping while requests are outstanding, with pong messages returned before the response. Alternatively, a standard-mandated but configurable HELO message sent at a regular interval would do. TCP keepalive in one or more directions may be a way to achieve this without fundamental protocol changes. A server that is processing a long request but can still maintain ping traffic won't be confused with a dead server. If the server is stuck in some kind of tight loop but continues to manage keepalives, it is up to something on the server side to eventually kill it off and fix the problem for the client. This is what I call a "High Availability profile for HTTP", supporting individual clients with fast failover requirements.
505 HTTP Version Not Supported You no speak-a the HTTP? Give up.
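As a rough illustration of the retry behaviour discussed for 503 and 504 above, here is a minimal client sketch using Python's standard library. It is not from the original discussion: the retry limit, the default delay, and the assumption that Retry-After arrives in delta-seconds form are all my own, and only a safe GET is retried.

import time
import urllib.error
import urllib.request

def get_with_retries(url, max_retries=3, default_delay=5):
    # Retry only on 503 (honouring Retry-After where it is a plain number
    # of seconds) and on 504, with a bounded number of attempts.
    # Anything else is raised to the caller: retrying it won't help.
    for attempt in range(max_retries + 1):
        try:
            return urllib.request.urlopen(url)
        except urllib.error.HTTPError as error:
            retryable = error.code in (503, 504)
            if not retryable or attempt == max_retries:
                raise
            delay = error.headers.get("Retry-After") if error.code == 503 else None
            time.sleep(int(delay) if delay and delay.isdigit() else default_delay)

Only GET is shown because it is safe; retrying non-idempotent requests needs the extra care discussed elsewhere on this blog.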

Benjamin

Sun, 2007-Jul-01

Nouns, Verbs, Uniform Interfaces, and Mature Architecture

The verbs in a typical RESTful system are fairly simple and easy to understand. PUT, GET, and DELETE are conceptually straightforward. A client sets the state of a named resource. A client retrieves the state of a named resource. A client destroys a named resource. However, the devil is in the detail.

The Linguistic Perspective, or the Uniform Interface?

It is tempting to view REST from a linguistic perspective. We can say things like "we are trading more nouns for fewer verbs", and take the Web as an example as to why this is a good thing. However, it is difficult from a purely linguistic perspective to see exactly where the goodness comes from. Why is it better to have more nouns and fewer verbs? Where does the uniform interface fit into this? Do I have a uniform interface when I use uniform verbs, or is something more required?

I like to turn this analysis around. Instead of starting with the linguistic perspective, I start from the uniform interface perspective. Let me define the uniform interface:

We have a uniform interface when every component within an architecture can arbitrarily and successfully engage in an authorized interaction with another component in the architecture without prior planning.

The Uniform Interface and REST

In REST we divide the set of components in our architecture into clients and servers. Clients are anonymous and unaddressable in the general case. Servers contain resources that are addressable. When I say every component can talk to every other component we already have to take this with a pinch of salt. In REST we can talk about every client being able to interact with every resource.

We can add another pinch of salt: It isn't always meaningful or sensible for a particular client to talk to a particular server. They might move in different machine "social circles", and talk about different kinds of things. They may develop their own jargon and sub-culture. In fact, machines tend to follow the needs and inclinations of their human users. The fact that machines tend to cluster in sub-architectures shouldn't come as any kind of surprise. It isn't always going to be meaningful for a Web-enabled coffee machine to talk directly to a nuclear power plant's monitoring system. It isn't always going to make sense for a personal video recorder that is only interested in program listings to talk to a weather service.

This means that we can only really talk about the uniform interface in terms of any client of a sub-architecture being able to engage in authorized interactions with any resource in the same sub-architecture. Sub-architectures overlap in the same way that sub-cultures do, so this statement about the nature of uniform interfaces is a fairly fluid one. Value is achieved by participating in the largest sub-architectures, and by keeping down the overall number of sub-architectures your software participates in.

The Uniform Interface and Protocol

Now that we are looking from the uniform interface perspective, it pays to take another look at our verbs. It soon becomes apparent that "GET" is not enough. We need our client and server to agree on a lot more than that verb in order to interact. Client and server must agree on a protocol. The whole of the message fashioned by the client needs to fit with the server's mechanisms for interpreting that message. Moreover, the response that the server generates is part of the interaction. The client must be able to process the message that the server returned to it.

With the uniform interface view firmly in our minds, we can look back over the mechanisms that REST uses to decouple parts of its protocols. In a sense, it might not matter that we decouple identification from verbs and from content type. If one protocol governs a particular sub-architecture, it doesn't matter how it is composed. It only matters that the components of that sub-architecture can speak the protocol.

However, taking this perspective misses one of the key features of the REST style: It is designed to allow our protocol to evolve. Experience suggests that controlling the specification of the URI, the transport protocol (including verbs, headers, and response codes), and content types provides excellent characteristics for evolution.

Participating in sub-architecture

REST is the style for a new kind of architecture. In days of yore we might have expected that we would replace the architecture of our old system with each new system we deploy. We could afford in those days to rip out systems wholesale, and replace them with new systems. I suggest that this is no longer the case. It is increasingly important to be able to replace systems within an architecture, rather than replacing the architecture per se. Other systems need to be able to deal with evolution within their architecture. Protocols and architecture need to be well beyond SOA-class. They need to be Web-class, and Web-style.

Looking back from the uniform interface perspective lets us see the linguistic debate in a new light. Of course it is possible to introduce new verbs if we are willing to break the uniform interface. Of course this allows us to map the interactions we have within our system onto fewer resources or network-exposed objects. However, doing so greatly harms our opportunities for network effects. Our sub-architecture is reduced to the set of components that directly speak this new protocol we define, and there is reduced opportunity for servers and clients to participate in overlapping sub-architectures. The exact balance between nouns and verbs is a matter for ongoing community debate, but using custom verbs where standard ones would do is an essentially indefensible approach. Simply by introducing the necessary resources it is possible to prevent divergence from the uniform interface. It is possible to increase the value, evolvability, and overall maturity of your solution. You can always map the methods of these resources onto a smaller set of objects behind the protocol curtain, as the sketch below suggests.
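A rough sketch of that last point, with entirely invented names: several resources expose a single object behind the protocol curtain, each answering only the uniform methods, so the noun space grows without any new verbs.

class PurchaseOrder:
    # A single object behind the protocol curtain.
    def __init__(self):
        self.items = []
        self.status = "open"

order = PurchaseOrder()

# Several resources map onto that one object. Each resource speaks only
# the uniform interface; the verb set stays fixed.
resources = {
    "/orders/1": {
        "GET": lambda: {"items": order.items, "status": order.status},
        "PUT": lambda doc: (order.items.clear(), order.items.extend(doc["items"])),
    },
    "/orders/1/status": {
        "GET": lambda: order.status,
        "PUT": lambda doc: setattr(order, "status", doc),
    },
}

def handle(method, path, body=None):
    handler = resources[path][method]
    return handler(body) if method == "PUT" else handler()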

Conclusion

The balance between nouns and verbs is a matter for the community that governs a particular architecture. Going it alone is counter-productive, potentially risky, and usually unnecessary. Instead of looking to custom verbs, think more about your content types. That is where most of the real challenges of protocol evolution and maturity lie.

Benjamin

Sun, 2007-Jun-17

On ODBMS versus O/R mapping

Debate: ODBMS sometimes a better alternative to O/R Mapping?

Objects see databases as memento and object-graph storage. Databases see objects as data exposed in table rows. RDF databases see objects as data exposed in schema-constrained graphs. The private of one is the public of the other. The benefits of each conflict with the design goals of the other.

Perhaps REST is the middle ground that everyone can agree on. Objects interface easily using REST. They simply structure their mementos as standard document types. Now their state can easily be stored and retrieved. Databases interface easily using REST. They just map data to data. So the data in an object and the data in a database don't necessarily have precisely-matched schemas. They just map to the same set of document types and these document types define the O-R mapping. The document type pool can evolve over time based on Web and REST principles, meaning that tugs from one side of the interface don't necessarily pull the other side in exactly the same direction.
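A toy sketch of that interfacing, with made-up field names and JSON standing in for whatever standard document type the architecture agrees on: the object publishes its memento as a document, and the database maps the same document onto rows, without either side exposing its private schema to the other.

import json

class Customer:
    # Object side: state is published as a standard document,
    # not as the object's private field layout.
    def __init__(self, name, addresses):
        self.name = name
        self.addresses = addresses

    def to_document(self):
        return json.dumps({"name": self.name, "addresses": self.addresses})

def document_to_rows(document):
    # Database side: the same document maps onto table rows.
    data = json.loads(document)
    customer_row = (data["name"],)
    address_rows = [(data["name"], address) for address in data["addresses"]]
    return customer_row, address_rows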

If O-R mapping is the Vietnam of computer science, perhaps we should stop mapping between our object and our relational components. Perhaps we should start interfacing between them, instead.

Benjamin

Mon, 2007-Jun-11

The Web Application Description Language 20061109

The Web Application Description Language (WADL) has been touted as a possible description language for REST web services. So what is a REST Description Language, and does this one hit the mark?

The Uniform Interface Perspective

I have long been a fan, user, and producer of code generation tools. When I started with my current employer some seven or eight years ago, one of my first creations was a simple language that was easier to process than C++ to define serializable objects. I'm not sure I would do the same thing now, but I have gone on to use code generation and little languages to define parsers and all manner of convenient structures. It can be a great way to reduce the amount of error-prone rote coding that developers do and replace it with a simplified declarative model.

I say that I wouldn't do the serialisation of an object the same way any more. That's because I think there is a tension between model-driven approaches such as code generation and a "less code" approach. Less code approaches use clever design or architecture to reduce the amount of rote code a developer writes. Instead of developing a little language and generating code, we can often replace the code that would have been generated by simpler code. In some cases we can eliminate a concept entirely. In general, I prefer a "less code" approach over a model-driven approach. In practice, both are used and both are useful.

One of the neat things about REST architecture is that a whole class of generated code disappears. SOA assumes that we will keep inventing new protocols instead of reusing the ones we have. To this end, it introduces a little language in the form of an IDL file definition and includes tools to generate both client and server code from IDL instances. In contrast, REST fixes the set of methods in its protocols. By using clever architecture, the code we would have generated for a client or server stub can be eliminated.

In a true REST architecture, both the set of methods associated with the protocol (eg GET, PUT, DELETE) and the set of content types transferred (eg HTML, atom, jpeg) are slow-moving targets compared to the rate at which code is written to exploit these protocols. Instead of being generated, the code written to handle both content and content transfer interactions could be written by hand. Content types are the most likely targets to be fast-moving and are probably best handled using tools that map between native objects and well-defined schemas. Data mapping tools are an area of interest for the W3C.

So does this leave the WADL little language out in the cold? Is there simply no point to it?

I think the answer depends on a number of variables, in particular where you are on the curve from traditional SOA to REST nirvana. It is likely that within a single organisation you will have projects at various points. In particular, it is difficult to reach any kind of nirvana while facets of the uniform interface are in motion. This could be for a number of reasons, the most common of which is likely to be requirements changes. The more change you are going through, the more tooling you will need and benefit from in dealing with it.

The main requirement on a description language that suits the uniform interface as a whole is that it be good at data mapping. However, the language that suits the uniform interface as a whole may not be the one that suits specific perspectives within the architecture.

The Server Perspective

Even if you are right at the top of the nirvana curve with a well-defined uniform interface, you will need some kind of service description document. Interface control in the REST world does not end with the Uniform Interface. It is important to be able to concisely describe the set of URLs a service provides, the kinds of interactions that it is valid to have with them, and the types of content that are viable to transfer in these interactions. It is essential that this set be well-understood by developers and agreed at all appropriate levels in the development group management hierarchy.

Such a document doesn't work without being closely-knit to code. It should be trivial from a configuration management perspective to argue that the agreed interface has been implemented as specified. This is simplest when code generated from the interface is incorporated into the software to be built. The argument should run that the agreed version of the interface generates a class or set of classes that the developer wrote code against. The compiler checks that the developer implemented all of the functions, so the interface must be fully implemented.

The tests on the specification should be:

  1. Does it capture the complete set and meaning of resources, including those that are constructed from dynamic or configuration data and including any query parts of urls?
  2. Does it capture the set of interactions that can be had with those resources, eg GET, PUT and DELETE?
  3. Does it capture the high-level semantic meaning of each interaction, eg PUT to the fan speed sector resource sets the new target fan speed?
  4. Does it capture the set of content types that can be exchanged in interactions with the resource, eg text/plain and application/calendar+xml?
  5. Does it defer the actual definition of interactions and content types out to other documents, or does it try to take on the problem of defining the whole uniform interface in one file? The former is a valid and useful approach. The latter could easily mislead us into anti-architectural practice.

I admit this point is a struggle for me. If we make use of REST's inherent less-code capability we don't need to generate any code. We could just define a uniform interface class for each resource to implement, and allow it to register in a namespace map so that requests are routed correctly to each resource object. This would result in less code overall, but could also disperse the responsibility for implementing the specification. If we use generated code, the responsibility could be centralised at the cost of more code overall.
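A minimal sketch of that "less code" option, with invented resource and path names: each resource implements a uniform interface class and registers in a namespace map, so requests route to the right object without any generated code.

class Resource:
    # The uniform interface that every resource implements.
    def http_get(self):
        raise NotImplementedError
    def http_put(self, body):
        raise NotImplementedError
    def http_delete(self):
        raise NotImplementedError

class FanSpeedSetpoint(Resource):
    def __init__(self):
        self.speed = 0
    def http_get(self):
        return str(self.speed)
    def http_put(self, body):
        self.speed = int(body)

# The namespace map. Responsibility for implementing the specification is
# dispersed across whatever registers here, which is the trade-off noted above.
namespace = {"/fans/1/speed": FanSpeedSetpoint()}

def dispatch(method, path, body=None):
    resource = namespace[path]
    if method == "GET":
        return resource.http_get()
    if method == "PUT":
        return resource.http_put(body)
    if method == "DELETE":
        return resource.http_delete()
    raise ValueError("405 Method Not Allowed")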

The Client Code Perspective

To me, the client that only knows how to interact with one service is not a very interesting one. If the code in the client is tied solely to google, or to yahoo, or to ebay, or to amazon... well... there is nothing wrong with that. It just isn't a very interesting client. It doesn't leverage what REST's uniform interface provides for interoperability.

The interoperable client is much more interesting. It doesn't rely on the interface control document of a particular service, and certainly doesn't include code that might be generated from such a document. Instead, it is written to interact with a resource or a set of resources in particular ways. Exactly which resources it interacts with is a matter for configuration and online discovery.

An interoperable client might provide a trend graph for stock quotes. In this case it would expect to be given the url of a resource that contains its source data in the form of a standard content type. Any resource that serves data in the standard way can be interacted with. If the graph is able to deal with a user-specified stock, that stock could either be specified as the url to the underlying data or as a simple ticker code. In the former case the graph simply needs to fetch data from the specified URL and render it for display. In the latter case it needs to construct the query part of a URL and append it to the base URL it has been configured with. I have mentioned before that I think it is necessary to standardise query parts of urls if we are to support real automated clients, so that no matter which web site the client is configured to point to, that site will interpret the constructed URL correctly.
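A sketch of such a client, assuming for illustration that the standard query part looks like ?symbol={ticker} — the parameter name and base URL are invented, not part of any real convention.

import urllib.parse
import urllib.request

def fetch_quote_data(source, base_url="https://quotes.example.com/history"):
    # The source is either a full URL to the underlying data, or a bare
    # ticker code to be filled into the standard query part of the
    # configured base URL.
    if source.startswith(("http://", "https://")):
        url = source
    else:
        url = base_url + "?symbol=" + urllib.parse.quote(source)
    with urllib.request.urlopen(url) as response:
        return response.read()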

Again we could look at this from an interface control perspective. It would be nice if we could catalogue the types of resources out there in terms of the interactions they support and with which content types. If we could likewise catalogue clients in terms of the interactions they support with which content types we might be able to interrogate which clients will work with which resources in the system. This might allow us to predict whether a particular client and server will work together, or whether further work is required to align their communication.

Everywhere it is possible to configure a URL into a client we might attempt to classify this slot in terms of the interactions the client expects to be able to have with the resource. A configuration tool could validate that the slot is configured against a compatible resource.

I have no doubt that something of this nature will be required in some environments. However, it is also clear that above this basic level of interoperability there are more important high-level questions about which clients should be directed to interact with which resources. It doesn't make sense and could be harmful to connect the output of a calculation of mortgage amortization to a resource that sets the defence condition of a country's military. Semantics must match at both the high level, and at the uniform interface level.

Whether or not this kind of detailed ball-and-socket resource and client cataloguing makes sense for your environment will likely depend on the number of content types you have that mean essentially the same thing. If the number for each is "one" then the chances that both client and resource can engage in a compatible interaction are high whenever it is meaningful for such an interaction to take place. If you have five or ten different ways to say the same thing and most clients and resources implement only a small subset of these ways... well then you are more likely to need a catalogue. If you are considering a catalogue approach it may still be better to put your effort into rationalising your content types and interaction types instead.

The non-catalogue interoperable client doesn't impose any new requirements on a description language. It simply requires that it is possible to interact in standard ways with resources and map the retrieved data back into its own information space. A good data mapping language is all it needs.

The Client Configuration Perspective

While it should be possible to write an interoperable client without reference to a specific service's interface control document, the same cannot be said for its configuration. The configuration requires publication of relevant resources in a convenient form. This form at least needs to identify the set of resources offered by the service and the high-level semantics of interactions with the resource. If we head down the catalogue path, it may also be useful to know precisely what interactions and content types are supported by the resource.

The requirements of a mass-publication format differ from those of interface control. In particular, a mass-publication of resources is unable to refer to placeholder fields that might be supplied by server-side configuration. Only placeholders that refer to knowledge shared by the configuration tool and the server can be included in the publication.

WADL

Of all these different perspectives WADL is targeted at the interface control of a service. I'm still thinking about whether or not I like it. I have had a couple of half-hearted stabs at seeing whether I could use it or not. If I were to use it, it would be to generate server-side code.

I have some specific problems with WADL. In particular, I think that it tries to take on too much. I think that the definition of a query part of a URL should be external to the main specification, as should the definition of any content type. These should be standard across the architecture, rather than be bound up solely in the WADL file. I note that content type definitions can be held in external files at least.

I'm still thinking about if and how I would do things differently. I guess I would try to start from the bottom:

  1. Define the interactions of the architecture. Version this file independently, or each interaction's file independently.
  2. Define the content types of the architecture. Version each file independently.
  3. Define the set of url query parts that can be filled out by clients independently of a server-provided form. Version each file independently.
  4. Specify a service Interface Control Document (ICD) that identifies each of the resources provided by the service. It should refer to the various aspects of the uniform interface that the resource implements, including their versions. I wouldn't try to specify request and response sections in the kind of freeform way that wadl currently allows. Version this file independently of other ICDs.
  5. Specify a mass-publication format. It should fill a similar role to the ICD, but be more focused on communicating high-level semantics to configuration tools. For example, it might have tags attached to each resource for easy classification and filtering.

Conclusion

I think that discussion in the REST description language area is useful, and could be heading in the right direction. However, I think that as with any content type it needs to be very clearly focused to be successful. We have to be clear as to what we want to do with a description language, and ensure that it isn't used in ways that are anti-architectural. I'm sure we have quite a way to go with this, and that there are benefits in having a good language in play.

Benjamin

Mon, 2007-Jun-11

Lessons of the Web

Many people have tried to come up with a definitive list of lessons from the Web. In this article I present my own list, which is firmly slanted towards the role of the software architect in managing competing demands over a large architecture.

One of the problems software architects face is how to scale their architectures up. I don't mean scaling a server array to handle a large number of simultaneous users. I don't mean scaling a network up to handle terabytes of data in constant motion. I mean creating a network of communicating machines that serve their users' needs at a reasonable price. The World-Wide Web is easy to overlook when scouting around for examples of big architectures that are effective in this way. At first, it hardly seems like a distributed software architecture. It transports pages for human consumption, rather than being a serious machine communication system. However, it is the most successful distributed object system today. I believe it is useful to examine its success and the reasons for that success. Here are my lessons:

You can't upgrade the whole Web

When your architecture reaches a large scale, you will no longer be able to upgrade the whole architecture at once. The number of machines you can upgrade will be dwarfed by the overall population of the architecture. As an architect of a large system it is imperative you have the tools to deal with this problem. These tools are evident in the Web as separate lessons.

Protocols must evolve

The demands on a large architecture are constantly evolving. With that evolution comes a constant cycling of parts, but as we have already said: You can't upgrade the whole Web. New parts must work with old parts, and old parts must work with new. The old Object-oriented abstractions of dealing with protocol evolution don't stack up at this scale. It isn't sufficient to just keep adding new methods to your base-classes whenever you want to add an address line to your purchase order. A different approach to evolution is required.

Protocols must be decoupled to evolve

A key feature of the Web is that it decouples protocol into three separately-evolving facets. The first facet is identification through the Uniform Resource Identifier/Locator. The second facet is what we might traditionally view as protocol: HTTP. The definition of HTTP is focused on transfer of data from one place to another through standard interactions. The third facet is the actual data content that is transferred, such as HTML.

Decoupling these facets ensures that it is possible to add new kinds of interactions to the messaging system while leveraging existing identification and content types. Likewise, new content types can be deployed or content types be upgraded without compromising the integrity of software built to engage in existing HTTP interactions.

In a traditional Object-Oriented definition of the protocol these facets are not decoupled. This means that the base-class for the protocol has to keep expanding as new content types are added, or entire new base-classes must be introduced. The configuration management of this kind of protocol as new components are added to the architecture over time is a potential nightmare. In contrast, the Web's approach would mean that the base-class that defines the protocol would include an "Any" slot for data. The actual set of data types can be defined separately.
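In code terms the contrast might look like this sketch (names invented): the protocol class carries an opaque payload plus a label, so new content types never force new methods onto it.

class Message:
    # The protocol class never grows new methods for new data types; it just
    # carries an "Any" payload slot plus a label saying what type it holds.
    def __init__(self, method, uri, content_type, body):
        self.method = method              # fixed, slow-moving set: GET, PUT, ...
        self.uri = uri                    # identification evolves separately
        self.content_type = content_type  # eg "text/html"; this set evolves separately
        self.body = body                  # the "Any" slot: opaque bytes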

Object identification must be free to evolve

Object identification evolves on the Web primarily through redirection, allowing services to restructure their object space as needed. It is an important principle that this be allowed to occur occasionally, though obviously it is best to keep it to a minimum.

New object interactions must be able to be added over time

The HTTP protocol allows for new methods to be added, as well as new headers to communicate specific interaction semantics. This can be used to add new ways to transfer data over time. For example, it allows for subscription mechanisms or other special kinds of interactions to be added.

New architecture components can't assume new interactions are supported by all components.

Prefer low-semantic-precision document types over newly-invented document types

I think this is one of the most interesting lessons of the Web. The reason for the success of the Web is that a host of applications can be added to the network and add value to the network using a single basic content type. HTML is used for every purpose under the sun. If each industry or service on the Web defined its own content types for communicating with its clients we would have a much more fragmented and less valuable World-Wide-Web.

Consider this: If you needed a separate browser application or special browser code to access your banking details and your shopping, or your movie tickets and your city's traffic reports... would you really install all of those applications? Would google really bother to index all of that content?

Contrary to received wisdom, the Web has thrived exactly because of its low-semantic-value content. Adding special content types would actually work against its success. Would you rather define a machine-to-machine interface with special content types out to a supplier, or just hyperlink to their portal page? With a web browser in hand, a user can often integrate data much more effectively than you can behind the scenes with more structured documents.

On the other hand, machines are not as good as humans at interpreting the kinds of free-form data that appear on the Web. Where humans and machines need to share a common subset of information, the answer appears to be microformats: use a low-semantic file format, but dress up the high-semantic-value parts so that machines can read them too. In pure machine-to-machine environments XML formats are the obvious way to go.

In either the microformat or XML approaches it is important to attack a very specific and well-understood problem in order to future-proof your special document type.

Ignore parts of content that are not understood

The must-ignore semantics of Web content types allows them to evolve. As new components include special or new information in their documents, old components must know to filter that information out. Likewise, new components must be clear that new information will not always be understood.

If it is essential that a particular piece of new information is included and understood in a particular document type, it is time to define a new document type that includes that information. If you find yourself inventing document type after document type to support the evolution of your data model, chances are you are not attacking the right problem in the right way.
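A sketch of must-ignore processing, with invented element names: the old component reads the elements it knows and silently skips anything a newer component may have added.

import xml.etree.ElementTree as ET

KNOWN_ELEMENTS = {"title", "price"}  # what this (older) component understands

def read_item(document):
    item = {}
    for element in ET.fromstring(document):
        if element.tag in KNOWN_ELEMENTS:
            item[element.tag] = element.text
        # Anything else is new information from a newer component: ignore it.
    return item

# A newer document with an extra <discount> element still parses cleanly.
read_item("<item><title>Widget</title><price>10</price><discount>2</discount></item>")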

Be cautious about the use of namespaces in documents

I take Mark Nottingham's observation about Microsoft, Mozilla, and HTML very seriously:

What I found interesting about HTML extensibility was that namespaces weren’t necessary; Netscape added blink, MSFT added marquee, and so forth.

I’d put forth that having namespaces in HTML from the start would have had the effect of legitimising and institutionalising the differences between different browsers, instead of (eventually) converging on the same solution, as we (mostly) see today, at least at the element/attribute level.

Be careful about how you use namespaces in documents. Consider only using them in the context of a true sub-document with a separately-controlled definition. For example, an atom document that includes some html content should identify the html as such. However, an extension to the atom document schema should not use a separate namespace. Even better: Make this sub-document a real external link and let the architecture's main evolution mechanisms work to keep things decoupled. Content-type definition is deeply community-driven. What we think of as an extension may one day be part of the main specification. Perhaps the worst thing we can do is to try and force in things that shouldn't be part of the main specification. Removing a feature is always hard.

New content types must be able to be added over time

HTTP includes the concept of an "Accept" header that allows a client to indicate which kinds of document it supports. This is sometimes seen as a way to return different information to different kinds of clients, but should more correctly be seen as an evolution mechanism. It is a way of supporting clients that only understand a superseded document type and those that understand a current document type concurrently. This is an important feature of any architecture which still has an evolving content-type pool.
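A sketch of that evolution mechanism in a client. The media type names and parser functions are invented: the client advertises both the current and the superseded document type, preferring the newer one, and handles whichever the server returns.

import urllib.request

def parse_current_format(body):     # hypothetical parser for the current type
    ...

def parse_superseded_format(body):  # hypothetical parser for the superseded type
    ...

def fetch_report(url):
    request = urllib.request.Request(url, headers={
        # Prefer the current document type; still accept the superseded one.
        "Accept": "application/report+xml;q=1.0, text/plain;q=0.5",
    })
    with urllib.request.urlopen(request) as response:
        body = response.read()
        if response.headers.get_content_type() == "application/report+xml":
            return parse_current_format(body)
        return parse_superseded_format(body)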

Keep It Simple

This is the common-sense end of my list. Keep it simple. What you are trying to do is produce the simplest evolving uniform messaging system you possibly can. Each architecture and sub-architecture can probably support half a dozen content types and fewer interactions through its information transport protocol. You aren't setting out to create thousands of classes interacting in crinkly, neat, orderly patterns. You are trying to keep the fundamental communication patterns in the architecture working.

Conclusion

The Web is already an always-on architecture. I suspect that always-on architectures will increasingly become the norm for architects out there. There will simply come a point where your system is connected to six or seven other systems out there that you have to keep working with. The architecture is no longer completely in your hands. It is the property of the departments of your organisation, partner organisations, and even competitors. You need to understand the role you play in this universal architecture.

The Web is already encroaching. Give it ten more years. "Distributed Software Architecture" and "Web Architecture" will soon be synonyms. Just keep your head through the changes and keep breathing. You'll get through. Just keep asking yourself: "What would HTML do?", "What would HTTP do?".

Benjamin

Tue, 2007-May-29

Vista Embarrassment

I recently purchased a Toshiba Satellite laptop, model A200/L01. A few days ago my spouse downloaded a number of AVIs, which she and I wanted to burn to DVD so that we could watch them on the big screen. I spend most of my time in Ubuntu Feisty, so almost by instinct booted to Windows Vista to get this multimedia-related task done.

Shame, Microsoft. I can't believe your latest and best operating system lacks the codecs to play a random file from the internet. If there is one thing I was sure a Microsoft system could do better than a Linux one it is handling videos. While the sound worked in Vista, the video did not.

Booting back to Ubuntu, the video played. After a prolonged but finite period I was able to successfully produce a transcoded and burned DVD.

Everyone involved in the Vista project should cringe. I'm a bit of a Linux weenie, but Microsoft has always earned its props in a few definite areas. Now it looks like one of those areas has been lost. What is left? Microsoft's office suite is still a definite forerunner, despite the launches of openoffice.org and other assaults against it. Microsoft still has some friendly programming environments, though that is under pressure from web-based software delivery models.

Beryl is much nicer than Vista's new window manager. The Ubuntu start-up sound is much nicer than the dodgy new Vista startup sound. In all, I much prefer the Ubuntu environment. Props to the Debian underpinnings for this well-packaged distribution. Props to the gnome desktop that makes the production of a real friendly desktop environment possible. Props to everyone involved supporting hardware, from the kernel upwards. Feisty is an excellent distribution. Windows has been absolutely incapable of keeping up on these basics.

Benjamin

Thu, 2007-May-17

Simplifying Communication

Udi Dahan quotes an email I sent him some time ago when I was trying to get to grips with the fundamentals of SOA in contrast to the fundamentals of REST. He refers to it in a corresponding blog entry: Astoria, SDO, and irrelevance

I concur that adding a REST-like front end to a database isn't a particularly useful thing to do. HTTP is not SQL. It doesn't have transactions. Attempts to add them are unRESTful by the definition of REST's statelessness constraint, or are at least to be approached with caution. Udi says that getting data out the REST way is fine... but that updating it using a PUT requires a higher level of abstraction. Where I differ from Udi is that he says a higher level of abstraction than PUT is required. I suggest that a PUT to a resource that is pitched at a higher level of abstraction is what is usually required.

Let's take an example. You have a database with a couple of tables. Because we are in a purely relational environment, our customer information is split across these tables. We might have several addresses for each customer, lists of items the customer has bought recently, etc.

Exposing any one row or even a collection of rows from any one of these tables as a single resource is fraught with problems. You might need to GET several aspects of the customer's data set in order to form a complete picture, and the GETs could occur across transaction boundaries. You will very likely one day end up with a data set that is inconsistent.

PUT requests to such a low-level object also run us into problems. Any update that requires multiple PUT requests to be successful runs the risk of leaving the database in a temporarily- or permanently- inconsistent state.

The answer here is to raise the level of abstraction. We could introduce transactions to our processing, but this increases complexity and reduces scalability. While it may be the right approach in many situations, it is usually better in client/server environments to expose a simplified API to clients. We don't really want them to know too much about our internal database structure, so we give them a higher-level abstraction to work with.

In this case the starting point would likely be the creation of a customer object or customer resource. In the SOA world where methods and parameter lists are unconstrained, we might have a getTheDataIWantForThisCustomer method and corresponding updateThisDataIHaveForThisCustomer method. In REST, you would do pretty much the same thing. Except in REST, the methods would be GET and PUT to a https://example.com/thecustomer URL of a widely-understood content type.
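A sketch of the REST half of that comparison, using the example URL above; the content type is invented to stand in for a widely-understood one, and the document argument is assumed to be bytes in that type.

import urllib.request

CUSTOMER_URL = "https://example.com/thecustomer"
CONTENT_TYPE = "application/customer+xml"  # stands in for a widely-understood type

def get_customer():
    # GET the complete, consistent customer representation in one interaction.
    with urllib.request.urlopen(CUSTOMER_URL) as response:
        return response.read()

def put_customer(document):
    # PUT the whole higher-level representation back; no table rows are exposed.
    request = urllib.request.Request(
        CUSTOMER_URL, data=document, method="PUT",
        headers={"Content-Type": CONTENT_TYPE})
    return urllib.request.urlopen(request)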

So which is better? I would suggest that the REST approach is usually the best one. It can take a little time and research to come up with or to adopt the right content type, but you will be set up for the long-term evolution of your architecture. In the SOA world you'll need to change your base-class eventually, leading to a proliferation of methods and parameter lists. In the constrained REST world we use well-understood mechanisms for evolving the set of methods, urls, and content types independently.

In the end, REST is very much like SOA. Whatever you are about to do in your SOA, you can usually do with REST's standard messaging rather than by inventing new ad hoc messages for your architecture. Your REST architecture will evolve and perform better, and require less code to be written or generated on both the client and server sides of your interface. For me, the fundamental constraint of REST is to work towards uniform messaging by decoupling the method, data, and address parts of each message. Most other constraints of REST (such as statelessness) are good guidelines that any architect should instinctively apply wherever they are appropriate, and nowhere else.

While we are not using the same terms and are not applying technology in the same way, I don't think that Udi and I are thinking all that differently.

Benjamin

Fri, 2007-Apr-06

Enterprise REST

I am not taking this seriously at all, but I have felt the need over the last few weekends to spend an hour at a time putting a summary of my REST thinking together. I have only made a very basic start at present, and don't really intend to push the whole thing to any kind of logical conclusion. However, here it is for those who might be interested:

Enterprise REST

At present I am just treating it as an extended blog entry and am not looking for a publisher. Unfortunately, I don't think I have the time to devote to making this book happen for real. However, expect the occasional update over time.

An excerpt:

A number of characteristics distinguish an enterprise architecture from simply an architecture for a business:

  • Different parts of the architecture are owned by different parts of an organisation, or by completely different organisations
  • Whole-architecture upgrades are technically or socially infeasible
  • Architectural decisions are made by consensus between participating organisations
  • Downtime for many or all architecture components is costly
  • The architecture must yield high performance relative to asset costs, real estate, power, network bandwidth, administration, and other expenses
  • The architecture is distributed geographically, introducing bandwidth and latency limitations

The characteristics I have chosen to define an enterprise architecture are also characteristics of the World-Wide Web.

Benjamin

Sun, 2007-Mar-25

More Deprecating POST

My recent post on deprecating the POST HTTP method for resource creation generated a few responses: Azubuko Obele, Peter Williams (1) (2), and Stefan Tilkov.

DELETE

First let me clarify for Azubuko Obele that I am not experimenting with deprecating DELETE at the moment. I have some questions about the contrasting concepts of a resource not existing and a resource having the null state, but they can lie for the moment. DELETE is part of HTTP and established REST practice. It has properties that are as good as those of PUT, so there is no real impetus to give it the boot. That part of my discussion was more of a prod to readers: a challenge to justify the existence of every method in use. I have not shaken the idea that DELETE is effectively a PUT of null, but there is really no practical point in replacing one with the other at this stage.

Stateless Redirection

On the PUT for resource creation front, Peter explains the problem better than I did originally and has an alternative proposal. The client would PUT to a well-known resource, rather than generating a GUID. That well-known resource would always redirect to a URL of its choice, and the client would follow.

There is a possible flaw in this approach which I didn't cover in my earlier post, and that is the potential for statefulness in the redirection. Consider a server that receives a PUT request to the well-known URL. It says: OK. The next available id is "3", so I'll redirect to there. The next request comes in, and it gets a "4".

It is quite possible that the "3" request will never get through. That means that the server should not prepare any data with that identifier, but should still increment a counter to assign the next id. That's to allow for the later "4" request to come in and be processed properly. It means that any servers within a cluster need to work with a shared counter or counter structure that allows unique identifiers to be assigned despite failover events, parallel processing between servers, and other disruptions. Most databases will have a way of making these assignments built in, so it may be possible to use that mechanism. You would just have to make sure that the mechanism would work even when you are not creating actual records, or that any dummy records you create are eventually cleaned up.

What I was thinking originally was that any redirection would be stateless, and would probably have to retain the client-supplied GUID. For example, the server might convert the request URL of <https://example.com/factoryResource?guid=76fd9473-a270-4aac-8a06-e5265048cbbc> to <https://example.com/blogposts/76fd9473-a270-4aac-8a06-e5265048cbbc> almost as a regular expression match and replace, or might use identifying input from the request body to determine the correct request URI. In either case it could then immediately forget about the request. It did not assign an id, so doesn't need to increment any kind of counter. This is a bit ugly. I work in a machine-to-machine architecture where this is possibly less of an issue, but it might be better even in this environment to use readable identifiers. Debugging is an important part of the development process, after all.
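A sketch of that stateless rewrite on the server side, using the example URLs above: the redirect location is derived purely from the request, so the server keeps no counter and can forget the request immediately.

import re

def redirect_location(request_target):
    # Rewrite /factoryResource?guid=<GUID> to /blogposts/<GUID>; nothing is
    # recorded, so any server in a cluster can answer identically.
    match = re.fullmatch(r"/factoryResource\?guid=([0-9a-fA-F-]+)", request_target)
    if match is None:
        return None
    return "https://example.com/blogposts/" + match.group(1)

# redirect_location("/factoryResource?guid=76fd9473-a270-4aac-8a06-e5265048cbbc")
# -> "https://example.com/blogposts/76fd9473-a270-4aac-8a06-e5265048cbbc"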

Bait and Switch

So what is the alternative? One I perhaps missed describing earlier is to create the resource at the client-specified URL, including GUID, but also make the resource state available at another URL:

PUT /factoryResource?guid=76fd9473-a270-4aac-8a06-e5265048cbbc HTTP/1.1
Host: example.com
Content-Length: ....

<PurchaseOrder>
...
</PurchaseOrder>

The server may respond with:

HTTP/1.1 303 See Other
Location: https://example.com/newResource
...

Unfortunately, See Other isn't a good match. It doesn't carry the semantics we are really after. See Other only suggests that the client GET the nominated URL in order to complete the request. It doesn't indicate what the nominated URL actually means. RFC 2616 demands that 301 MUST be used if we want the PUT applied to another resource. However, standards are meant to be bent. We could certainly pull a bait and switch on the client: let the PUT go through with a simple 201 response, then redirect any later request to the new resource. All of these approaches are a little untidy, though.

Server-provided URL

This brings us back to the question of where the identifier comes from in the first place. If the client is operated by a human, it is likely the URL comes from a server-generated form. In this case no redirection will be necessary. The server should have chosen a good identifier from the get go. It is only when the client has to come up with the id on its own that we get into this redirection tizz. That is the case where the client and server have had no previous contact, except for the user of the client configuring a base URL into the client application.

Coupling Effects

Peter is concerned that I am creating coupling by requiring a particular way of constructing the real URL from the configured base URI in the client. Peter, you slightly misquote my example. I deliberately used the query part of the request URI as the place my GUID goes. My suggestion is that this is how all servers should accept this kind of PUT: when the client is configured with a base URI and has to determine the request URL from that base, it should fill out the query part of the base URI in a standard way. I suggest that the query parts of URIs should be compatible between servers to support non-human REST architectures. I suggest that all servers accepting this kind of PUT use or support the ?guid={GUID} syntax. Coupling, yes, but coupling to a standard approach.
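A sketch of the client side of that convention: the configured base URI plus the standard ?guid={GUID} query part. The content type is invented for the example.

import uuid
import urllib.request

def create_purchase_order(base_uri, document):
    # Fill out the query part of the configured base URI in the standard way.
    url = base_uri + "?guid=" + str(uuid.uuid4())
    request = urllib.request.Request(
        url, data=document, method="PUT",
        headers={"Content-Type": "application/xml"})
    return urllib.request.urlopen(request)

# create_purchase_order("https://example.com/factoryResource",
#                       b"<PurchaseOrder>...</PurchaseOrder>")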

Conclusion

I am continuing along with my experiment for now. Clients start with a configured base URI, and construct a URL by filling out the query part with a GUID. My servers aren't doing any redirection. In the future I may consider making the same object available under another URI without redirecting the original client.

Benjamin

Sun, 2007-Mar-18

Deprecating POST

Following recent discussion on rest-discuss, I have been experimenting with deprecating the POST method for architectures my organisation controls, and am eyeing DELETE with some suspicion also. I am suggesting that PUT be used wherever possible.

Pendulums and Muddy Waters

I think there is a pendulum in method definition. Every so often you think you need a few more methods to do special things, but the time comes to swing back and consolidate. GET is a clear winner both in the REST and the non-REST camps, but argument rages over use of the other classic REST verbs. POST is particularly problematic because of its non-idempotent nature. If my POST request times out, I can't just try again. I might be submitting my request twice.

I see value in trying to reduce the set of methods down to GET and PUT. I think it could clear the waters a little and result in simpler architectures. At the same time I acknowledge that asking the question can be seen to be muddying the waters in the short term. I hope I don't trigger too much gnashing of teeth with this suggestion, nor with a suggestion I may make soon about cutting down the HTTP response codes to a "useful" subset. I acknowledge also that I might be pushing some anti-URI-construction buttons with this article. However, I see that also as an issue that needs some clarity and guidelines going forwards. Ultimately, I hope to draft some documents for my company's internal use about the subset of HTTP they should implement and about other aspects of RESTful design. I hope that such documents could eventually be made available to a wider audience, also.

The proposal

My current direction is to combine PUT with client-side "form submission". The classic POST request might be as follows:

POST /factoryResource HTTP/1.1
Host: example.com
Content-Length: ....

<PurchaseOrder>
...
</PurchaseOrder>

The server may respond with:

HTTP/1.1 201 Created
Location: https://example.com/newResource
...

In my experimental approach I require the client to provide a globally unique identifier as part of the request URL. An example request might be:

PUT /factoryResource?guid=76fd9473-a270-4aac-8a06-e5265048cbbc HTTP/1.1
Host: example.com
Content-Length: ....

<PurchaseOrder>
...
</PurchaseOrder>

The server may respond with:

HTTP/1.1 201 Created
...

My architecture is highly focused around machine-to-machine comms, so I expect that most servers will simply create a resource at the location the client requests (including the query part). The actual construction by a client of a request url from the base factory url is anticipated to be standard, which is to say "hard-coded". Every factory resource should become a factory resource family with the same query structure. I am following the direction laid down in one of my recent articles: the query component of a url constructed by clients should follow content-type-style rules, keeping a consistent structure between similar services.

Many services will choose to use their own unique identifiers for the created resources. For example, the guid could be replaced with a monotonically-increasing number or other identifier within a particular context. These services should redirect the PUT request to their preferred url:

HTTP/1.1 301 Moved Permanently
Location: https://example.com/newResource
...

Note that there is still a caveat when attempting to POST/PUT Once Exactly (POE). It is important that created resources are not too short-lived. If they are short-lived the server should keep track of the fact they recently existed in order to inform the client that its creation attempt was successful by way of a 410 Gone response to subsequent PUTs. A similar problem arises if the state of the resource is updated soon after resource creation. The client who thinks the creation request failed could blat over the changed state. Perhaps there still does need to be some separation between a method that creates state at a particular url and one that replaces state. Perhaps there does need to be separation between a CREATE method and an UPDATE method. Too late now for HTTP on the Web, perhaps.

Conclusion

Pub/Sub aside, GET is the right way to transfer state from server to client. However, the waters are muddier when transferring from client to server. I suggest that special applications of PUT may be more useful than defining separate methods. PUT, POST, and DELETE currently mean "create or replace state where I tell you", "create state wherever you like", and "eliminate state" or "replace with the null state". My gut suggests that collapsing all of these methods into PUT is a net win. My only hesitation is in the misuse PUT might see if the "deprecate POST" message gets out too far. Therefore, I suggest PUT be used where state is being transferred idempotently. POST should still be used where tunnelling or "process this" semantics are required. POST should be a warning for any code that sees it to tread carefully: The guard ropes have been cut.

Benjamin

Sat, 2007-Mar-17

The REST Triangular Pyramid

I have postulated before about the uniform interface consisting of a triangle of identifier type, method or interaction type, and content type. Perhaps it would clarify matters for the camp if another dimension were added.

Semantic Confidence

At WSEC there were a number of people in the room worried about the lack of semantics of POST. An old question kept coming up: could a client trigger a harmful operation by accident? Perhaps the launch of a nuclear weapon?

Security issues aside, the question can be posed as: "How can the client be sure that its request is understood?", or "How can the client be sure that the server will do what it wants the server to do?". The question is posed around POST, but the same can be asked for PUT or even safe methods such as GET.

A PUT of text/plain "1" could save the result of a calculation into a loan amortisation tool, or could change the US defence condition and set in motion countermeasures that launch the world's nuclear missiles. However, suggesting that the client should have coded knowledge of which case will occur is missing the point. There are many cases of URLs between the benign and the disastrous for which it is useful to configure a client to send its text/plain "1". Instead of saving data into a loan amortisation tool, perhaps the data can be stored in an excel spreadsheet. Perhaps a rival's tool will one day be preferred. Just how much of the server-side semantics should be encoded into the message?

Just as the same PUT request could mean different things when directed to different resources, GET means different things to different resources. If I GET https://google.com/ I expect a form to do web searching. If I GET https://en.wikipedia.org/ I expect a navigable encyclopedia. In the pure machine-to-machine world I might point a calculations engine at a resource demarcating the state of a circuit breaker, or a water pump, or a fan. Depending on the calculation these may all be valid configurations.

Interoperability

REST is about interoperability, and at its heart we see the answer to our puzzle of where the resource-specific semantics "go" in a message. They don't go. They are held implicitly in the URL, a URL that fundamentally a human has provided to the client. Whether the URL was provided by explicit configuration, by hyperlinking from other documents, or any other source... there is a reason this client is making this request to that resource at this time. A human has made all of the necessary logical connections. It is the machine's job to get out of the way of this human decision.

So taking GET as the guide we can draw the same implications for PUT, DELETE, and even our scruffy old friend POST. It is up to the resource to determine how to interpret the request, so long as it does so within reason. We don't introduce unnecessary coupling just so the client feels like it has more control over the situation. Feed as many semantics down the pipe as you like: the server still decides how to interpret your request. All you are doing by encoding this information directly is preventing interoperation with other resources.

Human-directed Interconnection

So the REST triangle stands, firm and proud. One set of schemes for identifiers in the architecture, one set of interactions that components in an architecture participate in, and one set of content types exchanged within an architecture. Every message is a permutation of these separately-evolving facets, and can be correctly handled based on them. Additional semantics never leave the server; they are simply identified by a URL. Each URL is transferred from server to client in preparation for any request under human direction, and is transferred back to the server as the request URL. Humans are free to connect components together when it makes sense to do so, and coupling at the technology level does not get in the way of that human direction.

The only thing that should get in the way of free human connection of components is when the interaction is not meaningful for a particular resource, or the schema of data transmitted and required between components does not match. In these cases it is possible to introduce configuration errors by mismatching client and server. However, this is the least of your worries when you are configuring components to talk to each other. Again, we have the amortisation calculator vs defence condition case. A human must have enough information at hand to avoid confusing the two. The core problem remains knowing which resources are out there, and conveying that information to the human along with human- or machine-readable semantics as to what will occur when a request is sent to a particular url.

While it may be useful to introduce a family of languages that helps guide a human through the configuration process, only the final selected url should actually be transmitted within the operational architecture. "getStockQuote" should never appear in the message, nor "getUnitPrice", nor "getCurrencyAmount". Simply using "GET" with the appropriate URL is the right way to ensure that as many clients as possible who want to access the available information can access it.

Conclusion

It is easy to fall into the trap of thinking that a more explicit client request is more valuable or more certain than a uniform request. However, while this approach necessarily introduces extra coupling, it does not deliver a balancing positive effect in terms of semantic confidence. All of that confidence comes from an out-of-band trust relationship with the URL a request is directed to, not from the message itself. If a message that uses uniform identifiers, uniform interactions and uniform content types can convey the same information, it will always be more valuable to go down the uniform path. Where no uniform message can convey the information you want, REST still has value. It will still be more valuable and cost-effective to change a single REST triangle facet than to reinvent the wheel and start from scratch.

Benjamin

Wed, 2007-Mar-14

Machine-to-Machine Forms Submission in REST

URI-construction is usually seen as a bad thing in REST practice. However, just about everyone does it in some form or another. You start with some sort of base URL, and you fill out variable parts of it to construct a URL that you actually look up or submit data to. We refer to experience on the Web and say that it is alright to construct a URL sometimes. In particular, it is OK to construct a URL when you have a document that you just obtained from the server that tells you how to do it. This kind of URL construction isn't really a problem. In fact, it is just an advanced form of hyperlinking. So what are the limits of this form of hyperlinking when there is no human in the loop?

Form Population for Machines

The distinction between a human-submitted form and a machine-submitted form is an important one. A form intended for a human will include textboxes and other user-interface widgets alongside supporting text or diagrams that explain how to fill the form out. A machine cannot understand the kinds of loose instruction given to a human, so we have to consider how a machine knows what to put where in the URL.

I think that the simple answer is that the input to any machine-populated form is effectively a standard document type, or at least a standard infoset. The machine that constructs a URL does so from a particular set of information, and the form acts as a transform from its infoset into the output URL.

For example, a machine that is constructing a query for the google search engine must know to supply a google-compatible search string in the "q" field of the form. A client of yahoo must currently know to supply a yahoo-compatible search string in the "p" field. While humans are able to fill in forms that accommodate these differences, machines are more limited. If we are ever to have any hope of machine-submitted forms we will need to look at reducing this kind of deviation in infoset structure.

Transformations from Infoset to URL

This characterisation of a machine-populated form as a transform should impact how we go about selecting and evaluating forms technologies that are destined purely for machines to populate. In particular, XSLT jumps right up the list in terms of possible technologies. It is already a well-established transform technology that can output text as well as other structured document types. If we view the source document as either XML or an XML infoset, XSLT provides an obvious way to achieve the URL construction goal.

Another approach would be to look carefully again at how Web forms work. In practice, they construct an infoset based on user input and then use a defined mechanism to transform that infoset into the query part of a URL. This could be a significantly simpler approach than embedding an XSL engine. If we again view the source document as an XML infoset, we can follow the rules that XForms defines for transforming an XML document into a query. These rules are essentially that elements with text nodes are turned into URL parameters.
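
As a rough sketch of that rule, assuming a flat source infoset where each child element holding a text node becomes one query parameter (the element names and base URL below are illustrative only):

import urllib.parse
import xml.etree.ElementTree as ET

def infoset_to_url(base_url, xml_text):
    root = ET.fromstring(xml_text)
    # Each child element with a text node becomes one url parameter.
    params = [(el.tag, el.text.strip()) for el in root if el.text and el.text.strip()]
    return base_url + "?" + urllib.parse.urlencode(params)

# infoset_to_url("https://example.com/search",
#                "<query><q>rail scada</q><lang>en</lang></query>")
# -> "https://example.com/search?q=rail+scada&lang=en"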

Coupling Effects

At first blush this standard transform approach looks like it couples client and server together, or requires different server implementations to obey the same rules for their URL-space... and that is not entirely false. The factors that limit these effects are the use of a standard document type as input to the transformation, and the ability for server-side implementations to redirect.

In the client-to-server coupling case, we often see service-specific URL construction occurring in clients today. Instead, it should be possible to apply the same construction to different services that have similar requirements. I should be able to start up a search engine rival to google and use the same infoset as input to my URL construction. The client code should accept a base URL as input alongside the other parameters that form the infoset, meaning that all clients need to do to use my rival service is change their base URL. Code changes should not be required.

The server-to-server coupling case is an interesting problem. We usually see content types and methods needing to be standard, but give servers freedom to construct their URI-spaces in whatever way they see fit. The XSLT form submission method would give them that freedom up-front, however redirection is also a way of achieving it. A simple 301 Moved Permanently would allow the server freedom in how it constructs its URLs. Greater freedom, in fact, than XSLT running in a client could offer, because the server has more information at its fingertips with which to decide on the redirection URL. To achieve this, all we really need to sacrifice on the server side is a single base URL with a query space that matches a standard infoset that machine clients can be expected to have at hand, ready for submission.
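
A minimal WSGI sketch of that redirection freedom, under my own assumptions about paths and parameter names: the server exposes one standard query-style url and 301-redirects into whatever URL-space it actually prefers.

from urllib.parse import parse_qs

def app(environ, start_response):
    if environ["PATH_INFO"] == "/search":
        q = parse_qs(environ.get("QUERY_STRING", "")).get("q", [""])[0]
        # The /articles/ space is this server's private preference.
        start_response("301 Moved Permanently",
                       [("Location", "/articles/" + q.replace(" ", "-"))])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]

# e.g. wsgiref.simple_server.make_server("", 8000, app).serve_forever()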

Conclusion

My considered view is that using the query part of a URL as a way to pass a standard infoset to a server is a valid way of constructing URLs. I think it is the simplest and most robust way to transform an infoset into a URL, and possibly the most powerful. Current attempts to allow the server greater freedom as to how it constructs its URLs are interesting, but at this point I do not intend to implement anything but the query-part-as-content approach in my development. I think the focus should shift away from this technical sphere of URL construction to a process of defining the content types that are fed into these transforms.

Benjamin

Sun, 2007-Mar-04

You are already doing REST

REST is often strongly correlated with HTTP and its verbs. It is often contrasted with SOAP or WS-* services, as if they were two opposing technologies or approaches. I take more of a middle ground. I think that you are already doing REST. The fundamental questions in the development of your network architecture are not necessarily whether or not you should be doing REST, but specifically what benefits you intend to extract from the development. Let me run you through how I see this thing playing out.

Your messages are already uniform

Working from first principles with the constraints, REST says that you should exchange a constrained set of message types using standard methods and content types. So you have your IDL file or your WSDL, and you have a number of methods listed in that file. If you are using the document-oriented style of SOA your WSDL will no doubt include or refer to the definition of a set of documents. In other words, your WSDL defines the scope of your web. Everything in that web... everything that implements the WSDL either as a client or as a server... can meaningfully exchange messages. These components of your architecture can be configured together. They don't need to be coded. A human can decide to plug one to the other arbitrarily without the technology getting in the way.

But the technology is getting in the way.

Your uniform methods aren't achieving network effects

You have defined this web, this WSDL, this architecture... but it is too specific. You can only connect the two components together that you designed the interface for, or you can only connect the client apps to the server that you designed the interface for. It isn't a general mechanism for letting a client and server talk to each other, because the problems of that particular interaction are built into your web in a fundamental way that makes solving other problems difficult.

That's ok, isn't it? If I want to solve other problems I can create another WSDL. I can create another web. Right?

You can, and sometimes that is the right approach. However you impose a cost whenever you do that. You can only plug components together if they are on the same web. You can only plug them together if they share a WSDL. Otherwise you have to code them together. Most of us have been writing code whenever we want two network components to talk to each other for so long that we assume there is no alternative. However, I come from the SCADA world and an increasing number of competent people come from the Web world. Experience in both of these worlds suggests we can do better. But how much better, exactly?

In an ideal world...

The ultimate ideal would be that so long as two machines have the same basic data schema and a particular interaction makes sense, they can be configured to engage in that interaction rather than requiring us to write special code to make it happen. However, is this practical? What is achievable in practice?

The Web sets the benchmark by separately defining the set of interactions machines participate in and the set of document types they can exchange. The three components that make up our messaging change and evolve at different rates, so separating them is an important part of solving each of these problems:

  1. How we identify participants in an interaction, especially
    • request and response targets
  2. What interactions are required, including
    • Request Methods
    • Response Codes
    • Headers
    • Transport Protocol
    • TCP/IP Connection direction
  3. How information is represented in these interactions, including
    • Semantics
    • Vocabulary
    • Document structure
    • Encoding (eg XML), or Serialisation (eg RDF/XML)

Whether or not you can actually achieve consensus on all of these points is a difficult question, and usually limited by non-technical issues. You really need to hit an economic sweet spot to achieve wide-scale consensus on any part of the trio. Luckily, consensus on identification and interactions is widely achieved for general-purpose problem sets. Special uses may need special technology, but URLs and HTTP go a very long way to closing out this triangle of message definition. The remaining facet is perhaps the hardest because it requires that we enumerate all of the different kinds of information that machines might need to send to each other and have everyone in the world agree to that enumeration.

So this is the limiting factor of the development of the Semantic Web, a web in which rich semantics are conveyed in messages that are understood by arbitrary components without having to write special code. The limiting factor is the number of kinds of information you can achieve global consensus on. However, we don't really need to have global consensus for our special problems. We only need consensus within our little web. We just need to understand the messages being exchanged in a particular enterprise, or a particular department, or a particular piece of IT infrastructure. We just need the components of our web to understand.

Closing the triangle: Content Types

So if the web relies on special understanding of this final facet, what is the point of agreeing on the first two? The answer to that, my friends, is evolution. What is special today might not be tomorrow. I can develop a format like atom to solve a specific set of problems within my web, and then promote my format to a wider audience. The more components that implement the document type, the wider my web and the bigger the network effects. The other two facets already have widespread consensus, so I can target my efforts. I can avoid reinventing solutions to how objects are named or how we interact with them. I can just focus on the format of data exchanged in those GET, PUT, and POST requests. The rest is already understood and known to work.

Now that's all well and good. The Semantic Web will evolve through solutions to specific problems being promoted until individual webs that solve these problems are joined by thousands of components operated by thousands of agencies. But... what about me? What about today's problems? Most of my document types will never leave the corporate firewall, so is there still an advantage in considering the Web's decomposition of message type?

I suggest, "yes". Whenever you integrate a new component into your network, do you need to write code? When new protocols are defined, are they easy to come to consensus on? As an integrator of components myself I find it useful to be able to fall back on the facets of message type that are widely agreed upon when new protocols are being defined. We don't have to go over all of that old ground. You and I both know what we mean when we say "HTTP GET". Now we just have to agree on specific urls and on content types. Chances are that I have a content type in my back pocket from similar previous integrations, or that I can adapt something that is in use on the wider Web. Any message exchange that can use pure Web messaging does so, and anything that needs special treatment gets as little special treatment as possible.

Certainly, after a few years of doing this kind of work it gets easier to build bridges between software components.

Vendors and Standards

Unfortunately, this sort of gradual evolution and interactions between the wider webs and your special case are not well supported by WS-*. Hrrm... this is where I find it hard to continue the essay. I really don't know WS-* well enough to make definitive statements about this. What I can do with HTTP is easily add new methods within the namespace of the original methods. I can then promote my new methods for wider use, so for example I can promote a subscription mechanism. In XML I could add extensions to atom, and if I used the atom namespace for my extensions they could eventually be adopted into atom proper without breaking legacy implementations of the extensions. Can the same be said for WS-*? Does it allow me to separate the definitions of my methods and my document types effectively for separate versioning? Do the tools support or fight these uses of WS-*?

For that matter, do the tools associated with XML encourage must-ignore semantics that allow for gradual evolution? Do they encourage the use of a single namespace of extensions, with alternative namespaces only used for separately-versioned sub-document types such as xhtml within atom? My tools do, but they are all written with the architectural properties I require in mind. Do the world and its vendors really understand this communication problem? Do they understand the importance of limiting the number of kinds of messages that exist in the world? Are they taking practical steps to make it easier to reuse existing messages and message structure than to create incompatible infrastructure? Do programmers understand the social responsibility that defining new message types places on them?

Simplicity: Architecture or toolchains?

REST is sometimes bandied about as a simpler approach than WS-*, and certainly REST architectures are simpler. They have less variation in message types, and promote configuration of components rather than coding to make new interactions work. However REST only achieves this by shifting the social problem out of the software. Instead of solving the problem of how two components interact with one-off software, we have a long multi-party standardisation effort that ensures thousands can interact in this way. REST encourages the reuse of existing messages and structure, but in truth it is often easier to add a new message type simply by throwing an extra method into your WSDL. REST results in less code and simpler architectures. SOA results in more code and more complex architectures... but the difference isn't between HTTP and SOAP. It is between investment in an architecture and investment in a toolchain.

Perhaps that is the take-home, ladies and gentlemen: You can achieve better and simpler and higher-value architectures... but even when you leverage standard identification schemes and interaction models there is no silver bullet. You still need to choose or evolve or invent your document types wisely. That costs time and money, as does promoting anything you do invent to sufficient scale to achieve significant value. That effort has to be judged against the value of the network to you and your company. On the other hand, I think we are lacking the tools that make some of these problems easy to identify. I think we can make it easier to build new infrastructure based on the consensus we have already achieved. I think we can do better.

Conclusion

You are already doing REST, but are you getting the network effects that you see on the wider Web? Can a million different software components carry on perfectly sensible conversations with others that they have never met before and had no special code written for? Can you do REST better? Is it worth the extra effort, and what tooling do we need to put the evolvable Semantic Web within the reach of mere mortals?

Benjamin

Tue, 2007-Feb-27

The Architectural Spectrum

I see the ideal software architecture of the world as a spectrum between the widest and most narrowly-defined. Sitting at the widest end today is the architecture of the Web. Sitting at the narrowest end is code being written as part of a single program. But what are the steps in between, and how should our view of architectural constraints change between the two extremes?

Architecture and Architecture Spectrum

Firstly, let me define architecture for the purposes of this article. An architecture consists of components that communicate by exchanging messages in interactions. A single component can participate in multiple architectures of differing scales, so we need a way of distinguishing one architecture from another in a fundamental way. I suggest using Metcalfe's law, which states that the value of a telecommunications network is proportional to the square of the number of users in the system. In software architecture the network effect is bounded to sets of components that each understand a common set of interactions. Each interaction consists of one or more messages being sent from component to component, and each message in the interaction is understood by its recipient.

A specific software architecture is a collection of components that each understand the same set of messages. Messages that are not understood by some component are part of a different architecture. A spectrum of architectures typically involves a base architecture in which a small number of messages are understood by a large number of components, then successively narrows into sub-architectures that understand a more diverse set of messages. While some components may participate in several sub-architectures, we can conceptualise the spectrum from any particular component's point of view as a set of narrowing architectures that it participates in. Metcalfe's law does not apply when interactions are not understood between components, so whenever a new interaction is introduced a new architecture is created within the spectrum.

An Architectural Spectrum Snapshot

The largest software architecture in the world today is the HTML Web. Millions of components participate in this architecture every day, and each time a request is sent or a response returned it is understood by the component that handles the request message or response message. The Web is a beachhead for the development of architecture everywhere. The Web is actually a collection of architectures defined around specific document types. With HTTP as its foundation it defines a meta-architecture that can be realised whenever a particular document type gains enough support.

The components of the various Web architectures are operated by millions of agencies. The representatives of these agencies can be found in organisations such as the w3c and the ietf, and require a great deal of effort to move in any particular direction. Architectures operated by a smaller number of agencies may be more easily influenced. For example, participants in a particular industry or in a particular supply chain may be able to negotiate new interactions to build architectures relevant to them.

On a smaller scale again we might have a single agency in control of an architecture in an enterprise or a corporate department. These architectures are easier to influence than those that require the balancing of more diverse competing interests, however they may still be constrained. An enterprise architecture will still typically consist of separate configuration items... separate components that typically are not, or perhaps cannot ever be, upgraded or redeployed as a unit. You generally can't just pull the whole system out and plug another one in without significant upfront planning and cost. The enterprise architecture consists of several configuration items that must continue to work with other architecture components post-upgrade.

That leaves the architecture of a configuration item. This is where I would stop using the word architecture and start using the word design. At every point until this one as we scale down from the Web we must normally be able to deal with old versions of components that we interact with. We must deal with today's architecture as well as yesterday's, and tomorrow's. This temporal aspect to architecture participation disappears for architecture defined within a particular configuration item. It becomes more important to ensure that a consistent whole is deployed than that a component continues to work with all architectures when it is upgraded.

Characteristics of the Spectrum

As we move down from the Web we see changes in a number of areas:

With the reduction in the number of participants, the momentum behind existing interactions declines. It is easier to add new interactions and easier to eventually eliminate old ones. Importantly, the network effects also decline. Many of the constraints of REST can be relaxed in these environments.

Network effects are still important, even in relatively small architectures. This means that it is still worthwhile following constraints such as the uniform interface. There is no point splitting your architecture up into point-to-point integration pairs when you could just as easily have ten or twenty components participating in an architecture and working together for the same cost. The main areas in which REST constraints can be relaxed involve scalability and evolvability, and even there you have something of a newtonian vs einsteinian issue. You may not see the effects of relativity when you are travelling at 60km/h, but they are there. Sure enough, when you really get up to speed at a quarter of the speed of light you'll know it. Every architect should be aware of the constraints and the effect of bending them.

Evolving the Spectrum of Architectures

One particularly interesting aspect of REST is how it evolves. I noted earlier that the Web is really a meta-architecture that is realised in the form of the HTML Web and other Webs. This is a characteristic of the REST style. Instead of deciding on the message format and leaving it at that, REST builds the messages up of several moving parts. The message consists of verbs (including response codes), content types, and identifier schemes. Each time you change the set of verbs, the set of content types, or the way you identify resources you are creating a new architecture. Different kinds of changes have differing effects. Changing the identifier scheme could be a disaster, depending on how radically you change it. Changing the set of methods will affect many components that are agnostic to content type changes. For example, a web proxy is not remotely interested in the content type of a message. It can cache anything. HTTP libraries are similarly agnostic to the messages they send, either in whole or in part.

REST is actively designed around the notion that while all messages need to be understood to be part of the same architecture, architectures must change and be replaced over time. New versions of particular document types can be easily introduced, so long as they are backwards-compatible. New document types can be introduced, so long as they don't require new methods. Even new methods can be introduced occasionally.

It is my view that the most appropriate way to create the smaller architectures of the world is to extend the base architecture of the Web. New methods can be added. New document types can be added. Existing document types can be extended. I think that while REST provides key technical facilities to allow architectures to evolve, there is also a human side to this evolution. Someone must try out the new content type. Someone must try out the new methods. I see these smaller architectures as proving grounds for new ideas and technology that the Web is less and less able to experiment with directly.

It is in these architectures that communities develop around the Web meta-architecture. It is in these architectures that extensions to standard document types will be explored. It is in these architectures that new document types will be forged. Community is the central ingredient to evolution. The most important thing that technology can do is avoid getting in the way of this experimentation. Once the experiments are gaining traction we need simple ways of merging their results back into the wider architecture. We need simple ways of allowing atom extensions to be experimented with and then rolled back into atom itself without introducing esoteric namespaces. In short, we need to be developing a Web where context defines how we interpret some information rather than universal namespaces. When these extensions are moved into the main context of the Web they will define architecture for everyone. Until then, they incubate in a sub-architecture.

Conclusion

I still don't see where Waka fits into the world, or even SOAP for that matter. Rolling out a new protocol over the scale of the Web has already been demonstrated to be nearly impossible over the short term; HTTP/1.1 and IPv6 are examples. The Web has reached a point where it takes decades to bring about substantial change, even when the change appears compelling. HTTP can't be unmade at this point, but perhaps it can be extended. So long as their use remains Web-compatible, sub-architectures can extend HTTP and its content types to suit their individual needs. They may even be able to build a second-tier Web that eventually supplants the original Web.

I don't see a place for RDF. I see the Web as a world of mime types and namespace-free xml. I think you need to build communities around document types. I think the sub-architectures that (mis)use and extend the content types of the Web contribute to it, and that XML encourages this more than RDF does. Today we have HTML, atom, pdf, png, svg, and a raft of other useful document types. In twenty years time we will probably have another handful that are so immensely useful to the wider Web that we can't imagine how we ever lived without them. I predict that this will be the way to the semantic web: hard-fought victories over specific document types that solve real-world problems. I predict that the majority of these document types will be based around the tree structure of XML, but define their own structure on top of it. I don't foresee any great number being built around a graph structure like that of RDF, defined on top of XML in present-day RDF/XML serialisations. If RDF is still around in that timeframe it will be used behind the firewall to store data acquired through standard non-RDF document types, in a way that replaces present day RDBMS and SQL.

Benjamin

Sun, 2007-Feb-25

Remixing REST: Verbs and Interaction Patterns

I have been interested in the boundaries between classical object-orientation and REST for many years. This article attempts to explore those boundaries in one particular area. One of REST's core tenets is that of the uniform interface. Is the uniform interface as important as REST suggests? Could it be done any differently?

A significant proportion of the work that I do involves integrating software components or physical devices from different vendors into a single architecture. This usually involves writing a protocol converter for the purpose, often a one-off converter for a particular customer contract. Internally, we have needed to do this kind of thing less and less as we have embraced the REST style. Instead of inventing a new protocol or new IDL whenever we write a new application, we have been tending for some time now to reuse an existing HTTP-derived protocol. We can then focus on document types. Do we need a new one, or will one of the ones we already have in use do the job?

The need to limit verbs has long been a teaching of REST proponents, but the motivation isn't always abundantly clear. It seems we can look at the web and see that the nouns greatly outnumber the verbs, and see that the web seems to work well because of it. So let me have a go at coming up with a simple reasoning:

Ad hoc interoperation between two components of an architecture relies on those components being capable of participating in a particular common interaction pattern. An interaction pattern between a client and a server involves one or more request messages being sent from the client to the server, and one or more response messages being returned to the client. Today's Web constrains the interactions to one request and one response per interaction. The interaction is decomposed into a request verb and document type, and a response verb and document type. Headers are sometimes also important parts of the interaction.

In traditional Object-Orientation we are used to writing code every time we write a new class or interface. We write new code to implement the classes, and write new code to interact with the classes. Two objects are unlikely to interoperate unless we plan for that interoperation. Interface classes and design patterns can help us decouple classes from each other, however we must still typically design and choose an interaction pattern for a specific functional purpose.

This is all well and good when we control both end-points of the conversation, or when the interface is encapsulated in an industry standard such as the servlet interface. However the Web introduces a broader problem set. We start to need an interface that decouples components from each other, even though they belong to different industries. We need standards that are more generic, lest we have to start writing new browser code every time a web site is added to the Internet.

Let's inspect the Web interaction pattern some more. We have roughly four to eight verbs to work with, with about... urgh... forty-three response verbs. That gives you around 172-344 possible request/response interactions on the web. You also need to multiply that out by the number of content types, so theoretically we have thousands or even tens of thousands of possible interactions happening on the Web. That's probably too many.

In practice only a few response verbs are used, and hopefully Waka will make some sort of headway in this respect. If we reduced the response verbs to their basic types we would be left with only about twenty important interaction patterns on the Web, and it is clients that get the really raw deal. A server needs to understand all of the possible requests that make sense, but doesn't need to understand any response that it doesn't plan on using. A client should understand all of the requests that are meaningful based on its own specification, but should also understand all of the responses that might be returned to it.

If client and server both implement the request and response verbs that make sense and they both know how to exchange the same document types, they should be able to be configured rather than coded to work together. This is hugely important in big architecture, where it is rarely possible to influence the other side of a conversation into following your individual, corporate, or even industry-specific specifications.

To my mind the difference between design and architecture is one of configuration control. At one extreme we have design. Design is controlled by a single agency, and deployed with a single version number. You can construct a design in a very freeform way, because you test and deploy it as a unit. It makes sense to maintain rigid control over typing. You would rather find inconsistency problems at compile time than have to pick them up during testing.

Pure architecture is the other extreme. An architecture component is deployed as a single entity, but when it is upgraded none of the other architecture components are redeployed. Consistency is no longer a concern, and checking for consistency is extremely counterproductive. It is much more important to interoperate with a range of components and component versions built and deployed by different agencies.

In the middle of these two extremes is a kind of half-design, half-architecture scenario. I'll call it system design. You might version or deploy different components of a system separately, but you own all of the components and can do a big upgrade if you need to. System design has characteristics of both design and of architecture. Like architecture, you want to avoid enforcing consistency at build time between components. They might be deployed against various versions of the other components. Like design, you can add special interactions and local conventions. You control both ends of the conversation, so can be sure that your special conventions will be understood correctly.

Another way to look at system design is as a sub-architecture. Your system may participate in a wider architecture over which you have no control, in a smaller architecture over which you have some control, and yet another in which you have significant or total control. The ideal implementation of these architectures would use the interactions that are standard in the widest architecture whenever they are applicable, then scale down to specifics as special semantics are required. An example of this might be to use a HTTP GET request whenever a client wants to retrieve any kind of data from a server, but still allow special interactions such as LoadConfiguration when nothing from the HTTP sphere is a good match.
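
To make that concrete, here is a hedged sketch of the idea: use plain GET wherever it applies, and fall back to a method agreed only within the sub-architecture when nothing standard fits. The LOADCONFIGURATION method, host, and paths are purely hypothetical; Python's http.client will transmit a non-standard method name without complaint.

import http.client

conn = http.client.HTTPConnection("scada.example.internal")

# Widely-understood interaction: anything that can GET can take part.
conn.request("GET", "/pumps/7/state")
print(conn.getresponse().read())

# Sub-architecture-specific interaction, used only when HTTP has no good match.
conn.request("LOADCONFIGURATION", "/pumps/7", body=b"<config>...</config>")
print(conn.getresponse().status)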

The widest possible architecture today is the Web, making HTTP and its methods hard to ignore. It seems they should be the defacto standard whenever they are appropriate. However, SOAP appears to be solving real-world problems in smaller architectures or designs today. The two are clearly not compatible on the wire, however gateways between the two protocols may be viable when a WS-*-based architecture facilitates interactions that can be cleanly mapped to HTTP. Two approaches are possible to create a mapping in a WS-* architecture. You could define a WSDL that covered HTTP-compatible interaction, or you could construct individual WSDL to deal with each interaction that HTTP supports. Given these interfaces it would be straightforward for components of the WS-* architecture to also participate in the wider architecture.

While gateways are a short-term technique that can be used to bring these architectures closer together, they don't really solve the longer-term issues. We should be prepared to identify a longer-term objective that allows the needs of both architectures to be met with a single technology set. This could be achieved by starting out every conversation as HTTP, but quickly upgrading to a more sophisticated protocol whenever it is supported. Fielding has suggested this will be a technique used by his Waka protocol, and it could likewise be adopted for a HTTP-compatible SOAP mechanism. However with both WAKA and SOAP the advantages of the new protocol would have to significantly outweigh the costs of effectively replacing the architecture of the Web. I see any such protocol as spending decades incubating in the enterprises of this world before they become remotely important components of the actual Web.

Benjamin

Thu, 2007-Feb-22

SCADA, Architectural Styles, and the Web

A position paper for the W3C Workshop on Web of Services for Enterprise Computing, by Benjamin Carlyle of Westinghouse Rail Systems Australia.

Introduction

The Web and traditional SCADA technology are built on similar principles and have great affinity. However, the Web does not solve all of the problems that the SCADA world faces. This position paper consists of two main sections: The first section describes the SCADA world view as a matter of context for readers who are not familiar with the industry; the second consists of a series of "Tier 1" and "Tier 2" positions that contrast with the current Web. Tier 1 positions are those that are based on a direct and immediate impact on our business. Tier 2 positions are more general in nature and may only impact business in the longer term.

The SCADA World View

Supervisory Control and Data Acquisition (SCADA) is the name for a broad family of technologies across a wide range of industries. It has traditionally been contrasted with Distributed Control Systems (DCS), where distributed systems operate autonomously and SCADA systems typically operate under direct human control from a central location.

The SCADA world has evolved to usually be a hybrid with traditional DCS systems, but its meaning has expanded further. When we talk about SCADA in the modern era, we might be talking about any system that acquires and concentrates data on a soft real-time basis for centralised analysis and operation.

SCADA systems or their underlying technologies now underpin most operational functions in the railway industry. SCADA has come to mean "Integration" as traditional vertical functions like train control, passenger information, traction power, and environmental control exchange ever more information. Our customers' demands for more flexible, powerful, and cost-effective control over their infrastructure are ever-increasing.

Perhaps half of our current software development can be attributed to protocol development to achieve our integration aims. This is an unacceptable figure, unworkable, and unnecessary. We tend to see a wide gap between established SCADA protocols and one-off protocols developed completely from scratch. SCADA protocols tend to already follow many of the REST constraints. They have limited sets of methods, identifiers that point to specific pieces of information to be manipulated, and a small set of content types. The one-off protocols tend to need more care before they can be integrated, and often there is no architectural model to be found in the protocol at all.

We used to think of software development to support a protocol as the development of a "driver", or a "Front End Processor (FEP)". However, we have begun to see this consistently as a "protocol converter". SCADA systems are typically distributed, and the function of protocol support is usually to map an externally-defined protocol onto our internal protocols. Mapping from ad hoc protocols to an internally-consistent architectural style turns out to be a major part of this work. We have started to work on "taming" HTTP for use on interfaces where we have sufficient control over protocol design, and we hope to be able to achieve Web-based and REST-based integration more often than not in the future. Our internal protocols already closely resemble HTTP.

The application of REST-based integration has many of the same motivations and goals as the development of the Semantic Web. The goal is primarily to integrate information from various sources. However, it is not integration with a view to query but with a view to performing system functions. For this reason it is important to constrain the vocabularies in use down to a set that in some way relate to system functions.

I would like to close this section with the observation that there seems to be a spectrum between the needs of the Web at large, and the needs of the enterprise. Probably all of my Tier 1 issues could be easily resolved within a single corporate boundary, and continue to interoperate with other parts of the Web. The solutions may also be applicable to other enterprises. In fact, as we contract to various enterprises I can say this with some certainty. However, it seems difficult to get momentum behind proposals that are not immediately applicable to the real Web. I will mention pub/sub in particular, which is quickly dismissed as being unable to cross firewalls easily. However, this is not a problem for the many enterprises that could benefit from a standard mechanism. Once acceptance of a particular technology is established within the firewall, it would seem that crossing the firewall would be a more straightforward proposition. Knowing that the protocol is proven may encourage vendors and firewall operators to make appropriate provisions when use cases for the technology appear on the Web at large.

Tier 1: A HTTP profile for High Availability Cluster clients is required

My first Tier 1 issue is the use of HTTP to communicate with High Availability (HA) clusters. In the SCADA world, we typically operate with no single point of failure anywhere in a critical system. We typically have redundant operator workstations, each with redundant Network Interface Cards (NICs), and so on and so forth, all the way to a HA cluster. There are two basic ways to design the network in between: either create two separate networks for traffic, or interconnect. One approach yields multiple IP addresses to connect to across the NICs of a particular server, and the other yields a single IP. Likewise, it is possible to perform IP takeover and have either a single IP shared between multiple server hosts or have multiple IPs.

In addition to HA, we typically have a constraint on failover time. Typically, any single point of failure is detected in less than five seconds and a small amount of additional time is allocated for the actual recovery. Demands vary, and while some customers will be happy with a ten or thirty second total failover time, others will demand a "bumpless" transition. The important thing about this constraint is that it is not simply a matter of a new server being able to accept new requests. Clients of the HA cluster also need to make their transition in the specified bounded time.

HTTP allows for a timeout if a request takes too long, typically around forty seconds. If this value was tuned to the detection time, we could see that our server had failed and attempt to reconnect. However, this would reduce the window in which valid responses must be returned. It would be preferable to send periodic keepalive requests down the same TCP/IP connection as the HTTP request was established on. This keepalive would allow server death detection to be handled independently of a fault that causes the HTTP server not to respond quickly or at all. We are experimenting with configuring TCP/IP keepalives on HTTP connections to achieve HA client behaviour.
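
What we are experimenting with looks roughly like the following sketch. The socket option names are Linux-specific, the timing values are arbitrary, and the host and resource are invented for the example; the point is simply that the keepalive probes run underneath the outstanding HTTP request.

import http.client
import socket

conn = http.client.HTTPConnection("ha-cluster.example.internal")
conn.connect()
sock = conn.sock
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 1)   # idle seconds before probing
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 1)  # seconds between probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)    # failed probes before giving up

conn.request("GET", "/points/breaker-12/state")
response = conn.getresponse()   # a dead server surfaces here as a socket error in bounded time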

The first question in such a system is about when the keepalive should be sent, and when it should be disabled. For HTTP the answer is simple. When a request is outstanding on a connection, keepalives should be sent by a HA client. When no requests are outstanding keepalives should be disabled. In general theory, keepalives need to be sent whenever a client expects responses on the TCP/IP connection they established. This general case affects the pub/sub model that I will describe in the next section. If pub/sub updates can be delivered down a HA client's TCP/IP connection, the client must send keepalives for the duration of its subscriptions. It is the server who must send keepalives if the server connects back to the client to deliver notifications. Such a server would only need to do so while notification requests are outstanding, but would need to persist the subscription in a way that left the client with confidence that the subscription would not be lost.

Connection is also an issue in a high availability environment. A HA client must not try to connect to one IP, then move onto the others after a timeout. It should normally connect to all addresses in parallel, then drop all but the first successful connection. This process should also take place when a failover event occurs.
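
A rough sketch of that connection rule, with invented addresses: attempt every advertised address in parallel, keep whichever TCP connection completes first, and close the rest.

import socket
from concurrent.futures import ThreadPoolExecutor, as_completed

ADDRESSES = [("10.0.1.10", 80), ("10.0.2.10", 80)]   # hypothetical server NICs

def connect_first(addresses, timeout=5):
    winner = None
    with ThreadPoolExecutor(max_workers=len(addresses)) as pool:
        futures = [pool.submit(socket.create_connection, a, timeout) for a in addresses]
        for fut in as_completed(futures):
            try:
                sock = fut.result()
            except OSError:
                continue
            if winner is None:
                winner = sock      # first successful connection wins
            else:
                sock.close()       # drop all but the first
    return winner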

Tier 1: A Publish/Subscribe mechanism for HTTP resources is required

One of the constants in the ever-changing SCADA world, is that we perform soft real-time monitoring of real-world state. That means that data can change unexpectedly and that we need to propagate that data immediately when we detect the change. A field unit will typically test an input every few milliseconds, and on change will want to notify the central system. Loose coupling will often demand that a pub/sub model be used rather than a push to a set of urls configured in the device.

I have begun drafting a specification that I think will solve most pub/sub problems, with a preliminary name of SENA. It is loosely based on the GENA protocol, but has undergone significant revision to attempt to meet the security constraints of the open Web while also meeting the constraints of a SCADA environment. I would like to continue working on this protocol or a similar protocol, helping it reach a status where it is possible to propose it for general use within enterprise boundaries.

We are extremely sensitive to overload problems in the SCADA world. This leads us to view summarisation as one of the core features of a subscription protocol. We normally view pub/sub as a way to synchronise state between two services. We view the most recent state as the most valuable. If we have to process a number of older messages before we get to the newest value, latency and operator response time both increase. We are also highly concerned with situations, permanent or temporary, where state changes occur at a rate beyond what the system can adequately deal with. We dismiss with prejudice any proposal that involves infinite or arbitrary buffering at any point in the system. We also expect a subscription model to be able to make effective use of intermediaries, such as web proxies that may participate in the subscription.

Tier 2: One architectural framework with a spectrum of compatible architectures

I believe that the architectural styles of the Web can be applied to the enterprise. However, local conventions need to be permitted. Special methods, content types, and other mechanisms should all be permitted where required. I anticipate that the boundary between special and general will shift over time, and that the enterprise will act as a proving ground for new features of the wider Web. Once such features are established in the wider Web, I would also expect the tide to flow back into enterprises that are doing the same thing in proprietary ways.

If properly nurtured, I see the enterprise as a nursery for ideas that the Web is less and less able to experiment with itself. I suspect that the bodies that govern the Web should also be involved with ideas that are emerging in the enterprise. These bodies can help those involved with smaller-scale design keep an eye on the bigger picture.

Tier 2: Web Services are too low-level

Web Services are not a good solution space for Web architecture because they attack integration problems at too low a level. It is unlikely that two services independently developed against the WS-* stack will interoperate. That is to say, they will only interoperate if their WSDL files match. HTTP is ironically a higher-level protocol than the protocol that is layered on top of it.

That said, we do not rule out interoperating with such systems if the right WSDL and architectural styles are placed on top of the WS-* stack. We anticipate a "HTTP" WSDL eventually being developed for WS-*, and expect to write a protocol converter back to our internal protocols for systems that implement this WSDL. The sheer weight of expectation behind Web Services suggests that it will be simpler for some organisations to head down this path, than down a path based on HTTP directly.

Tier 2: RDF is part of the problem, not the solution

We view RDF as a non-starter in the machine-to-machine communications space, though we see some promise in ad hoc data integration within limited enterprise environments. Large scale integration based on HTTP relies on clear, well-defined, evolvable document types. While RDF allows XML-like document types to be created, it provides something of an either/or dilemma. Either use arbitrary vocabulary as part of your document, or limit your vocabulary to that of a defined document type.

In the former case you can embed rich information into the document, but unless the machine on the other side expects this information as part of the standard information exchange, it will not be understood. It also increases document complexity by blowing out the number of namespaces in use. In practice it makes more sense to define a single cohesive document type with a single vocabulary that includes all of the information you want to express. However, in this case you are worse off than if you were to start with XML.

You cannot relate a single cohesive RDF vocabulary to any other without complex model-to-model transforms. In short, it is easier to extract information from a single-vocabulary XML document than from a single-vocabulary RDF document. RDF does not appear to solve any part of the system integration problem as we see it. However, again, it may assist in the storage and management of ad hoc data in some enterprises in place of traditional RDBMS technology.

We view the future of the semantic web as the development of specific XML vocabularies that can be aggregated and subclassed. For example, the atom document type can embed the html document type in an aggregation relationship. This is used for elements such as <title>. The must-ignore semantics of atom also allow sub-classing by adding new elements to atom. The subclassing mechanism can be used to produce new versions of the atom specification that interoperate with old implementations. The mechanism can also be used to produce jargonised forms of atom rather than inventing a whole new vocabulary for a particular problem domain.

We see the development, aggregation, and jargonisation of XML document types as the key mechanisms in the development of the semantic web. The graph-based model used by RDF has not currently demonstrated value in the machine-to-machine data integration space, however higher-level abstractions expressed in XML vocabularies are a proven technology set. We anticipate the formation of communities around particular base document types that work on resolving their jargon conflicts and folding their jargon back into the base document types. We suspect this social mechanism for vocabulary development and evolution will continue to be cancelled out in the RDF space by RDF's reliance on URI namespaces for vocabulary and by its overemphasis of the graph model.

Tier 2: MIME types are more effective than URI Namespaces

One the subject of XML, we have some concerns over the current direction in namespaces. The selection of a parser for a document is typically based on its MIME type. Some XML documents will contain sub-documents, however there is no standard way to specify the MIME type of the sub-document. We view MIME as more fully-featured than arbitrary URIs, particularly due to the explicit subclassing mechanism available.

In MIME we can explicitly indicate that a particular document type is based on xml: application/some-type+xml. Importantly, we can continue this explicit sub-typing: application/type2+some-type+xml. We consider this an important mechanism in the evolution of content types, especially when jargonised documents are passed to standard processors. It is normal to expect that the standard processor would ignore any jargon and extract the information available to it as part of standard vocabulary.
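
A trivial sketch of why the explicit suffixes matter to a processor: the chained form shown above is our proposed convention rather than a registered standard, but any processor that knows one of the named types can recognise it in the full MIME type.

def understands(mime_type, known_type):
    # "application/type2+some-type+xml" -> ["type2", "some-type", "xml"]
    subtype = mime_type.split("/", 1)[1]
    return known_type in subtype.split("+")

print(understands("application/type2+some-type+xml", "some-type"))  # True
print(understands("application/atom+xml", "xml"))                   # True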

While MIME also has its weaknesses, the explicit subclassing mechanism is not available in URI name-spaces at all. To use the atom example again, atom has an application/atom+xml MIME type but an XML namespace of <https://www.w3.org/2005/Atom>. We view the former as more useful than the latter in the development of the Semantic Web and in general machine-to-machine integration problems.

Tier 2: Digital Signatures are likely to be useful

We regard the protection of secret data by IP-level or socket-level security measures as being sufficient at this time. Secret data is known and communicated by few components of the architecture, so it is usually not a scalability issue. We do not think that secret data should have a significant impact on Web architecture; however, we do view the ability to digitally sign non-secret data as a likely enabler for future protocol features.

Conclusion

Web technology and architectural style are proven, useful tools for systems integration, but they are incomplete. A scalable summarising Publish/Subscribe mechanism is an essential addition to the suite of tools, as is a client profile for operating in High Availability environments. These tools must be defined and standardised in order to gain wide enough participation to be useful to the enterprise.

We have concerns about some current trends in Web Architecture. These relate to namespaces in XML, Web Services, and RDF. All of these trends appear to work against the goal of building integrated architectures from multi-vendor components. Our desired outcomes would also appear to be those of the Semantic Web, so we have some hope that these trends will begin to reverse in the future.

Sun, 2007-Feb-18

REST in short form

I have been working on a restwiki article called REST in Plain English, inspired by a conversation on rest-discuss some time ago. It is still a work in progress, but you might get some mileage out of it. An executive summary is this:

SOA is an architecture that attempts to enforce as few constraints on developers as possible. While that is all well and good in small, well-controlled environments, it doesn't scale up. Unconstrained architecture is another way of saying "none of the pieces can talk to each other without prior planning". REST constrains the architecture down to a set of uniform interactions using uniform document types. Whenever two components of the architecture support the same interaction pattern (GET, PUT, POST, DELETE) and the same document type (html, atom, plain text) they can be configured to communicate without prior planning and without writing new code.
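
As a rough sketch of what that uniform interface looks like from a client, the following uses only Python's standard library. The URL is illustrative and a server that honours PUT is assumed; nothing in the code is specific to the resource, only the shared methods and document type.

    # GET and PUT against a hypothetical resource using the uniform interface.
    import urllib.request

    url = "https://example.com/notes/1"

    # GET a representation of the resource's current state.
    with urllib.request.urlopen(url) as resp:
        print(resp.headers.get("Content-Type"), resp.read()[:80])

    # PUT a new representation in a commonly understood document type.
    req = urllib.request.Request(
        url,
        data=b"<p>Updated note</p>",
        method="PUT",
        headers={"Content-Type": "text/html"},
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.status)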

REST increases the likelihood that arbitrary components of the architecture can talk to each other, but also addresses issues of how the architecture can evolve over decades or more of changing demands and how parts of the architecture can scale to huge sizes. It allows for horizontal scalability by limiting the amount of state different cluster members should share. It allows for vertical scalability by layering caches and other intermediaries between clients and servers. It even scales socially, allowing a huge number of both client- and server-side implementations of its protocols to work together.

At the same time, SOA is trying to solve non-web problems. It is trying to solve the problems of a single business, or of a pair of businesses communicating. It is trying to deal with special problems and special use cases. I think that we are on the verge of seeing the architecture of the web combined with the WS-* understanding of enterprise problems. I think we will see a unified architecture that easily scales between these extremes.

Are the IETF and the w3c still the right forums to solve our special problems, or do we need industry and other special interest groups to figure out what best practice is? Once the practice is established these groups can come back and see if their solutions can be applied to the broader Web. I think the spectrum between pure constrained REST and unconstrained enterprise computing needs some shaking up at both ends. I'm happy to see others excited about the possibilities ahead, too.

Benjamin

Tue, 2007-Feb-06

Software Factories - Raising the Level of Abstraction

Today's take-homes:

I have been reading a book during recent business trips to Melbourne called Software Factories. It is written by Jack Greenfield, Keith Short, Steve Cook, and Stuart Kent. It can be found as ISBN 0-471-20284-3. I borrowed my copy from WRSA V&V heavy, Brenton Atchison.

The main premise of the book is that we need to be developing better, more reliable, and more industrial software through reuse. It notes the failure to date of Object-Oriented approaches to reuse, and attempts to formulate a path out of the wilderness based on domain-specific languages and modelling techniques. It uses a quote from Michael Jackson:

Because we don't talk about problems, we don't analyze or classify them, and we slip into the childish belief that there can be universal development methods, suitable for solving all development problems.

In its chapter on "Dealing with Complexity" this book nails a design principle I had so far never quite expressed clearly. It talks about refinement and abstraction, implementation and requirements as part of a continuum. If you start at the top of a development with a set of requirements and end up with an implementation, the difference between these two specifications can be called an abstraction gap. If the requirements were complete and consistent, why can't they be executed? Simply because they are not code?

Consider that what we think of as code today is not what executes on our physical machines. Software Factories suggests that we should think of our code as a specification for a compiler. The compiler automatically constructs machine code from that specification, making a number of design decisions about how to optimise for space and time along the way. It also transforms our input to improve its efficiency. In other words, it is an automated way of crossing the abstraction gap between our code and the machine's code.

This sets out a general principle for good design, whether the design be encapsulated in a Domain-Specific Language or a General Purpose Language: The purpose of design is to provide language constructs, classes, and other features that lift the level of abstraction from the basic language and library you start with to specific concepts in the requirements domain. The closer you can get to the concepts in the requirements domain, the better.

I happened to overhear a conversation between two co-workers early on in the SystematICS software development. One was trying to explain the difference between a design I had proposed and the implementation the other was trying to write. They said that when I talked about a particular concept being in the code I meant that it was a literal class, not a fuzzy concept held across several classes. It is important in any kind of design to ensure that the constructs you define map directly to concepts in the requirements specification, or enable those direct mappings to be made in other constructs.

Benjamin

Sat, 2007-Feb-03

Copyright and Orphan Works

Lawrence Lessig writes an interesting article on what he thinks should happen to US copyright law to deal with orphan works.

The requirement it imposes after the 14/5 year delay is registration... like a DNS for copyright... Any work subject to the OWMR and failing to register within the proper period shall:

  • [ALTERNATIVE 1]: lose copyright protection
  • [ALTERNATIVE 2]: have its copyright remedies curtailed.

The effect of the proposal is that any copyright holder who fails to register their new work gets fourteen years of copyright protection before their work effectively falls into the public domain. If they are still making a buck from the content, or care for some other reason, registration grants them either the full copyright period or copyright protection until they fall off the registry.

I know that Lessig is something of a free culture extremist, but this proposal is interesting in how it relates to property-based views of copyright in modern culture.

I think the general feeling of people today is that if someone created a work they have the right to control that work. That control could be limited to what is required to make a dollar, but stronger control is now often accepted through use of content licensing. DRM combined with the DMCA is an extreme end of this, where a copyright owner can curtail even fair use rights just by stating in a software contract that they want the content to be used in a particular way.

Whether you take a free culture or non-free culture viewpoint, I think Lessig's approach makes sense. Orphaned content reverts to the copyright periods that were envisaged when copyright law was first written. Content that still matters to the author for financial or non-financial reasons gets a copyright period consistent with the kind of investment/return ratio that Disney expect from their creations.

I'm not sure exactly how this proposal would deal with DRM per se. Orphaned DRM content that isn't registered will still not be available for use unless DRM-cracking technology is available and permitted by the applicable law. In an era when big content producers are increasingly tightening controls over all produced content using DRM technologies, this isn't really an issue that can be ignored.

Benjamin

Sat, 2007-Jan-27

RDF and the Semantic Web - Are we there, yet?

RDF is supposed to be the basis of the Semantic Web, but what is the semantic web and does RDF help realise the semweb vision? I will frame this discussion in terms of the capabilities of XML and RDF, as well as involving the REST architectural style.

The Semantic Web

Tim Berners-Lee writes:

The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help... [T]he Semantic Web approach... develops languages for expressing information in a machine processable form.

The goal of the semantic web can therefore be phrased as applying REST practice to machines. On the face of it the semantic web seems like a tautology. Machines already exchange semantics using the REST architectural style. They exchange HTML documents that contain machine-readable paragraph markers, headings, and the like. They exchange Atom documents that contain update times, entry titles, and author fields. They exchange vcalendar documents that convey time and date information suitable for setting up meetings between individuals. They even exchange vcard documents that allow address book entries to be transferred from one machine to another.

So the question is not whether or not we are equipped to transfer machine-readable semantics, but why the semantics are so low level and whether or not RDF can help us store and exchange information compared to today's leading encoding approach: XML.

The Fragmented Web

I would start out by arguing that machine-to-machine communication is hard. The reason it is hard is not because of the machines, but because of the people who write the software for those machines. Ultimately, every successful information transfer involves agreement between the content producer and content consumer. This agreement covers encoding format, but more than that. It covers vocabulary. The producer and consumer have either directly agreed on or dictated the meaning of their shared document, or have implemented a shared standard agreed by others. Each agreement exists within some sort of sub-culture that is participating in the larger architecture of the web.

Even if I use some sort of transformation layer that massages your data into a form I am more comfortable working with, I still must understand your data. I must agree with you as to its meaning in order to accept your input for processing. Transformations are more costly than bare agreement because an agreement is still required to feed into the transformation process.

REST views the web architecture in terms of universal document types that are transferred around using universally-understood methods and a universally-understood identifier scheme. The document type needs to indicate any particular encoding that is used, but also the vocabulary that is in use. In other words, REST assumes that a limited number of vocabularies plus their encoding into documents will exist in any architecture. Certainly far fewer vocabularies than there are participants in the architecture.

I'll continue with the theme from my last article, that in practice we don't have a single universal architecture. What we have is a rough universal main architecture that is broken down along human sub-culture boundaries into sub-architectures. These sub-architectures will each have their own local concepts, conventions, and jargon. In this environment we can gauge the effectiveness of an encoding or modelling approach for data by how well it bridges divides between main and sub-architectures. Do whole new languages have to be introduced to cope with local concepts, or can a few words of jargon mixed into a broader vocabulary solve the problem?

The eXtensible Markup Language

First, let's look at XML. XML is a great way to encode information into a document of a defined type. It has proven useful for an enormous number of ad hoc document types or document types that have non-universal scope. It is also making a move into the universal world with document types such as atom and xhtml in the mix.

The dual reasons for the success of XML are that it is easy to encode most information into it, and it is easy to work with the information once encoded. The transformation tools such as xslt or pure dom manipulation are good. It is easy to encode information from arbitrary program data structures or database tables, and easy to decode into the same. It imposes low overheads for correctness, demonstrates good properties for evolution, and is basically understood by everyone who is likely to care.

XML has the ability to evolve when its consumers ignore parts of the document they don't understand. This allows producers and consumers of new versions of the document type to interoperate with producers and consumers of the old document type. More generally, XML is good at subclassing document types. A document with extra elements or attributes can be processed as if it did not have those extensions. This corresponds to the ability in an object-oriented language to operate through a base-class or interface-class instead of the specific named class.

Subclassing is not the only way that XML can accommodate changes. An XML document can be made to include other XML documents in a form of aggregation. For example, we have the atom specification referring to xhtml for its definition of title and content elements. This is similar to an object-oriented language allowing public member variables to be included in an object.
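
A small sketch of that aggregation relationship: an atom entry carrying xhtml content, with the embedded div available to hand off intact to anything that understands xhtml. The entry itself is invented for illustration.

    # Aggregation: extract the embedded xhtml div from an atom entry.
    import xml.etree.ElementTree as ET

    ATOM = "http://www.w3.org/2005/Atom"
    XHTML = "http://www.w3.org/1999/xhtml"

    entry_xml = f"""
    <entry xmlns="{ATOM}">
      <title>Aggregation example</title>
      <content type="xhtml">
        <div xmlns="{XHTML}"><p>Embedded <em>XHTML</em> content.</p></div>
      </content>
    </entry>
    """

    entry = ET.fromstring(entry_xml)
    div = entry.find(f"{{{ATOM}}}content/{{{XHTML}}}div")
    print(ET.tostring(div, encoding="unicode"))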

The Resource Description Framework

As XML can do subclassing and aggregation it makes sense to view it as a practical way to encode complex data in ways that will be durable. However, RDF challenges this view from a database-oriented viewpoint. It says that we should be able to arbitrarily combine information, and extract it from a given document using an SQL-like query mechanism. We should be able to combine information from different documents and vocabularies for use in these queries. This creates hybrid documents that could conceivably be used to combine information from different sub-architectures. By providing a common conceptual model for all information, RDF hopes that the vocabularies will sort themselves out within its global context.

Personally, I wonder about all that. Whenever you mix vocabularies you incur a cost in terms of additional namespaces. It's like having a conversation where instead of saying "I'm going to the shops, then out to a caffe", you say: "I'm old-english:gAn old-english:tO old-english:thE old-english:sceoppa, old-english:thonne old-english:ut old-english:tO a italian:caffe". Just where did that term you are using come from again? Is caffe Italian or French? Imagine if today's html carried namespaces like "microsoft:" and "netscape:" throughout. Namespaces to identify concepts do not handle cultural shifts very well. In the end we just want to have a conversation about going to the shops. We want to do it in today's language using today's tools. We don't want a history lesson. Supporting these different namespaces may even help us avoid coming to proper consensus between parties, fragmenting the vocabulary space unnecessarily.

The main thing RDF does practically today is allow data from different sources to be placed in a database that is agnostic as to the meaning of its data. Queries that have knowledge of specific vocabularies can be executed to extract information from this aggregated data set. So far this class of application has not proven to be a significant use case on the web, but has made some inroads into traditional database territory where a more ad hoc approach is desired.
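
As a rough illustration of that pattern, the sketch below assumes the third-party rdflib library: triples using two unrelated vocabularies are loaded into one store without interpretation, and only the query needs to know what the vocabularies mean. The ex: namespace is made up for illustration.

    # An agnostic triple store queried with vocabulary-aware SPARQL (rdflib assumed).
    from rdflib import Graph

    data = """
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix ex:   <http://example.org/pids#> .

    <http://example.org/people/anna> foaf:name "Anna" .
    <http://example.org/people/anna> ex:platform "3" .
    """

    g = Graph()
    g.parse(data=data, format="turtle")   # store the triples without interpreting them

    q = """
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX ex:   <http://example.org/pids#>
    SELECT ?name ?platform WHERE {
      ?person foaf:name ?name .
      ?person ex:platform ?platform .
    }
    """
    for name, platform in g.query(q):
        print(name, platform)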

Conclusion

So it seems to depend on your view of information transfer as to whether XML or RDF currently makes more sense. If you see one machine sending another machine a document for immediate processing, you will likely prefer XML. It is easy to encode information into XML and extract it back out of the document. If you see the exchange as involving a database that can be later queried, RDF would seem to be the front-runner. RDF makes this database possible at the cost of making the pure information exchange more complex.

In terms of how the two approaches support an architecture built up of sub-architectures, well... I'm not sure. XML would seem to offer all of the flexibility necessary. I can subclass the iCalendar-as-xml type and add information for scheduling passenger information displays on a train platform. I can include xhtml content for display. It would seem that I can introduce my local jargon at a fairly low cost, although it may be advisable to use a mime type that clearly separates the PIDS use from other subclasses. That mime type would ideally include the name of the type it derives from so that it can be recognised as that type as well as the subclass type: application/pids+calendar+xml.

RDF also allows me to perform subclassing and aggregation, and even to include XML data as the object of a triple. In RDF I would be required to come up with a new namespace for my extensions, something that is not particularly appealing. However, the extra functionality is there if you are willing to pay for the extra complexity.

Benjamin

Sat, 2007-Jan-20

Breaking Down Barriers to Communication

When the cut-and-paste paradigm was introduced to the desktop, it was revolutionary. Applications that had no defined means of exchanging data suddenly could. A user cuts or copies data from one application, and pastes it into another. Instead of focusing on new baseclasses or IDL files in order to make communication work, the paradigm broke the communication problem into three separate domains: Identification, Methods, and Document Types. A single identification mechanism for a cut or paste point, combined with a common set of methods and document types, allows ad hoc communication to occur. So why isn't all application collaboration as easy and ad hoc as cut-and-paste?

The Importance of Architectural Style

The constraints of REST and of the cut-and-paste paradigm contain significant overlap. REST also breaks communication down into a single identification scheme for the architecture, a common set of methods for the architecture, and a common set of document types that can be exchanged as part of method invocation. The division is designed to allow an architecture to evolve. It is nigh impossible to change the identification scheme of an architecture, though the addition of new identifiers is an everyday occurrence. The set of methods rarely changes because of the impact this change would have on all components in the architecture. The most commonly-evolving component is the document type, because new kinds of information and new ways of transforming this information to data are created all of the time.

The web is to a significant extent an example of the application of REST principles. It is for this reason that I can perform ad hoc integration between my web browser and a host of applications across thousands of Internet servers. It is comparable to the ease of cut-and-paste, and the antithesis of systems that focus on the creation of new baseclasses to exchange new information. Each new baseclass is in reality a new protocol. Two machines that share common concepts cannot communicate at all if their baseclasses don't match exactly.

The Importance of Agreement

Lee Feigenbaum writes about the weakness of REST:

This means that my client-s[i]de code cannot integrate data from multiple endpoints across the Web unless those endpoints also agree on the domain model (or unless I write client code to parse and interpret the models returned by every endpoint I'm interested in).

Unfortunately, to do large scale information integration you have to have common agreed ways of representing that information as data. This includes mapping to a particular kind of encoding, but more than that. It requires common vocabulary with common understanding of the semantics associated with the vocabulary. In short, every machine-to-machine information exchange relies on humans agreeing on the meaning of the data they exchange. Machines cannot negotiate or understand data. They just know what to do with it. A human told them that, and made the decision as to what to do with the data based on human-level intelligence and agreement.

Every time two programs exchange information there is a human chain from the authors of those programs to each other. Perhaps they agreed on the protocol directly. Perhaps a standards committee agreed, and both human parties in the communication read and followed those standards. Either way, humans have to understand and agree on the meaning of data in order for information to be successfully encoded and extracted.

Constraining the Number of Agreements

In a purely RESTful architecture we constrain the number of document types. This directly implies a constraint on the number of agreements in the architecture to a number that grows more slowly than the number of components participating in the architecture. If we look at the temporal scale we constrain the number of agreements to grow less rapidly than the progress of time. If we can't achieve this we won't be able to understand the documents of the previous generation of humanity, a potential disaster. But is constraining the number of agreements practical?

On the face of it, I suspect not. Everywhere there is a subculture of people operating within an architecture there will be local conventions, extensions, and vocabulary. This is often necessary because concepts that are understood within the context of a subculture may not translate to other subcultures. They may be local rather than universal concepts. This suggests that what we will actually have over any significant scale of architecture is a kind of main body which houses universal concepts above an increasingly fragmented set of sub-architectures. Within each sub-architecture we may be able to ensure that REST principles hold.

Solving the Fragmentation Problem

This leaves us, I think, with two outs: One is to accept the human fragmentation intrinsic to a large architecture, and look for ways to make the sub-architectures work with wider architectures. The other is to forget direct machine-to-machine communication and involve humans in the loop.

We do both already on the web in a number of ways. In HTML we limit the number of universal concepts such as "paragraph" and "heading 3", but allow domain-specific information to be encoded into class attributes, and allow even more specific semantics to be conveyed in plain text. The class attributes need to work with the local conventions of a web site, but could convey semantics to particular subcultures as microformat-like specifications. The human-readable text conveys no information to a machine, but by adding human-level intelligence a person who is connected to the subculture the text came from can provide an ad hoc interpretation of the data into information.
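
As a rough sketch of that idea, the following uses Python's standard HTML parser to pull data out of class attributes. The markup is invented, with hCalendar-style class names standing in for a subculture's local conventions.

    # Extract microformat-like data carried in class attributes.
    from html.parser import HTMLParser

    html_doc = """
    <p class="vevent">
      Departs <span class="dtstart">2007-01-20T09:30</span>
      from <span class="location">Platform 3</span>.
    </p>
    """

    class ClassExtractor(HTMLParser):
        def __init__(self, wanted):
            super().__init__()
            self.wanted = wanted
            self.current = None
            self.found = {}

        def handle_starttag(self, tag, attrs):
            classes = dict(attrs).get("class", "").split()
            hits = [c for c in classes if c in self.wanted]
            if hits:
                self.current = hits[0]

        def handle_data(self, data):
            if self.current and data.strip():
                self.found[self.current] = data.strip()
                self.current = None

    parser = ClassExtractor({"dtstart", "location"})
    parser.feed(html_doc)
    print(parser.found)   # {'dtstart': '2007-01-20T09:30', 'location': 'Platform 3'}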

We see this on the data submission side of things too. We see protocols such as atompub conveying semantics via agreement, but we also have HTML forms which can perform ad hoc information submission when a human is in the loop. The human uses their cultural ties to interpret the source document and fill it out for submission back to the server.

Conclusion

I don't think that either REST or the Semantic Web can ignore the two ends of an architectural picture fragmented by human subcultures. Without universal concepts that have standard encodings and vocabulary to convey them we can't perform broad-scale information integration across the architecture. Without the freedom to perform ad hoc agreement the architecture opens itself up to competition. Without a bridge between these two extremes, the vocabularies that should simply be a few local jargon expressions thrown into a widely-understood conversation will become their own languages that only the locals understand. The RDF propensity to talk about mapping between vocabularies is itself a barrier to communication. It will always be cheaper to have a conversation when a translator is not required between the parties for concepts both parties understand.

Benjamin

Sat, 2007-Jan-06

Death and Libraries

Have you ever wondered what will happen to your blog when you die? Perhaps it is the influence of parenthood on my life, but I have been thinking about the topic of late. If a part of your legacy is in your blog, what will your legacy be? Perhaps libraries have a role in guaranteeing the future of today's web.

I suspect that most bloggers haven't really thought about the problem. How long will your blog or web site last? Only as long as the money does. The monthly internet bill needs paying, or the annual web hosting fee if your hosting occurs externally. If you have that covered, your domain registration will be up for renewal in less than two years. Perhaps you don't have a vanity domain. Maybe you are registered with blogger. This kind of blog is likely to last a lot longer, but for how long? Will your great grandchildren be able to read your blog? Will their great grandchildren? Will your great great great grandchildren be able to leave new comments on your old material?

Blogs are collections of resources. Resources demarcate state, and return representations of that state. These representations are documents in particular formats, such as HTML4. So in addition to the question of whether the resources themselves will be durable we must consider how durable the document formats used will be. We may even have to look at whether HTTP/1.1 and TCP/IPv4 will be widely understood a hundred years from now.

The traditional way to deal with these sorts of longevity problems is to produce hard copies of the data. You could print off a run of 1000 bound copies of your blog to be distributed amongst interested parties. These parties might be your descendants, historians who think you have some special merit in the annals of mankind, and perhaps most universally: librarians who wish to maintain a collection of past and present thought.

We could attempt the same thing with the web; however, the web maps poorly to the printed word given the difficulty of providing appropriate hyperlinks. Printing also rests on the notion that the person interested in a particular work is geographically close to the place where it is housed, and can find it through appropriate means. Let us consider another possibility in the future networked world. Consider the possibility that those with an interest in the works host the works from their own servers.

Consider the cost of running a small library today. If all data housed in the library eventually became digital data, that data could be distributed anywhere in the world for a fraction of the cost of running a library today. We already see sites like the wayback machine attempting to record the web of yesteryear, or google cache trying to ensure that today's content is reliably available. Perhaps the next logical step is for organisations to start hosting the resources of the original site directly. After all, there is often as much value in the links between resources as there is in the resource content itself. Maintaining the original urls is important. Perhaps web sites could be handed over to these kinds of institutions to avoid falling off the net. Perhaps these institutions could work to ensure the long survival of the resources.

The technical challenges of long-term data hosting are non-trivial. A typical web application consists of application-specific state, some site-specific code such as a PHP application, a web server application, an operating system, physical hardware, and a connection to an ISP. Just to start hosting the site would likely require a normalisation of software and hardware. Perhaps an application that simply stores the representations of each resource and returns them to its clients could replace most of the software stack. The connection to the ISP is likely to be different, and will have to change over time. The application protocols will change over the years as IPv6 replaces IPv4 and WAKA replaces HTTP (well, maybe). The data will have to hop from hardware platform to hardware platform to ensure ongoing connectivity, and from software version to software version.
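
A minimal sketch of such an application, using Python's standard http.server: stored representations are replayed against their original URL paths along with the media type that was recorded for them. The archived path and content are invented.

    # Replay stored representations of archived resources at their original paths.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    ARCHIVE = {
        "/blog/2007/01/06": (b"<html><body><p>Death and Libraries</p></body></html>",
                             "text/html"),
    }

    class ArchiveHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            entry = ARCHIVE.get(self.path)
            if entry is None:
                self.send_error(404)
                return
            body, content_type = entry
            self.send_response(200)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), ArchiveHandler).serve_forever()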

If all of this goes to plan, your documents will still be network accessible long after your bones have turned to dust. However, this still assumes the data formats of today can be understood or, at worst, translated into an equivalent form in the future. I suggest that we have already travelled a few decades with HTML, and that we will travel well for another few decades. We can still read the oldest documents on the web. With good standards management it is likely this will still be the case in 100 years. Whether the document paradigm that HTML sits in will still exist in 100 years is another question. We may find that these flat documents have to be mapped into some sort of immersive virtual environment in that time. The librarians will have to keep up to date with these trends to ensure ongoing viability of the content.

I see the role of librarian and of system administrator as becoming more entwined in the future. I see the librarian as a custodian for information that would otherwise be lost. Will today's libraries have the foresight to set the necessary wheels in motion? How much information will be lost before someone steps in and takes over the registration and service of discontinued domains?

Benjamin