Sound advice

Thu, 2007-Aug-30

URI vs Resource

The concept of a resource is central to REST theory, but when talking about REST I rarely mention it. This may look like an oversight or a mistake, but it is deliberate. Let's walk through a sample conversation with a new developer about REST:

Developer: So tell me about this "REST" thing
REST Guy: Oh, it's great. It's so simple. It stands for representational state transfer. It's an architectural style with a bunch of constraints: Client/Server, Stateless between requests, Explicit cache control, Uniform Interface, Layering, Identification of resources, Manipulation via representations, Self-descriptive messages, Hypermedia as the engine of application state, and optional code on demand.
Developer: Wha wha wha?
REST Guy: Oh, the central tenet is resources. They're whatever you want them to be. They're only accessible via their URI and exchanged representations...
Developer: What on earth is a representation? Hang on... I think I know what a URL is... but...
REST Guy: But there's this whole question about whether resources with different URLs can be the "same" resource. Then there is the issue of whether you can have different representations of the same resource at different URLs, say the RSS and Atom versions of a news feed.
Developer: OK. I'm going off to write some WSDL, now.

That conversation, and the jargon around REST, is a high barrier to entry. The jargon may be useful to computer scientists, but it generally doesn't help architects or developers very much. In fact, I think the whole Resource/URL/Representation issue is an incredible waste of time and bandwidth for all concerned. The vague notion of a resource independent of identification and representation frankly doesn't add anything to our understanding of architecture.

Here is the conversation I try to have with developers or junior architects:

Developer: So tell me about this "REST" thing
REST Guy: It's great. You do what you would do in your existing SOA world, except you modify the interface a bit. Each time you define a new WSDL you make your app incompatible with other apps, but if you break up that WSDL interface into smaller objects with a preexisting interface other apps can more easily interact with yours.
Developer: Wha wha wha?
REST Guy: Just like multiple methods can modify the same data, operations on different URLs can retrieve and modify overlapping data. These operations should normally be either GET or PUT, but the important thing is that there is an architecture-wide catalogue of these operations and the data formats they transfer. We call REST "REST" for "Representational State Transfer". It's just jargon for "transferring information as well-known data types using retrieval and update operations". There are a few more constraints, but for the most part they are encapsulated by restricting yourself to GET and PUT.
Developer: What on earth is a representation? Hang on... I think I know what a URL is... but...
REST Guy: So you get awesome interoperability. The methods and data formats are all controlled centrally, so two applications that share the same data model will very likely end up using the same data formats and methods. The result? Applications work together unexpectedly. Not only that, but you get awesome evolvability. You can deploy an app today, and without ever upgrading it you can keep it working with other applications. You can dynamically modify which URLs it interacts with using redirection. You can deal with its data formats becoming superseded through content negotiation. Method evolution is much harder, but methods are more stable and changes to the set of methods usually take backwards-compatibility into account. You get awesome visibility and control over the communications on the wire, awesome performance optimisation through caching, a great common vocabulary for talking about interfaces with other developers and architects... man... it's all upside.
Developer: OK. I'm going off to write some WSDL, now.

While you may not convince your co-workers on the first day, you need to at least meet them where they are today. You have to be able to talk about how REST affects the way they do things now. One thing your co-workers don't care about, and will likely never care about, is the abstract notion of what a resource is. If they understand what a URL is, and they understand the interaction through standard data formats that REST affords, there is really nothing more to understand.

Can the same resource be available at two URLs? No. There is no point distinguishing between a resource and the URL it is provided at. If you have two URLs you have two resources. These resources may be equivalent for some specific purposes, but that is not to say they are the same. If their identifiers are different, they are not the same. Can representations of the same resource be available under multiple URLs? No. They are different URLs, so they are different resources that relate to the same underlying application data.

Benjamin


Sat, 2007-Aug-25

A Services View for REST Architectures

In my last article I wrote about the importance of views in communicating architectural information to specific stakeholders. I have struggled in trying to literally apply the 4+1 model to REST architecture, but I don't see REST as the problem. I think the same issues come up when talking about Service-Oriented Architecture. I would phrase the issue as follows:

I want to draw an object-style diagram that lists the services running in my architecture, and their clients. I want to identify all of the URLs provided by these services (in URL template form). I want to know who is providing them, and who is using them. I want to know what methods are available on each URL (GET, and/or PUT, and/or DELETE). I want to know which content types are supported by each URL.

This architectural view is the distributed software architecture equivalent of a wiring diagram. It allows me to quickly analyse whether a particular service is getting enough information to meet its requirements. It allows me to put off thinking about exactly how the services will be deployed or laid out in my source-code repositories. It lets me concentrate on the bigger picture.

So, this isn't the deployment (physical) view. I am not laying out the services on physical machines. It isn't the development view. I'm not thinking about library dependency structure. So, is it the logical view or the process view? Philippe says that the logical view is like a class diagram, so that might be right. However, the process view is supposed to show which parts of the architecture work in parallel to each other, and how they interact. That also sounds familiar. I am not necessarily thinking about how many levels of redundancy I'll provide when I cluster a particular service. Philippe says I should be showing that in the process view. However, the logical view is supposed to be customer-oriented: a functional breakdown. I'm not sure the services view is always going to meet that goal.

My approach at the moment is to treat the logical view as part of or as an extension of the requirements specification. It groups functions in a way that makes sense to the customer. The services view most closely matches up to the process view, so while I hesitate to actually call it a process view it occupies that spot on the classic 4+1 diagram. Don't get me started as to what should appear in the "+1" scenarios view.

The services view consists of an object per service, client, URL template, and content type. Each URL template has exactly one aggregation relationship, linking to a particular service. Clients and services may have dependency relationships on URL templates, and we would expect each URL template to have at least one dependent object. Each URL template has relevant GET, PUT, and DELETE methods as explicit UML operations with one specific content type parameter for PUT and a specific return content type for each GET. GET and PUT appear as many times as necessary to cover content negotiation supported by the URLs. Normally this means at most one PUT and at most one GET. Other supported content negotiation (e.g. language) could be incorporated in the same way. The Uniform Interface does not appear explicitly in the model, but can be inferred from the total collection of content types and URL methods.

Building this into a UML model allows me to run various validation checks to make sure architectural constraints I care about are enforced. It also allows easy modification as requirements change or problems are discovered. Non-REST services can be incorporated in a similar way, with URLs that have less standard methods. Ad hoc protocols can also be incorporated into the diagram.
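To make the idea of validation checks concrete, here is a minimal Python sketch of the services view as plain data rather than UML. The class names, field names, and the particular checks are my assumptions for illustration only; a real model would carry whatever constraints matter to your architecture.

```python
from dataclasses import dataclass, field


@dataclass
class UrlTemplate:
    template: str                               # e.g. "/orders/{id}"
    owner: str                                  # the single service providing this URL
    gets: list = field(default_factory=list)    # content types returned by GET
    puts: list = field(default_factory=list)    # content types accepted by PUT
    deletable: bool = False                     # whether DELETE is offered


@dataclass
class ServicesView:
    services: set
    clients: set
    content_types: set
    urls: list
    depends: list    # (client or service name, URL template) pairs

    def problems(self):
        """Return a list of human-readable constraint violations."""
        found = []
        used = {template for _, template in self.depends}
        known = {u.template for u in self.urls}
        for u in self.urls:
            if u.owner not in self.services:
                found.append(f"{u.template}: owner {u.owner!r} is not a known service")
            if u.template not in used:
                found.append(f"{u.template}: no client or service depends on it")
            for ctype in u.gets + u.puts:
                if ctype not in self.content_types:
                    found.append(f"{u.template}: unknown content type {ctype!r}")
        for who, template in self.depends:
            if who not in self.services | self.clients:
                found.append(f"{who}: not a known client or service")
            if template not in known:
                found.append(f"{who}: depends on undeclared URL {template}")
        return found
```

An empty list from problems() means the view satisfies these particular checks. The set of content types plus the methods on each URL template is where the uniform interface can be read off.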

I find this view to be a useful tool in my arsenal, and explaining REST to my developers is somewhat of a non-issue. I am the architect, so obviously I need to know what I am doing. After that, all I have to do is win enough arguments and review enough documents to ensure my specification evolves sensibly and is followed.

Benjamin


Sat, 2007-Aug-25

Understanding REST

Charles Savage and Alex Bunardzic are both talking about how REST (or ROA) seems hard to understand. Charles has been talking about how we are not looking at the big picture. Alex talks about the importance and subtlety in distinguishing between resources and URLs.

I think the problem is rooted firmly in the domain of software architecture. Software architecture is something that most software developers fail at, and that we as an industry are only just starting to pick apart in useful ways. The principal finding of IEEE 1471 "Recommended Practice for Architectural Description of Software-Intensive Systems" is that architecture has different stakeholders, and that each stakeholder needs to see a limited subset of the overall set of information to make appropriate decisions. What IEEE 1471 doesn't dictate is who those stakeholders will be, and what they need to know.

Roy's thesis is REST explained to computer scientists. Its principal function is to distinguish the REST style from alternative styles. Its secondary function is to derive properties of REST architecture. Its stakeholders are academics, not architects or software developers. Roy calls it an architectural style. It may be more appropriate to call it an architectural view; one that can be shared between numerous actual architectures.

Roy's view hides information about RESTful architectures not relevant to his stakeholders. It doesn't cover a 4+1-style process view that would describe the exact interactions between services and their clients. It doesn't contain a deployment view that demonstrates how services are distributed across physical machines. It doesn't contain a development view that describes how written software is laid out in directory structures under various source-code repositories. It doesn't document the APIs that developers should write their services against.

In short, there are a lot of gaps to fill in. There are a lot more views to populate before you can hand over a specification document and ask your team to go write some code. There are a lot of ways of writing those views, both conforming to and conflicting with Roy's base view... and a lot of communities and thought leaders to turn in the same direction before you have any sort of wide-spread acceptance and understanding of a common way forward.

The strategy of individuals and groups who have an interest in turning those ships should be to meet them where they are now, to solve real problems, and to fill out those additional architectural views as appropriate for those communities.

Benjamin


Sat, 2007-Aug-04

Reliable POST (reliable remote state creation)

Stefan Tilkov points to a recent work by Joe Gregorio: RESTify Day Trader

He highlights Joe's suggestion to use PUT to reliably create server-side state. In particular, to lodge a purchase order. I have been suggesting this kind of approach for a while, but I have issues with Joe's specifics. I am particularly focused on how automated REST clients and services interact, so I take things a little beyond whatever works in the browser. I have issues with Joe's POST precursor and 303 response.

I'm not sure the POST used by Joe is entirely necessary. PUT is allowed to create the resource at the request URI, so creating it using a POST in an earlier request is not strictly required. The only thing necessary is for the client to discover or construct the URL of the order that it is about to submit. I suggest either a non-cacheable GET that returns a different URL each time, or construction of the URL with a client-supplied GUID. Which is more appropriate will depend on the exact situation.

Preferred approach:

>> GET https://example.com/orderForm
(repeat if necessary)
<< 200 OK, form with https://example.com/order/1000 as submit element
   (non-cacheable, with a different submit url for each request)
>> PUT https://example.com/order/1000 (data)
(repeat if necessary)

Fallback approach:

>> PUT https://example.com/orders?guid=client-supplied-guid (data)
(repeat if necessary)
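
A minimal Python sketch of the client side of the fallback approach. The function names and the `send` transport callable are hypothetical, introduced only for illustration; the point is that the GUID is chosen once and the identical PUT is repeated against the same URL until a response arrives.

```python
import uuid


def order_url(base="https://example.com/orders", guid=None):
    """Build the order URL once; the same URL must be reused for every retry."""
    guid = guid or str(uuid.uuid4())
    return f"{base}?guid={guid}"


def put_until_acknowledged(send, url, body, max_attempts=5):
    """Repeat an identical PUT until the server answers or attempts run out.

    `send(url, body)` is a hypothetical transport callable that returns an
    HTTP status code, or raises ConnectionError/TimeoutError when no
    response arrives.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            status = send(url, body)
        except (ConnectionError, TimeoutError):
            continue  # outcome unknown: repeating is safe because PUT is idempotent
        return status, attempt
    raise RuntimeError(f"no response after {max_attempts} attempts")
```

Generating a new GUID on each retry would defeat the purpose: every attempt would then address a different URL, and so a different order.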

I also wonder about the use of 303. RFC 2616 is clear:

"The new URI is not a substitute reference for the originally requested resource"

It isn't appropriate for a client to take this URL and use it as the URL of its order. The most it should do is retrieve the response to its PUT request using an additional GET request.

There is no need to "move" the resource to a new URL. The URL is its identity. This identity should not include the state of the order, and the identity of this resource should not change. The URL that the client issued the PUT to should be the one it continues to interact with for GET or PUT transactions with the order.

The move also puts stress on PUT's idempotency: if the client issues the request again after the move has occurred, is it going to issue an additional order? Will it get a 410 Gone response? The former is a bug, while the latter is difficult to interpret. Did the server reject the request, or was it successful? 410 is unavoidable if the resource is deleted before the client finishes repeating its requests; however, in my view it should be an exceptional rather than a common case.
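
The server-side behaviour argued for above can be sketched in a few lines of Python. This is an in-memory stand-in under my own assumptions: the class and method names are hypothetical, and the 201/200/204/404 status choices are one reasonable mapping rather than anything prescribed by Joe's example. The order keeps the URL it was PUT to, and a repeated PUT overwrites rather than creating a second order.

```python
import itertools


class OrderStore:
    """In-memory stand-in for the order service (illustrative only)."""

    def __init__(self):
        self._next_id = itertools.count(1000)
        self._orders = {}       # URL path -> order data
        self._deleted = set()   # paths that were explicitly deleted

    def order_form(self):
        """GET /orderForm: hand out a fresh submit URL on every call."""
        return f"/order/{next(self._next_id)}"

    def put(self, path, data):
        """PUT: create or overwrite the order at `path`; never a second order."""
        if path in self._deleted:
            return 410          # the exceptional case: deleted before retries ended
        created = path not in self._orders
        self._orders[path] = data
        return 201 if created else 200

    def delete(self, path):
        """DELETE: remove the order and remember that the path is gone."""
        if path not in self._orders:
            return 404
        del self._orders[path]
        self._deleted.add(path)
        return 204

    def count(self):
        """Number of orders currently held."""
        return len(self._orders)
```

Because the identity of the order never changes, a client that repeats its PUT sees either success again or, only after an explicit delete, 410.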

Benjamin.
