So we have our REST triangle of nouns, verbs, and content types. REST is
tipping us towards placing site-to-site and object-to-object variation in our
nouns.
Verbs and content types should be "standard", which means that they shouldn't
vary needlessly but that we can support some reasonable levels of variation.
Verbs
If it were only the client and server involved in any exchange, REST verbs
could be whittled down to a single "DoIt" operation. Differences between GET,
PUT, POST, DELETE, COPY, LOCK or any of the verbs which HTTP in its various
forms supports today could be managed in the noun-space instead of the verb
space. After all, it's just as easy to create a
https://example.com/object/resource/GET resource
as it is to create
https://example.com/object/resource
with a GET verb on it. The server implementation is not going to be overly
complicated by either approach. Likewise, it should be just as easy to supply
two hyperlinks to the client as it is to provide a single hyperlink with two
verbs. Current HTML "A" tags are unable to specify which verb to use in a
transaction with the href resource.
That has led tool providers to misuse the GET verb to perform user actions.
Instead of creating a whole HTML form, they supply a simple hyperlink. This
of course breaks the web, but why is not as straightforward as you may think.
Verbs vs Delegates
Delegates in C# and functions in Python give away how useful a single "doIt"
verb approach is. In a typical O-O observer pattern you need the observer to
inherit from or otherwise match the specification available for a base class.
When the subject of the pattern changes it looks through its list of observer
objects and calls the same function on each one. It quickly becomes clear when
we use this pattern that the one function may have to deal with several
different scenarios. One observer may be watching several subjects, and it may
be important to disambiguate between them. It may be important to name the
function in a more observer-centric rather than subject-centric way. Rather
than just "changed", the observer might want to call the method
"openPopupWindow". Java tries to support this flexibility by making it easy to
create inner classes which themselves inherit from Observer and call back your
"real" object with the most appropriate function. C# and Python don't bother
with any of the base class nonsense (and the number of keystrokes required to
implement them) and supply delegates and callable objects instead. Although
Java's way allows for multiple verbs to be associated with each inner object,
delegates are more "fun" to work with.
Delegates are effectively hyperlinks provided by the observer to the subject
that should be followed on change, issuing a "doIt" call on the observer object.
Because we're now hyperlinking rather than trying to conceptualise a type
hierarchy, things turn out to be both simpler and more flexible.
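To make the comparison concrete, here is a minimal Python sketch of a subject that accepts plain callables instead of Observer subclasses. The names Subject, PopupController, open_popup_window and refresh_totals are invented for illustration only; they don't come from any library.

```python
class Subject:
    """Holds a list of callables and invokes each one on change."""

    def __init__(self):
        self._observers = []

    def subscribe(self, callback):
        # The callback plays the role of the hyperlink: any callable will do,
        # and no base class is required.
        self._observers.append(callback)

    def change(self, value):
        for callback in self._observers:
            callback(value)  # the single "doIt" call


class PopupController:
    """One observer watching two subjects, with observer-centric names."""

    def __init__(self):
        self.log = []

    def open_popup_window(self, value):
        self.log.append(("popup", value))

    def refresh_totals(self, value):
        self.log.append(("totals", value))


price = Subject()
stock = Subject()
controller = PopupController()

# Each subject is handed a different bound method of the same observer.
price.subscribe(controller.open_popup_window)
stock.subscribe(controller.refresh_totals)

price.change(42)
stock.change(7)
print(controller.log)
```

The observer tells the two subjects apart by giving each a different bound method, so no "which subject called me?" logic is needed inside a shared "changed" function.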
The purpose of verbs
So if not for the server's benefit, and not for the client's benefit, why
do we have all of these verbs? The answer for the web of today is caching, but
the reasoning can be applied to any intermediary. When a user does a GET, the
cache saves its result away. Other verbs either mark that cache entry dirty or
may update the entry in some ways. The cache is a third party to the
conversation and should not be required to understand it in too much detail,
so we expose the facets of the conversation that are important to the cache
as verbs. This principle could apply any time we have a third party involved
whose role is to manage the communication efficiently rather than to become
involved in it directly.
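As a rough sketch of that idea, the toy cache below looks only at the verb and the URI. It is a deliberate simplification: real HTTP caches also honour headers, expiry and much else, and the Cache class and origin function here are hypothetical names made up for the example.

```python
class Cache:
    """A naive intermediary that only understands verbs and URIs."""

    def __init__(self, origin):
        self._origin = origin  # callable(verb, uri, body) -> response
        self._entries = {}

    def request(self, verb, uri, body=None):
        if verb == "GET":
            # Save the result of the first GET and reuse it afterwards.
            if uri not in self._entries:
                self._entries[uri] = self._origin(verb, uri, body)
            return self._entries[uri]
        # Any other verb may have changed the resource: mark the entry dirty
        # by dropping it, then pass the request through untouched.
        self._entries.pop(uri, None)
        return self._origin(verb, uri, body)


# A stand-in origin server (an assumption for this sketch).
store = {"/account/1": "balance=10"}
calls = []


def origin(verb, uri, body):
    calls.append((verb, uri))
    if verb == "PUT":
        store[uri] = body
    return store.get(uri)


cache = Cache(origin)
first = cache.request("GET", "/account/1")
second = cache.request("GET", "/account/1")  # served from the cache
cache.request("PUT", "/account/1", "balance=25")
third = cache.request("GET", "/account/1")  # refetched after invalidation
```

The cache never inspects the body of any message. The verb alone tells it whether to store, reuse or discard an entry.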
Server Naivety and Client Omniscience
In a client/server relationship the server can be as naive as it likes.
So long as it maintains the basic service constraints it is designed for, it
doesn't care whether operations succeed or fail. It isn't responsible for
making the system work. Clients are the ones who do that. Clients follow
hyperlinks to their servers, and they do so for a reason. Whenever a client
makes a request it already knows the effect its operation should have and
what it plans to do with the returned content. To the extent
necessary to do its job, the client already knows what kind of document will
be returned to it.
A web browser doesn't know which content type it will receive. It may be
HTML, or some form of XML, or a JPEG image. It could be anything within reason,
and within reason is a precisely definable term in this context. The web
browser expects a document that can be presented to its user in a
human-readable form, and one that corresponds to one of the standard content
types it supports for this purpose. If we take this view of how data is handled
and transfer it into a financial setting where only machines are involved, it
might read like this: "An account reconciler doesn't know which content type it
will receive. It may be ebXML, or some form of OFX, or an XBRL report.
It could be anything within reason, and within reason is a precisely definable
term in this context. The reconciler expects a document that can be used to
compare its own records to those of a supplier or customer and highlight any
discrepancies. The document's content type must correspond to one of the
standard content types it supports for this purpose."
REST allows for variations in content type, so long as the client
understands how to extract the data out of the returned document and transform
it into its own internal representation. Each form must carry sufficient
information to construct this representation, or it is not useful for the
task and the client must report an error. Different clients may have different
internal representations, and the content types must reflect those differences.
HTTP supports negotiation of content types to allow for clients with differing
supported sets, but when new content types are required to handle different
internal data models it is typically time to introduce a new noun as well.
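A small sketch of what this could look like inside a client, assuming JSON and CSV as stand-ins for the real financial formats (parsing ebXML, OFX or XBRL is well beyond a short example). The field names "invoice" and "amount" and all function names are hypothetical.

```python
import csv
import io
import json


def parse_json(body):
    return {row["invoice"]: row["amount"] for row in json.loads(body)}


def parse_csv(body):
    reader = csv.DictReader(io.StringIO(body))
    return {row["invoice"]: int(row["amount"]) for row in reader}


# The standard content types this client supports for its one purpose.
PARSERS = {"application/json": parse_json, "text/csv": parse_csv}


def accept_header():
    # Offered to the server during content negotiation.
    return ", ".join(PARSERS)


def to_internal(content_type, body):
    """Turn any supported document into the client's internal representation."""
    try:
        parser = PARSERS[content_type]
    except KeyError:
        # Not useful for the task: the client must report an error.
        raise ValueError(f"unsupported content type: {content_type}") from None
    return parser(body)


json_doc = '[{"invoice": "A1", "amount": 100}]'
csv_doc = "invoice,amount\nA1,100\n"

from_json = to_internal("application/json", json_doc)
from_csv = to_internal("text/csv", csv_doc)
```

Both documents reduce to the same internal mapping of invoice to amount, which is the sense in which the client already knows what kind of document it will get back.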
Hyperlinking
So how does the client become the all-knowing entity it must be in every
transaction it participates in? Firstly, it must be configured with a set of
starting points or allow them to be entered at runtime. In some applications
this may be completely sufficient, and the configuration of the client could
refer to all URIs it will ever have to deal with. If that is not the case, it
must use its configured and entered URIs to learn more about the world.
The HTML case is simple because its use cases are simple. It has two basic
forms of hyperlink: the "A" and the "IMG" tags. When it comes across an "A"
it knows that whenever that hyperlink is activated it should look for a
whole document to present to its user in a human-readable form. It should
replace any current document on the screen. When it comes across "IMG" it knows
to go looking for something human-readable (probably an actual image) and
embed it into the content of the document it is currently rendering. It doesn't
have to be any more intelligent than that, because that is all the web browser
needs to know to get its job done.
More sophisticated processes require more sophisticated hyperlinks. If
they're not configured into the program, it must learn about them. You could
look at this from one of two perspectives. Either you are extending the
configuration of your client by telling it where to look to find further
information, or the configuration itself is just another link document.
Hyperlinks may be picked up indirectly as well, as the result of POST
operations which return "303 See Other". The omniscient client must already
know what to do when it sees this response, just as a web browser
knows to chase down that Location: URI and present its content to the user.
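The 303 case might be sketched as follows. To keep the example self-contained, the network is replaced by a hypothetical transport callable that returns a (status, headers, content) tuple. That shape, the submit function and the /orders URIs are all assumptions made for illustration.

```python
def submit(transport, uri, body):
    """POST body to uri and follow a "303 See Other" with a GET."""
    status, headers, content = transport("POST", uri, body)
    if status == 303:
        # The client already knows what a 303 means: fetch the Location URI.
        status, headers, content = transport("GET", headers["Location"], None)
    return status, content


def fake_transport(verb, uri, body):
    # Stand-in server: creating an order redirects to the new order resource.
    if verb == "POST" and uri == "/orders":
        return 303, {"Location": "/orders/1"}, ""
    if verb == "GET" and uri == "/orders/1":
        return 200, {}, "order 1: widgets"
    return 404, {}, ""


result = submit(fake_transport, "/orders", "widgets")
print(result)
```

The client was never configured with /orders/1. It picked that hyperlink up from the response to its POST.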
There is a danger in all things of introducing needless complexity. We can
create new content types until we're blue in the face, but when it comes down
to it we need to understand the clients' requirements and internal data models.
We must convey as much information as our clients require, and have some faith
that they know enough to handle their end of the request processing.
It's important not to over-explain things, or include a lot of redundant
information. The same goes for types of hyperlinks. It may be possible to
reduce the complexity of documents that describe relationships between
resources by assuming that clients already know what kind of relationship
they're looking for and what they can infer from a relationship's existence.
I think we'll continue to find, as we have found in recent times, that untyped
lists are most of what you want from a linking document, and that using RDF's
ability to create new and arbitrary predicates is often overkill. My guide for
deciding how much information to include is
to think about those men in the middle who are neither client nor server.
Think about which ones you'll support and how much they need to know. Don't
dumb it down for the sake of anyone else. Server doesn't care, and Client
already knows.
Benjamin