A few weeks back I put forward my first attempt at doing HTTP subscription for use in fast-moving REST environments. I wasn't altogether happy with it at the time, and the result was a bit half-arsed... but this is blogging after all.
I thought I'd give it another go, based on the same set of requirements as I used last time. This time I thought I'd try and benefit clients and servers that don't use subscription as well as those that do.
One of the major simplifying factors in HTTP has been the concept of a simple request-response pair. Subscription breaks this model by sending multiple responses for a single request. I have called this concept in the past a "multi-callback", which usually indicates a single response with a marker that it is not the last one the client should expect to receive. In its original incarnation HTTP performed each exchange over its own TCP/IP connection, increasing overhead but again promoting simplicity. In HTTP/1.1 the default behaviour became to "hang on" to a connection and to allow pipelining (multiple requests sent before any response is received) to reduce dependence on a low-latency connection for high performance. Without pipelining it takes at least n × latency to make n requests.
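To put rough numbers on that, here is a back-of-the-envelope sketch. The function names and the 200 ms round trip are my own illustrative assumptions, and the pipelined figure is only a lower bound that ignores transfer and server time.

```python
def serial_time_ms(n_requests, latency_ms):
    """Lower bound without pipelining: each request waits for the previous response."""
    return n_requests * latency_ms


def pipelined_time_ms(n_requests, latency_ms):
    """Lower bound with pipelining: all requests are in flight at once,
    so only one round trip of latency is paid (transfer time ignored)."""
    return latency_ms if n_requests > 0 else 0


# Hypothetical figures: 10 requests over a link with a 200 ms round trip.
serial = serial_time_ms(10, 200)
pipelined = pipelined_time_ms(10, 200)
```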
One restriction that can still affect HTTP pipelining performance is the requirement HTTP has to return all responses in the order the requests were made. This may be fine when you're serving static data such as files, but if you are operating as a proxy to a legacy system you may have to make further requests to that system in order to fulfil the HTTP request. In the meantime, other requests that could be made in parallel to the legacy system are either backing up in the request pipe, or have completed but are waiting for a particularly slow legacy system response to be returned via HTTP before they themselves can be.
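The effect is easy to see in a toy model. This is only a sketch under my own assumptions: every request is issued at time zero, the back end works on all of them in parallel, and the service times are made up.

```python
def completion_times(service_times, in_order=True):
    """Return the time at which each response can be sent to the client.

    service_times[i] is how long the (hypothetical) legacy system takes to
    answer request i. All requests start at time 0 and run in parallel.
    """
    if not in_order:
        # Responses leave as soon as they are ready.
        return list(service_times)
    sent = []
    latest = 0.0
    for t in service_times:
        # A response cannot leave before every earlier response has left.
        latest = max(latest, t)
        sent.append(latest)
    return sent


# One slow legacy call followed by two quick ones.
times = [5.0, 0.1, 0.2]
ordered = completion_times(times, in_order=True)
unordered = completion_times(times, in_order=False)
```

With ordered delivery the two quick responses are stuck behind the slow one; without it they go out as soon as they are ready.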
This brings me to my first recommendation: Add a request id header. Specifically,
14.xx Request-Id
The Request-Id field is used both as a request-header and a response-header. When used as a request-header it provides a token the server SHOULD return in its response using the same header name. If a Request-Id is supplied by a client the client MUST be able to receive the response out of order, and the server MAY return responses to identified requests out of order. Fairness algorithms SHOULD be used on the server to ensure every identified request is eventually dealt with.
A client MUST NOT reuse a Request-Id until its transaction with the server is complete. A SUBSCRIBE request MUST include a Request-Id field.
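On the client side this could look something like the sketch below. The PendingRequests class and its method names are hypothetical, invented for illustration; the only thing taken from the proposal is the Request-Id header itself. A simple counter is one way to guarantee an id is never reused while its transaction is open.

```python
import itertools


class PendingRequests:
    """Track outstanding requests by Request-Id (hypothetical client helper)."""

    def __init__(self):
        self._counter = itertools.count(1)  # ids are never reused
        self._pending = {}

    def send(self, method, uri):
        """Register a request and return (request_id, headers to send)."""
        request_id = str(next(self._counter))
        self._pending[request_id] = (method, uri)
        return request_id, {"Request-Id": request_id}

    def receive(self, response_headers, final=True):
        """Match a response to its request, in whatever order it arrives.

        Pass final=False for a subscription response that is not the last,
        so the transaction stays open.
        """
        request_id = response_headers["Request-Id"]
        if final:
            return self._pending.pop(request_id)
        return self._pending[request_id]


pending = PendingRequests()
slow_id, _ = pending.send("GET", "/slow")
fast_id, _ = pending.send("GET", "/fast")
# The fast response overtakes the slow one and is still matched correctly.
first = pending.receive({"Request-Id": fast_id})
second = pending.receive({"Request-Id": slow_id})
```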
I've been toing and froing over this next point. That is, how do we identify that a particular response is not the end of a transaction? Initially I said that a 1xx series response code should be used, but that has its problems. For one, the current HTTP/1.1 standard says that 1xx series responses can't ever have bodies. That's maybe not the final nail in the coffin, but it doesn't help. The main reason I'm wavering from the point, though, is that a 1xx series response just ain't very informative.
Consider the case where a resource is temporarily 404 (Not Found), but the server is still able to notify a client when it comes back into existence as a 200 (OK). The subscription should be able to convey that kind of information. I've therefore decided to reverse my previous decision and make the subscription indicate its non-completeness through a header. This has some precedent in the "Connection: close" header, which an HTTP/1.1 server uses to indicate that it will not keep the connection open after the response.
Therefore, I would add something like the following:
14.xx Request
The Request general-header field allows the sender to specify options that are desired for a particular request over a specific connection and MUST NOT be communicated by proxies over further connections.
The Request header has the following grammar:
Request = "Request" ":" 1#(request-token)
request-token = token
HTTP/1.1 proxies MUST parse the Request header field before a message is forwarded and, for each request-token in this field, remove any header field(s) from the message with the same name as the request-token. Request options are signaled by the presence of a request-token in the Request header field, not by any corresponding additional header field(s), since the additional header field may not be sent if there are no parameters associated with that request option.
Message headers listed in the Request header MUST NOT include end-to-end headers, such as Cache-Control.
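A proxy following these rules might do something like the sketch below. The function name, the header list representation and the tiny END_TO_END set are my own assumptions. So is the choice to simply leave an end-to-end header alone if a sender wrongly lists it, since the text above doesn't say how that case should be handled.

```python
# Illustrative subset only; a real proxy would know the full end-to-end list.
END_TO_END = {"cache-control"}


def strip_request_options(headers):
    """Parse the Request header and drop the hop-by-hop headers it names.

    headers is a list of (name, value) pairs.
    Returns (request_tokens, headers_to_forward).
    """
    options = set()
    for name, value in headers:
        if name.lower() == "request":
            # 1#(request-token): a comma-separated list of tokens.
            options.update(
                tok.strip().lower() for tok in value.split(",") if tok.strip()
            )
    # The Request header itself is never forwarded. End-to-end headers are
    # left alone even if listed (an assumption, see the lead-in).
    drop = {"request"} | (options - END_TO_END)
    forwarded = [(n, v) for n, v in headers if n.lower() not in drop]
    return options, forwarded


incoming = [
    ("Host", "example.org"),
    ("Request", "end, X-Opt"),
    ("X-Opt", "1"),  # hypothetical option header named by a request-token
    ("Cache-Control", "no-cache"),
]
tokens, outgoing = strip_request_options(incoming)
```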
HTTP/1.1 defines the "end" request option for the sender to signal that no further responses to the request will be sent after completion of the response. For example,
Request: end
in either the request or response header fields indicates that the SUBSCRIBE request transaction is complete. It acts both as a means for a server to indicate SUBSCRIBE transaction completion and for a client to indicate a subscription is no longer required.
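From the client's point of view, a subscription then becomes a loop that keeps reading responses until one carries the "end" option. The sketch below fakes the wire with a list of (status, headers, body) tuples; that representation, the function name and the sample values are all made up for illustration.

```python
def consume_subscription(responses):
    """Yield (status, body) for each response up to and including the one
    whose Request header carries the 'end' option."""
    for status, headers, body in responses:
        yield status, body
        tokens = [t.strip().lower() for t in headers.get("Request", "").split(",")]
        if "end" in tokens:
            return  # the SUBSCRIBE transaction is complete


# Hypothetical stream: the resource starts out missing, appears, then the
# server ends the subscription. Anything after "end" is not consumed.
stream = [
    (404, {"Request-Id": "7"}, b""),
    (200, {"Request-Id": "7"}, b"value=1"),
    (200, {"Request-Id": "7", "Request": "end"}, b"value=2"),
    (200, {"Request-Id": "7"}, b"never delivered"),
]
updates = list(consume_subscription(stream))
```

Note that the 404 comes through as an ordinary update, which is exactly what a 1xx response code couldn't have told the client.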
A system receiving an HTTP/1.0 (or lower-version) message that includes a Request header MUST, for each request-token in this field, remove and ignore any header field(s) from the message with the same name as the request-token. This protects against mistaken forwarding of such header fields by pre-HTTP/1.1 proxies.
Benjamin