I have a long-standing interest in publish/subscribe protocols and technologies. In the proprietary system I work with professionally, publish/subscribe is the cornerstone of realtime data collection. Client machines are capable of displaying updates from monitored field equipment with latencies measured in terms of the speed of light, plus processing delays.
My implementation is proprietary, so I have long been keeping an eye out for promising standards and research that may emerge into something positive. The solution must be architecturally sound. In particular, it should be scalable to the size of the Internet. I have some thoughts about this which mainly stem from the GENA protocol, Rohit Khare's dissertation Extending the REpresentational State Transfer Architectural Style for Decentralized Systems, and my responses to it: Consensus on the Internet Scale, The Estimated Web, Routed REST, REST Trust Relationships, Infinite Buffering, and Use of HTTP verbs in ARREST architectural style.
I like the direct client-to-server nature of HTTP. You figure out who to connect to using DNS, then make a direct TCP/IP connection. Or an indirect one. For scalability purposes you can introduce intermediaries. These intermediaries are not confused about their role. It is to direct traffic on to the origin server. Sometimes this involves additional intermediaries; however, these proxies are not expected to explicitly route data. That is a job for the network.
XMPP takes an instant-messenger approach to communications. JEP-0060 specifies a publish/subscribe mechanism for the XMPP protocol that apparently is seeing use as a transport for Atom to notify interested parties when news feeds are updated. I don't mind saying that the fundamental architecture irks me. Instead of talking directly to an end server or being transparently pushed through layers that improve network performance, we start out with the assumption that we are talking to an XMPP server. This server could be anywhere. Chances are that unlike your web proxy, it is not being hosted by your ISP. Instead of measuring the request in terms of the speed of light between source and destination plus processing delays, we need to consider the speed of light and processing delays across a disorganised mishmash of servers from here to Antarctica. XMPP itself also appears to be a poor match to the REST architectural style. On the face of it, XMPP appears to have confusing identifier schemes, nouns, content types, and a mishmash of associated standards and extensions that remind me more of the WS-* stack than of specifications or software stacks that are still used by the generation that follows their specifiers.
Nevertheless, GENA is dead outside of UPnP. The internet drafts submitted by Microsoft to the IETF don't match up with the specification that forms part of UPnP. Neither specification matches up to the GENA implementations I have seen in the wild. I think that the fundamental reason for this is not that HTTP forms a poor transport for subscription at a base technological level, but that firewalls are generally set up to make requests back from HTTP servers impossible as part of a subscription mechanism. As such, a protocol that already supports bidirectional communication and is acceptable to firewalls has a better chance of ongoing success. For the moment, it is a technology that works on the small scale and in the wild Internet today. Perhaps from that seed the organisational issue between servers will simply work itself out as the technology and associated traffic volume become more substantial and more important. After all, the web itself did not start out as the well-oiled, reliable and high-performance machine it is today.
So, it seems reasonable that when it comes to rolling out a standards-based subscription mechanism today, JEP-0060 should be the preferred option ahead of trying to define and promote an HTTP-based specification. That said, there are a number of principles that must be transferable to this XMPP-based solution:
- Summarisation. This is the organised discarding of information to ensure that slow clients receive as much information as their connection characteristics permit.
- Differential flow control. A slow client should not prevent fast ones from getting updates.
- Localised resynchronisation. A client need not reach back to the origin server for the current resource status if its immediate server is already handling the subscription.
- Patch updates. For large resources (especially lists), the ability to deliver a message that indicates only the change since last time, not the whole state.
- Security measures. Pub/Sub can be a source of denial of service attacks. The subscription mechanism must be able to detect when its notifications are being treated as spam and end the subscription.
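To make the first two principles concrete, here is a minimal sketch of how summarisation and differential flow control can fit together. It is an illustration only, not my proprietary implementation and not anything JEP-0060 defines: the `Subscriber` and `Hub` classes and the `valve/7` resource name are hypothetical. Each subscriber keeps at most one pending state per resource, so a slow reader only ever loses intermediate states, and it never holds up a fast one.

```python
class Subscriber:
    """Hypothetical subscriber slot: one pending state per resource."""

    def __init__(self, name):
        self.name = name
        self.pending = {}  # resource id -> latest undelivered state

    def offer(self, resource, state):
        # Summarisation: a newer state overwrites any undelivered older one.
        self.pending[resource] = state

    def drain(self):
        # Called whenever this client's connection is ready for more data.
        delivered, self.pending = self.pending, {}
        return delivered


class Hub:
    """Hypothetical fan-out point; publishing never waits on any client."""

    def __init__(self):
        self.subscribers = []

    def publish(self, resource, state):
        for subscriber in self.subscribers:
            subscriber.offer(resource, state)


hub = Hub()
fast, slow = Subscriber("fast"), Subscriber("slow")
hub.subscribers += [fast, slow]

fast_seen = []
for value in (1, 2, 3):
    hub.publish("valve/7", value)
    fast_seen.append(fast.drain())  # the fast client keeps up with every change

slow_seen = slow.drain()  # the slow client reads once and gets only the latest state
```

The fast client sees all three states in order, while the slow client sees only the final one. Whether discarding intermediate states is acceptable depends on the data; it suits resource state, but not a stream of events where every message matters.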
In good RESTful style, subscriptions transfer a summarised sequence of the states of a resource. The first such state is the resource's state at the time the subscription request was received. This allows the state of the resource to be mirrored within a client and for the client to respond to changes in the resource's state. However, it is reasonable to also consider subscription to transient data that is never retained as application state in any resource. This data has a null initial state, no matter when it is subscribed to.
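The difference between the two kinds of subscription can be sketched in a few lines. This is only an illustration under my own assumptions; the `Resource` and `TransientTopic` classes are hypothetical names, not part of any specification.

```python
class Resource:
    """Hypothetical stateful resource: subscribers first receive the current state."""

    def __init__(self, state):
        self.state = state
        self.callbacks = []

    def subscribe(self, callback):
        self.callbacks.append(callback)
        callback(self.state)  # initial state as of the subscription request

    def update(self, state):
        self.state = state
        for callback in self.callbacks:
            callback(state)


class TransientTopic:
    """Hypothetical transient source: nothing is retained, so nothing is replayed."""

    def __init__(self):
        self.callbacks = []

    def subscribe(self, callback):
        self.callbacks.append(callback)  # null initial state: no message sent

    def emit(self, event):
        for callback in self.callbacks:
            callback(event)


mirror, events = [], []

pump = Resource("stopped")
pump.update("running")          # happens before anyone subscribes
pump.subscribe(mirror.append)   # subscriber still learns the current state
pump.update("tripped")

alarms = TransientTopic()
alarms.emit("door opened")      # missed: nobody was subscribed yet
alarms.subscribe(events.append)
alarms.emit("door closed")
```

After this runs, `mirror` holds the state at subscription time followed by the later change, while `events` holds only what was emitted after subscribing.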
Working through the XMPP protocol adds a great deal of complexity to the subscription relationship. Intermediaries handle the subscription, so they must also handle authorisation and other issues normally left out of the protocol to be handled within the origin server. In XMPP, the subscription effectively becomes a channel that certain users have a voice in and that other users can receive messages from. My expertise in XMPP is very thin, but on the face of things it appears that subscription data is routed through a server that manages the particular channel, the pubsub service. Perhaps this service could be replaced with an origin server if that was desired.
In terms of matching up with my expectations of a subscription service, well... localised resynchronisation and patch updates can both be supported, but not at the same time. The pubsub service can forward the last message to a new subscriber. If that message contains the entire state of the resource, the client is synchronised. If it is a patch update, the client cannot synchronise. There does not appear to be a way to negotiate or inform the client of the nature of the update. "Message" appears to be the only recognised semantic. This is understandable, I suppose, and fits at least a niche of what a pubsub system can be expected to do.
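What I would want instead is for each message to declare what it is. The sketch below shows one way that could look; it is purely hypothetical and not something JEP-0060 provides as far as I can tell. The `Update` envelope, its `kind` and `seq` fields, and the `Mirror` class are all my own assumptions.

```python
from dataclasses import dataclass


@dataclass
class Update:
    seq: int    # hypothetical sequence number assigned by the publisher
    kind: str   # "full" (entire state) or "patch" (changed keys only)
    body: dict


class Mirror:
    """Hypothetical client-side copy of a resource."""

    def __init__(self):
        self.state = None
        self.seq = None

    def apply(self, update):
        """Apply an update; return True if the mirror is synchronised afterwards."""
        if update.kind == "full":
            self.state = dict(update.body)
            self.seq = update.seq
            return True
        # A patch is only usable on top of the state immediately before it.
        if self.state is None or update.seq != self.seq + 1:
            return False  # caller must obtain a full state first
        self.state.update(update.body)
        self.seq = update.seq
        return True


m = Mirror()
first = m.apply(Update(5, "patch", {"a": 1}))             # new subscriber handed a patch
second = m.apply(Update(5, "full", {"a": 1, "b": 2}))     # full state synchronises
third = m.apply(Update(6, "patch", {"b": 3}))             # next patch applies cleanly
fourth = m.apply(Update(8, "patch", {"b": 4}))            # gap detected: seq 7 is missing
```

With that much self-description, a new subscriber handed a patch as its first message at least knows it is not yet synchronised, and a pubsub service could in principle keep the last full state around for localised resynchronisation.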
Summarisation seems to be on the cards only at the edge of the network (i.e. the origin server). This is probably the best place for summarisation; however, the lack of differential flow control is a concern. The server appears to simply send messages to the pubsub service at the rate that service can accept them. What happens from there is not clearly cemented in my mind. Either the rate is slowed to match the slowest client, messages are buffered infinitely (until the pubsub service crashes), or messages are buffered to a set limit and messages or clients are dropped past that point. There doesn't seem to be any way of reporting flow control back to the origin server in order to shape the summarisation activity at that point. If message dropping is occurring in the pubsub service then this should be more explicit. Other forms of summarisation may be preferable to the wholesale discard of arbitrary messages.
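As an illustration of what "more explicit" might mean for the third option, here is a minimal sketch of a bounded per-client buffer that counts what it discards and reports the count with the next delivery. The `BoundedOutbox` class and the shape of the batch it returns are hypothetical; I am not claiming any pubsub service works this way.

```python
from collections import deque


class BoundedOutbox:
    """Hypothetical per-client buffer: drops the oldest message when full, and says so."""

    def __init__(self, limit):
        self.limit = limit
        self.queue = deque()
        self.dropped = 0  # messages discarded since the last delivery

    def push(self, message):
        if len(self.queue) >= self.limit:
            self.queue.popleft()
            self.dropped += 1
        self.queue.append(message)

    def pop_batch(self):
        # The client learns how many messages it missed, not just what survived.
        batch = {"dropped": self.dropped, "messages": list(self.queue)}
        self.queue.clear()
        self.dropped = 0
        return batch


box = BoundedOutbox(limit=2)
for i in range(5):
    box.push(i)
batch = box.pop_batch()
```

Here the client receives the two newest messages along with the fact that three were lost, which is enough to trigger a resynchronisation. The same counter could also be fed back towards the origin server to shape summarisation there.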
JEP-0060 is long (really long) and full of inane examples. It is difficult to get a feel for what problems it does and does not solve. It doesn't contain text like "flow control", "loss", "missed", "sequence", "drop"... anything recognisable as how the subscription model relates to the underlying transport's guarantees. Every time I look through it I feel like crying. Perhaps I am just missing the point, but when it comes to internet-scale subscription I don't think this document puts a standards-based solution in play.
I need to be able to synchronise the state of a resource. I need the subscription mechanism to handle exceptional load or high latency situations effectively. I need it to be able to deal with thousands of changes per second across a disparate client base even in my small example. On the Internet I expect it to deal with millions or billions of changes per second. Will a jabber-style network handle that kind of load without breaking client service guarantees? How are overflow conditions handled? Can messages be lost, reordered, or summarised? Are messages self-descriptive enough to allow summarisation by the pubsub server?
Perhaps I should go and pen an internet draft after all. GENA isn't that far off the mark, and really does work effectively when no firewalls are in the way. Perhaps it would be a useful mechanism to reliably and safely transfer data between jabber pubsub islands.
Benjamin