Sound advice - blog

Tales from the homeworld

My current feeds

Fri, 2005-Sep-30

The Estimated Web

Rohit Khare's dissertation describes the ARREST+E architectural style for building estimated systems on a web-like foundation. He derives this architecture from styles developed to reduce the time it takes for a client and server to reach consensus on a value (that is, for both to hold the same value). The styles he derives from are based on leases of values, which mirror the web's cache expiry model. An assumption of this work is that servers must not change their value until all caches have expired, which is to say until the most recent expiry date provided to any client has passed. He doesn't explicitly cover the way the web already works as an estimated system: by lying about the expiry field.

Most applications on the web do not treat the expiry date as a lease that must be honoured. Instead, they change whenever they want to change and provide clients with an expiry interval much shorter than the period over which they expect the page to actually change. Most DNS records remain static for months at a time, so an expiry model that permits caching for 24 hours saves bandwidth while still ensuring that clients have a high probability of holding records that are correct. Simply speaking, if a DNS record changes once every thirty days then a one-day cache expiry gives clients something like a 29 in 30 chance of having data that is up to date. In the meantime, caching saves each DNS server from having to answer the same queries every time someone wants to look up a web page or send some email.
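
To put that arithmetic in concrete terms, here is a rough sketch (the helper name and the worst-case assumption that a cached copy can be a full TTL old are mine, not anything from the DNS specifications):

    # Rough estimate of the chance that a cached DNS record is stale.
    # Assumes, as a simplification, that a cached copy can be up to a
    # full TTL old and that the record changes once per change period.

    DAY = 24 * 60 * 60

    def stale_probability(ttl, change_period):
        # In the worst case the cached copy is a full TTL old, so the chance
        # that the record changed underneath it is roughly ttl / change_period.
        return min(1.0, ttl / change_period)

    if __name__ == "__main__":
        ttl = 1 * DAY             # 24 hour cache expiry
        change_period = 30 * DAY  # the record actually changes about monthly
        stale = stale_probability(ttl, change_period)
        print("chance of stale data: about %.1f%%" % (stale * 100))       # ~3.3%
        print("chance of fresh data: about %.1f%%" % ((1 - stale) * 100)) # ~96.7%, i.e. 29 in 30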

On today's web the question becomes "What will it cost me to have clients with the wrong data?". If you're managing a careful transition then both old and new DNS records should be valid for the duration of the cache expiry. In this case it costs nothing for clients to have the old data. Because it costs them nothing it also costs you nothing, excepting the cost of such a careful transition. When we're talking about web resources the problem becomes more complicated because of the wide variety of things a web resource could represent. If it is time-critical data of commercial importance to your clients then even short delays in getting the freshest data can be a problem. Clients wishing to have a commercial advantage over each other are likely to poll the resource rapidly to ensure they get the new result first. Whether cache expiry dates are used or not, clients will consume your bandwidth and processing power in proportion to their interest.
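
As a rough illustration of that proportionality (all client counts and poll intervals here are invented), the aggregate load on a polled resource is simply the number of interested clients divided by how often each of them polls:

    # Back-of-the-envelope request load under polling.
    # All figures are invented for illustration.

    def requests_per_second(num_clients, poll_interval_seconds):
        # Every client re-fetches the resource once per poll interval, so the
        # aggregate load grows with both client count and polling eagerness.
        return num_clients / poll_interval_seconds

    if __name__ == "__main__":
        # 10,000 casual clients polling hourly:
        print("casual: %.2f requests/sec" % requests_per_second(10_000, 3600))
        # 100 commercially motivated clients polling every second:
        print("eager:  %.2f requests/sec" % requests_per_second(100, 1))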

The ARREST+E style provides an alternative model. Clients no longer need to poll because you're giving them the freshest data possible as it arrives. ARREST+E involves subscription by clients and notification back to the clients as data changes. It allows for summarisation of data to prevent irrelevant information (such as stale updates) from being transmitted, and also for prediction on the client side to try to reduce the estimated error in their copy of the data. If your clients simply must know the result and are willing to compete with each other by effectively launching denial of service attacks on your server then the extra costs of the ARREST+E style may be worth it. Storing state for each subscription (including summariser state) may be cheaper than handling the excess load.
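
A minimal sketch of the subscription-plus-summarisation idea, assuming a simple "latest value wins" summariser (the class and method names below are mine, not the thesis'):

    # Sketch of subscription with a "latest value wins" summariser:
    # the server keeps per-subscriber state and coalesces updates so a
    # slow subscriber only ever receives the freshest value.

    class Subscription:
        def __init__(self, client_id):
            self.client_id = client_id
            self.pending = None     # summariser state: only the freshest update

        def offer(self, value):
            # A newer value replaces any undelivered one, so a slow client
            # never receives a backlog of stale intermediate values.
            self.pending = value

        def take(self):
            value, self.pending = self.pending, None
            return value

    class Resource:
        def __init__(self):
            self.subscriptions = {}

        def subscribe(self, client_id):
            sub = Subscription(client_id)
            self.subscriptions[client_id] = sub
            return sub

        def publish(self, value):
            for sub in self.subscriptions.values():
                sub.offer(value)

    if __name__ == "__main__":
        resource = Resource()
        sub = resource.subscribe("client-1")
        for price in (10.1, 10.2, 10.3):   # three rapid changes
            resource.publish(price)
        print(sub.take())   # the slow client sees only 10.3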

On the other hand, most of the Internet doesn't work this way. Clients are polite because they don't have a strong commercial interest in changes to your data. Clients that aren't polite can be denied access for a while until their manners improve. A server-side choice of how good an estimate each client's copy of the resource representation will be is enough to satisfy everyone.

Whether or not you use ARREST+E with its subscription model, some aspects of the architecture may still be of use. I periodically see complaints about how inefficient HTTP is at accessing RSS feeds. A summariser could reduce the bandwidth associated with transferring large feeds. It would be nice to be able to detect feed reader clients and direct them to a resource specifically tailored for them each time they arrive. Consider the possibility of returning today's feed data to a client, but directing them next time to a resource that represents only feed items after today. Each time they get a new set of items they would be directed onto a resource that represents items since that set.
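
Something like the following sketch captures that idea (the URL layout, field names and timestamps are invented for illustration): each response carries a link to a resource representing only items newer than the ones just delivered.

    # Sketch of a feed resource that directs each client to an
    # "items after <timestamp>" resource on their next visit.
    # URL layout, field names and data are invented for illustration.

    FEED = [
        {"published": 1127900000, "title": "Consensus on the Internet Scale"},
        {"published": 1127990000, "title": "The Estimated Web"},
    ]

    def get_feed(after=0):
        """Return items newer than 'after' plus the link a client should
        follow next time, so repeat visitors never re-download old items."""
        items = [item for item in FEED if item["published"] > after]
        newest = max((item["published"] for item in items), default=after)
        return {
            "items": items,
            "next": "/feed/after/%d" % newest,
        }

    if __name__ == "__main__":
        first = get_feed()                  # first visit: everything
        print(len(first["items"]), first["next"])
        later = get_feed(after=1127990000)  # follow the "next" link later
        print(len(later["items"]), later["next"])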

Benjamin

Fri, 2005-Sep-30

Consensus on the Internet Scale

I've had a little downtime this week and have used some of it to read a 2003 paper by Rohit Khare, Extending the REpresentational State Transfer (REST) Architectural Style for Decentralized Systems. As of 2002, Rohit was a founder and Chief Technology Officer of KnowNow, itself a commercialisation of technology originally developed for mod_pubsub. I noticed this paper through a Danny Ayers blog entry. I have an interest in publish subscribe systems and this paper lays the theoretical groundwork for the behaviour of such systems on high-latency networks such as the internet. Rohit's thesis weighs in at almost three hundred pages, including associated bibliography and supporting material. As such, I don't intend to cover all of my observations in this blog entry. I'll try to confine myself to the question of consensus on high-latency networks and on the "now" horizon.

One of the first things Rohit does in his thesis is establish a theoretical limit relating the latency between two nodes and the rate at which data can change at a source node while still being agreed at the sink node. He bases his model around the HTTP concept of expiry and calculates that if it takes at most d seconds for data to get from point A to point B, then A must hold its value constant for at least d seconds and note this period as an expiry in the message it sends. Clearly if A wants to change its value more frequently, the data will have expired before it reaches B. This result guides much of the subsequent work as he tries to develop additional REST-like styles to get as close as possible to the theoretical result in practice. He ends up with A being able to change its value about every 2d seconds if appropriate architectural components are included.
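
Put another way (this is my paraphrase of the result, with an invented latency figure, not the thesis' own formulae): the advertised expiry can be no shorter than the worst-case one-way latency d, and the derived styles achieve a practical change period of roughly 2d.

    # Illustration of the expiry-based freshness bound described above.
    # The helper names and the latency figure are mine, not the thesis'.

    def minimum_expiry(latency_d):
        """A value sent with an expiry shorter than the worst-case one-way
        latency d could expire before it ever reaches the far node."""
        return latency_d

    def practical_change_period(latency_d):
        """The thesis' derived styles achieve a change period of roughly 2d
        in practice, the theoretical floor being d itself."""
        return 2 * latency_d

    if __name__ == "__main__":
        d = 0.25  # worst-case one-way latency in seconds (invented figure)
        print("expiry must be at least %.2fs" % minimum_expiry(d))
        print("practical change period is about %.2fs" % practical_change_period(d))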

Two times the maximum latency may not be terribly significant on simple short-distance networks; however, even on Ethernet d is theoretically unbounded. It follows then that over a network it isn't always possible to have a fresh copy of an origin server's value. This is where things get interesting. Rohit defines the concept of an "estimated" system. He contrasts this with a centralised system by saying that an estimated system does not guarantee consensus between network nodes. Instead of saying "I know my copy of the variable is the same as the server holds, and thus my decisions will be equivalent to everyone else's", we say "I have a P% chance that my copy is the same as the server holds, and thus my decisions may differ from those of other network nodes". A "now" horizon exists between those components that have a low enough latency to maintain consensus with the origin server's data and those that are too far from the source of data to track changes at the rate they occur.
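
A toy illustration of the "now" horizon (the change period, the latencies and the uniform-change assumption are all mine): the further a node's latency pushes towards the source's change period, the lower the chance its copy still agrees with the origin.

    # Toy illustration of the "now" horizon: how likely is a node's copy
    # of a value to still agree with the origin, given its latency and the
    # rate at which the origin changes?  Assumes, as my simplification,
    # that changes are spread uniformly in time.

    def agreement_probability(latency, change_period):
        """Chance that the value has not changed during the time it took
        to reach this node."""
        return max(0.0, 1.0 - latency / change_period)

    if __name__ == "__main__":
        change_period = 60.0  # origin changes about once a minute (invented)
        for latency in (0.05, 1.0, 30.0, 120.0):
            p = agreement_probability(latency, change_period)
            print("latency %6.2fs -> %5.1f%% chance of agreement" % (latency, p * 100))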

In SCADA systems we measure physical phenomena. Even if we only sample a phenomenon at a certain rate, the actual source change can occur at any frequency. If we consider the device that monitors an electric current (our RTU) as the origin server, we could attempt to synchronise with the RTU's data before it resamples. If latencies are low enough we could possibly show the user of our system the actual value as we saw it within the "now" horizon, and thus in agreement with the RTU's values. However, another way to see the RTU is as a proxy. Because we can't control the frequency of change of the current itself, the "now" horizon is arbitrarily small. All SCADA systems are estimated rather than consensus-based.

Rohit doesn't directly cover the possibility of an arbitrarily small "now" horizon, however his ultimate architecture for estimated systems generally is ARREST+E (Asynchronous Routed REST + Estimation). It contains a client that retries requests to the field device when they fail, in order to traverse unreliable networks. It includes credentials for the client to access the field device. It includes a trust manager that rejects bad credentials at the field device, and the capability to drop duplicate messages from clients that retried when a retry wasn't required. The field device would send updates back to the client as often as the network permitted. To keep latency low it would avoid passing on stale data and might perform other summarisation activities to maximise the relevance of data returned to the client. On the other side of this return path the client would include a predictor to provide the best possible estimate of actual state based on the stale data at its disposal. Finally, the return path also includes credential transmission and trust management.
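
The retry and duplicate-dropping parts of that pipeline might be sketched as follows (the class and field names are my own, not taken from the thesis): the client stamps each request with an identifier and retries freely, and the field device remembers identifiers it has already processed.

    # Sketch of at-least-once delivery with duplicate suppression, as in
    # the retry / "drop duplicates" components described above.  Class and
    # field names are illustrative only.

    import uuid

    class FieldDevice:
        def __init__(self):
            self.seen_request_ids = set()
            self.subscribers = []

        def handle(self, request_id, client):
            # Drop retries of a request we have already processed.
            if request_id in self.seen_request_ids:
                return "duplicate ignored"
            self.seen_request_ids.add(request_id)
            self.subscribers.append(client)
            return "subscribed"

    class Client:
        def __init__(self, name):
            self.name = name

        def subscribe(self, device, retries=3):
            request_id = str(uuid.uuid4())
            results = []
            # Retrying with the same id is safe: the device de-duplicates.
            for _ in range(retries):
                results.append(device.handle(request_id, self))
            return results

    if __name__ == "__main__":
        device = FieldDevice()
        print(Client("operator-console").subscribe(device))
        print("subscriptions held:", len(device.subscribers))  # 1, not 3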

This grand unifying model is practical when the number of clients for our field device is small. Unlike the base REST and HTTP models, this architecture requires state to be maintained in our field device to keep track of subscriptions and to provide appropriate summaries. The predictor on the client side can be as simple as the null predictor (we assume that the latest value at our disposal is still current) or could include domain-specific knowledge (we know the maximum rate at which this piece of equipment can change state and we know the direction it is moving, so we perform a linear extrapolation based on past data). As I mentioned in my previous entry, state may be too expensive to maintain on behalf of the arbitrarily large client base that exists on the Internet at large. There must be some means of acquiring the money to scale servers up appropriately. If that end of things is secure, however, I think it is feasible.
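
The two predictors mentioned here are easy to sketch (a simplified illustration; the rate-limit clamp is my own reading of the "maximum rate of change" remark):

    # Sketch of the two client-side predictors described above: a null
    # predictor that trusts the latest sample, and a linear extrapolator
    # clamped by a known maximum rate of change.  Simplified illustration.

    def null_predictor(samples, now):
        """Assume the most recent value received is still current."""
        _, latest_value = samples[-1]
        return latest_value

    def linear_predictor(samples, now, max_rate=None):
        """Extrapolate from the last two (time, value) samples, optionally
        clamping the slope to a known maximum rate of change."""
        if len(samples) < 2:
            return null_predictor(samples, now)
        (t0, v0), (t1, v1) = samples[-2], samples[-1]
        rate = (v1 - v0) / (t1 - t0)
        if max_rate is not None:
            rate = max(-max_rate, min(max_rate, rate))
        return v1 + rate * (now - t1)

    if __name__ == "__main__":
        # (timestamp, current in amps) samples that arrived over a slow link
        samples = [(0.0, 10.0), (1.0, 10.5)]
        print(null_predictor(samples, now=3.0))                   # 10.5
        print(linear_predictor(samples, now=3.0))                 # 11.5
        print(linear_predictor(samples, now=3.0, max_rate=0.2))   # 10.9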

Importantly, I think it is necessary. In practice I think that most mutable resources on the web represent physical phenomena of one kind or another. Whether it is "the state the user left this document in last time they wrote it" or "the current trading price of MSFT on Wall Street", people are physical phenomena who can act on software at any time and in any way. The existing system on the web of using expiry dates is actually a form of estimation that is driven by the server deciding how often a client should poll or recheck their data. This server-driven approach may not suit all clients, which is why web browsers need a "reload" button!

I'll talk about some specific issues I had with the thesis in a subsequent post.

Benjamin