Sound advice - blog

Tales from the homeworld

Fri, 2006-Jan-27

Internet-scale Client Failover

Failover is the process of electing a new piece of hardware to take over the role of a failed piece of hardware (or sometimes software), and the process of bringing everyone on board with the new management structure. Detecting failure and electing a new master are not hard problems. Telling everyone about it is hard. You can attack the problem at various levels. You can have the new master take over the IP of the old and broadcast a new ARP reply so that traffic for that IP no longer goes to the old machine's MAC address. You can even have the new master take over both the IP and MAC address of the old. If new and old are not on the same subnet, you can try to solve the problem through DNS. The trouble with all of these approaches is that while they solve the problem for new clients that may come along, they don't solve it for clients with existing cached DNS entries or existing TCP/IP connections.

Imagine you are a client app, and you have sent an HTTP request to the server. The server fails over, and a new piece of hardware is now serving that IP address. You can still ping it. You can still see it in DNS. The problem is, it doesn't know about your TCP/IP connection to it, or the connection's "waiting for HTTP response" state. Until a new TCP/IP packet associated with the connection hits the new server, it won't know you are there. Only when that happens and it returns a packet to that effect will the client learn that its connection state is not reflected on the server side. Such a packet won't usually be generated until new request data is sent by the client, and often that just won't ever happen.

Under high load conditions clients should wait patiently to avoid putting extra strain on the server. If a client knows that a response will eventually be forthcoming, it should be willing to wait for as long as it takes to generate the response. With the possibility of failover, the problem is that a client cannot know whether the server's state reflects its own, and so cannot know whether a response really will be forthcoming. How often it must sample the remote state is determined by the desired failover time. In industrial applications that time may be as low as two or four seconds, and sampling must take place several times within that window to allow for lost packets. If sampling is not possible, the desired failover time represents the maximum time a server has to respond to its clients, plus network latency. Another means must be used to return the results of processing if any single request takes longer. Clients must use the desired failover time as their request timeout.
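
As a rough sketch of that last point, the client below issues a plain GET with its timeout set to the desired failover time, and re-issues the request if nothing comes back within that window. The four-second figure and the URL are assumptions for illustration, not anything prescribed above:

    import time
    import urllib.error
    import urllib.request

    FAILOVER_TIME = 4.0  # seconds; an assumed figure for the desired failover time

    def get_with_failover(url, attempts=3):
        # Use the desired failover time as the request timeout. If no response
        # arrives within that window, the server may have failed over and lost
        # our connection state, so re-issue the request; it will reach whichever
        # host now holds the server's address.
        for _ in range(attempts):
            try:
                with urllib.request.urlopen(url, timeout=FAILOVER_TIME) as resp:
                    return resp.status, resp.read()
            except urllib.error.HTTPError as err:
                return err.code, err.read()  # the server answered, just not with success
            except OSError:
                time.sleep(FAILOVER_TIME)    # give the failover a chance to complete
        raise RuntimeError("no response within the failover budget")

    # e.g. status, body = get_with_failover("http://example.org/orders/42")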

If you take the short request route, HTTP permits you to return 202 Accepted to indicate that a request has been accepted for processing, without indicating its success or failure. If this were used as a matter of course, conventions could be set up to return the HTTP response via a request back to a callback URL. Alternatively, the response could be modelled as a resource on the server which is periodically polled by the client until it exhibits a success or failure status. Neither of these approaches is directly supported by today's browser software; however, the latter could be performed using a little meta-refresh magic.
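
Here is a minimal sketch of the polling variant, assuming a server that answers the initial request with 202 Accepted plus a Location header naming the response-as-resource, and that the status resource keeps returning 202 until processing finishes. The URL conventions and poll interval are illustrative only:

    import time
    import urllib.error
    import urllib.request

    def submit_and_poll(url, data, poll_interval=10.0, timeout=4.0):
        # Submit the request; a quick server either answers outright or hands
        # back 202 plus the location of the eventual response.
        req = urllib.request.Request(url, data=data, method="POST")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            if resp.status != 202:
                return resp.status, resp.read()
            status_url = resp.headers["Location"]

        # Poll the response resource until it reports success or failure.
        while True:
            try:
                with urllib.request.urlopen(status_url, timeout=timeout) as poll:
                    if poll.status != 202:
                        return poll.status, poll.read()
            except urllib.error.HTTPError as err:
                return err.code, err.read()   # processing finished with an error
            time.sleep(poll_interval)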

You may not have sufficient information at the application level to support sampling at the TCP/IP level. You would need to know the current sequence numbers of the stack in order to generate a packet that would be rejected by the server in an appropriate way. In practice what you need is a closer vantage point. Someone who is close in terms of network topology to both the old and the new master can easily tell when a failover occurs and publish that information for clients to monitor. On the face of it this is just moving the problem around; however, a specialised service can more easily ensure that it never spends a long time responding to requests. This allows us to employ the techniques that rely on quick responses.
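
To make that concrete, here is a small sketch of a client that waits for a slow response while sampling a nearby monitor instead of the server itself. The monitor URL and its plain-text reply are assumptions; the only requirement is that this service always answers quickly:

    import time
    import urllib.request

    MONITOR_URL = "http://monitor.example.org/server-status"  # hypothetical vantage point
    FAILOVER_TIME = 4.0

    def server_has_failed_over():
        # The monitor sits close to both old and new masters, so it can always
        # answer well within the failover window.
        with urllib.request.urlopen(MONITOR_URL, timeout=FAILOVER_TIME) as resp:
            return resp.read().strip() == b"failed-over"

    def wait_for_response(check_done, poll_interval=FAILOVER_TIME / 2):
        # Wait indefinitely for our own slow response, sampling the monitor
        # rather than the server. check_done is any callable that reports
        # whether the response has arrived yet.
        while not check_done():
            if server_has_failed_over():
                raise ConnectionError("server failed over; re-issue the request")
            time.sleep(poll_interval)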

Like the state of HTTP subscriptions, the state of HTTP requests must be sampled if a client is to wait indefinitely for a response. How long it should wait depends on the client's service guarantees, and has little to do with what the server considers an appropriate timeframe. Nevertheless, the client's demands put hard limits on the profile of behaviour acceptable on the server side. In subscription the server can simply renew whenever a renewal is requested of it, and time a subscription out after a long period. It seems that the handling of a simple request/response couples clients and servers together more closely than even a subscription does, because of the hard limits the client's timeout puts on the server side.
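
For contrast, the subscription side can stay very simple, as in the sketch below. It assumes a convention where a PUT to the subscription resource renews its lease; the URL and lease period are made up for illustration:

    import time
    import urllib.error
    import urllib.request

    SUBSCRIPTION_URL = "http://example.org/subscriptions/1234"  # hypothetical
    LEASE_SECONDS = 300                  # assumed server-side lease on an idle subscription
    RENEW_INTERVAL = LEASE_SECONDS / 3   # renew well inside the lease to allow for lost requests

    def renew_forever():
        # The server just renews whenever asked and expires the subscription
        # after the lease; the client decides how often to check in.
        while True:
            req = urllib.request.Request(SUBSCRIPTION_URL, data=b"", method="PUT")
            try:
                urllib.request.urlopen(req, timeout=4.0).close()
            except urllib.error.HTTPError:
                raise RuntimeError("subscription lost; re-subscribe from scratch")
            time.sleep(RENEW_INTERVAL)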

Benjamin