Sound Advice - patterns

Tales from the homeworld

Published: Sat Jul 5 21:43:39 EST 2008

Updated: Sat Nov 8 14:38:45 EST 2008

PUT

Intent

Request that a defined set of server-owned information be created or replaced based on client-provided content. The content is transmitted in a form the server understands.

Motivation

A business to business order processing system requires a reliable means of submitting, updating, and cancelling purchase orders. The system components at either end were developed by separate vendors to fit with each company's IT infrastructure. Both customer and supplier businesses require their side of the system to interoperate with other suppliers and with other customers they may have, respectively.

Each business upgrades their capabilities in line with their own business objectives. This often means that early adopters have moved on several times before the most conservative businesses make a jump to the latest approach and standards. In the mean-time, any given component on the server or client side must face an eclectic mix of counterparts and must interoperate with them correctly.

The PUT pattern provides a clean client/server separation that survives independent upgrades of each side over time, minimises traffic and processing waste, and deals with possible errors. Unlike GET, this is not a data synchronisation exercise for the client. The client is not trying to acquire a current snapshot of server state for processing. Instead, it is transferring information to the server which it may thereafter quickly forget. PUT is a pattern for information hand-over.

Applicability

PUT is appropriate whenever a client wants to create or replace the whole of the information behind a known and predetermined URL, and can decide when it wants to issue the request. Here are some common means by which a URL is discovered:

  1. Direct entry allows a user to enter the URL through an input device
  2. Configuration allows a document to be prepared ahead of time with links that have particular meaning to the client
  3. Hyperlinking is a generalisation of configuration. The document that contains meaningful links may be acquired from anywhere, including an earlier completed GET request
  4. Construction is the assembly of information available to the client into a URL format agreed with the server. This may be achieved by populating a form supplied by the server in an earlier GET request. This is a particularly useful approach when combined with a globally-unique identifier for the reliable automated submission of new data to a server.
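
As a rough illustration of construction combined with a globally-unique identifier, the sketch below fills in a URL template in Python. The template shape (`/customers/.../orders/...`), the function name and its parameters are assumptions made for this example only; a real template would be whatever the client and server have agreed. The point is that the client chooses the identifier once and reuses the same URL on every retry, so a repeated PUT cannot create a second order.

```python
import uuid
from typing import Optional
from urllib.parse import quote


def construct_order_url(base: str, customer: str,
                        order_id: Optional[str] = None) -> str:
    """Assemble a URL from client-held information (hypothetical template)."""
    if order_id is None:
        # The client generates the identifier itself, so it knows the URL
        # before the first request is ever sent.
        order_id = str(uuid.uuid4())
    # Percent-encode the customer name so it is safe as a path segment.
    return "{}/customers/{}/orders/{}".format(
        base.rstrip("/"), quote(customer, safe=""), order_id)
```

The client would keep the resulting URL alongside the order until a Success response arrives, then forget both.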

Structure

[Figure: PUT pattern structure]

Participants

Client
  • Keeps a URL that lets it access the Server
  • Is capable of encoding its information in any form in which it can legally and reasonably be encoded.
  • Selects the most appropriate encoding based on an initial guess. If a Type Not Understood response is returned, subsequent requests are based on the supplied weighted acceptable types list.
  • Issues the PUT or DELETE request. DELETE is a special case of PUT that is equivalent to a PUT of the null state.
  • Is responsible for overall successful execution of the operation, including modifications to the request and resubmissions of the request
  • Treats a lost response as equivalent to a Resubmit response with no required changes
  • Aborts the operation on a failure response, on a resubmission response that cannot or will not be satisfied, or on a lost response after too many retries.
Server
  • Checks that the type of the document is understood before performing significant processing
  • Selects the information to update based on the supplied URL
  • (optional) Is configured with a mechanism to require the client to resubmit its request with or without modifications
  • Allows the PUT to operate initially
  • Guarantees that the PUT is safe to repeat. That is: Server updates its state to match a PUT initially. However, a PUT of the same state must not be interpreted as a request to modify state. It should return a Success response without any further action.
  • Is capable of parsing all forms that the data might be encoded in that are semantically rich enough to use
  • Selects the right parser implementation to use based on the supplied document type
  • Returns a Success response only once the information update can be considered permanent, allowing the client to forget it. The definition of permanent will depend on the possible consequences of client forgetfulness. It would typically range from "information updated on disk" to "information replicated to all sites, and stored to backup media" for important data.
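
The Server responsibilities above might be sketched along these lines in Python. This is an in-memory illustration only: the `Store` class, the `PARSERS` table and the tuple-shaped responses are all assumptions of the example, and nothing here makes the update permanent in the sense described above. It shows the type check happening before any processing, the parser being selected by document type, and a repeated PUT of the same state returning Success without touching the stored state.

```python
from typing import Dict, Optional, Tuple

# Hypothetical parser table: media type -> function from bytes to a value.
PARSERS = {
    "text/plain": lambda body: body.decode("utf-8").strip(),
}


class Store:
    def __init__(self) -> None:
        self.state: Dict[str, str] = {}
        self.writes = 0  # number of real state changes, for illustration

    def put(self, url: str, media_type: str,
            body: bytes) -> Tuple[str, Optional[str]]:
        # Check the type is understood before any significant processing.
        parser = PARSERS.get(media_type)
        if parser is None:
            return ("TypeNotUnderstood", ", ".join(PARSERS))
        value = parser(body)
        # A PUT of the state we already hold is not a request to modify it.
        if self.state.get(url) == value:
            return ("Success", None)
        self.state[url] = value
        self.writes += 1
        return ("Success", None)
```

A production server would only return Success after the write had reached whatever storage its definition of permanent requires.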

Collaboration

  • Client issues requests to Server via the Request Interface, modifying and resubmitting its request as needed until:
    1. A success response is elicited
    2. The client is unable to make changes required by a Resubmit response
    3. Client policy prevents either changes required by a Resubmit response, or further resubmissions in general

Consequences

The PUT pattern introduces a Uniform Interface for handing over identified sets of information from client to server. Clients and servers of different ages can communicate without impediment, and communication failures can be overcome.

The PUT pattern may require a change of thinking by developers with an imperative messaging background. PUT is not a request to make a particular change or particular kind of change to a set of data to reach an undefined end state. Instead, it is a request to make a set of data match a defined end state without specifying how the transition should be made. This is preferable to the imperative approach when it comes to network communications, because requests can be accidentally submitted multiple times without changing the meaning of the original request. Each subsequent request is interpreted as: "Please make no changes". This feature is called idempotency, and is necessary for dealing with any Response Lost message.

The use of an acceptable types list in a Type Not Understood response means that clients and servers built during different phases of the architecture will generally be able to communicate. Document-based communication has a degree of flexibility built in with must-ignore parameters. The acceptable types list fills a gap when incompatible changes occur to the set of document types, for example when a new type deprecates an old type, such as Atom deprecating RSS for news feed syndication.
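
One way a client might act on a weighted acceptable types list is sketched below in Python. It assumes the list arrives as a comma-separated string with optional `q` weights (for example `application/rss+xml;q=0.5, application/atom+xml`), and the function names are invented for this example. Wildcards such as `*/*` are deliberately not handled.

```python
from typing import List, Optional, Set, Tuple


def parse_weighted_types(header: str) -> List[Tuple[str, float]]:
    """Return (media type, weight) pairs, highest weight first."""
    result = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # a missing weight is taken to mean full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # unreadable weight: treat as not acceptable
        result.append((parts[0].lower(), q))
    # sort() is stable, so equal weights keep the order the server used.
    result.sort(key=lambda pair: pair[1], reverse=True)
    return result


def choose_type(header: str, supported: Set[str]) -> Optional[str]:
    """Pick the server's most preferred type that the client can encode."""
    for media_type, q in parse_weighted_types(header):
        if q > 0 and media_type in supported:
            return media_type
    return None
```

An older client that can only produce RSS would still find a usable type as long as the server lists it with a non-zero weight; if `choose_type` returns `None`, the client has to abort.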

An explicit failure response allows problems in the architecture to be reported and repaired as required. The resubmit feature allows temporary or permanent changes to the architecture to be accommodated by components without explicit reconfiguration, simplifying management. Note, however, the potential security implications of allowing one component to reconfigure others. A predefined policy for which modifications are permitted and which are to be treated as failure cases can be useful in security-sensitive environments. Incorrect reconfiguration of PUT requests can lead to the client issuing incorrect requests to other Servers, so policy should generally be tighter than that enforced for GET requests.

The potential exists in common transports such as HTTP for requests sent down parallel TCP connections, or pipelined requests, to be processed in a different order to that in which they were issued by the client. This could cause the server to finish with an older update in place while the client sees only success responses. A simple solution is to hold off sending a PUT request to a given URL when the previous related PUT has not yet returned.

A client that is holding off sending the next PUT request should queue the first such request for the identified URL. After this point it should replace the queued PUT with each new request to the URL until the previous has been transmitted. There is no point queuing up multiple PUT requests for the same URL. The latest request should completely replace the effect of any previous one. This directs the server to transition its information directly from its starting state to its end state. In doing so, it can provide any internal optimisations available for efficient processing.
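
The hold-off and replace behaviour described above could look something like the following Python sketch. The `PutQueue` class and its `send` callback are hypothetical names for this example; `send(url, document)` is assumed to start a request, and `completed(url)` is assumed to be called by the client once that request has returned. The sketch is single-threaded and would need locking if requests complete on other threads.

```python
from typing import Any, Callable, Dict, Set


class PutQueue:
    def __init__(self, send: Callable[[str, Any], None]) -> None:
        self._send = send
        self._in_flight: Set[str] = set()    # URLs with an outstanding PUT
        self._pending: Dict[str, Any] = {}   # at most one queued PUT per URL

    def put(self, url: str, document: Any) -> None:
        if url in self._in_flight:
            # Overwrite whatever was queued: only the latest state matters.
            self._pending[url] = document
        else:
            self._in_flight.add(url)
            self._send(url, document)

    def completed(self, url: str) -> None:
        """Call when the outstanding PUT for this URL has returned."""
        self._in_flight.discard(url)
        if url in self._pending:
            self.put(url, self._pending.pop(url))
```

With three updates issued while the first is still outstanding, only the first and the last are ever transmitted; the one in between is silently dropped.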

Twin consequences of the PUT pattern are that interim states at URLs may be lost, and that the architecture as a whole does not become overloaded as it is put under stress. Clients voluntarily discard intermediate states, so a server might have to make several internal transitions in order to catch up.

The flip-side of this behaviour is that servers are never stuck processing old data. They come completely up to date quickly. Many algorithms for real-time processing will behave better under this scenario than if they are fed through old requests. For algorithms that are sensitive to the discarding of interim states, the pattern can be adapted so that each request is a PUT to a new URL.

Implementation

PUT can be implemented with HTTP using the following mappings:

PUT(url, document, type)
PUT url HTTP/1.1
Content-Type: type
Expect: 100-continue

document

DELETE (PUT null) is a DELETE request to the URL in HTTP

Success()
Before reading document:
HTTP/1.1 100 Continue
After reading and processing document:
HTTP/1.1 200 OK

Note that 100 Continue handling is optional

All 2xx series response codes can be treated as Success responses for PUT. If the request was a DELETE, then 404 Not Found and 410 Gone are also treated as Success codes.

Type Not Understood()
HTTP/1.1 415 Unsupported Media Type
Accept: weighted acceptable types list

Fail(reason)
HTTP/1.1 400 Bad Request

reason

Unknown 1xx series response codes can be treated as a Fail for PUT. 3xx series codes that are not understood should be treated as Fail. 4xx series response codes are Fail, except for 401 Unauthorized and 407 Proxy Authentication Required. These are Resubmit responses and should only be treated as failures if they are not understood. 404 Not Found and 410 Gone are excluded from the failed codes list for DELETE requests, as they may be returned as the result of a duplicate request that has already succeeded. 5xx series responses should be treated as Fail, except for 503 Service Unavailable and 504 Gateway Timeout. These are Resubmit and Response Lost responses, respectively.

Resubmit(required changes)

Any of: 301 Moved Permanently, 302 Found, 303 See Other, 305 Use Proxy, 307 Temporary Redirect, 401 Unauthorized, or 407 Proxy Authentication Required.

Response Lost()

Any loss of communication before a response is received. This may include application or TCP/IP level timeouts, or an explicitly terminated connection. The 504 Gateway Timeout response is also equivalent to Response Lost, and indicates a loss occurred somewhere past the TCP connection made directly by the client.
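
The mappings in this section can be collected into a single classification function. The Python sketch below is one reading of the rules above, not a complete HTTP client: the `Outcome` enumeration and `classify` function are names invented for the example, and it assumes that an interim 100 Continue has already been consumed by the HTTP library, so any 1xx code that reaches it is unknown and therefore a Fail.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "Success"
    TYPE_NOT_UNDERSTOOD = "Type Not Understood"
    RESUBMIT = "Resubmit"
    RESPONSE_LOST = "Response Lost"
    FAIL = "Fail"


# Redirects, authentication challenges and 503 Service Unavailable.
RESUBMIT_CODES = {301, 302, 303, 305, 307, 401, 407, 503}


def classify(status: int, method: str = "PUT") -> Outcome:
    """Map a final HTTP status code to a PUT pattern response."""
    if 200 <= status < 300:
        return Outcome.SUCCESS
    # A repeated DELETE may find the resource already gone.
    if method == "DELETE" and status in (404, 410):
        return Outcome.SUCCESS
    if status == 415:
        return Outcome.TYPE_NOT_UNDERSTOOD
    if status in RESUBMIT_CODES:
        return Outcome.RESUBMIT
    if status == 504:
        return Outcome.RESPONSE_LOST
    # Everything else: unknown 1xx, other 3xx, 4xx and 5xx codes.
    return Outcome.FAIL
```

Timeouts and dropped connections never produce a status code, so the caller would map those exceptions to `Outcome.RESPONSE_LOST` itself.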

Sample Code

Request request;
request.url = "http://example.com/publication-date/first-edition";
information = date(2008-07-05);
if (blocked(request.url))
{
	// A previous PUT to this URL has not yet returned.
	// Only queue the latest document,
	// overwriting any previously queued request.
	request_pending(request.url) = information;
}
else
{
	request.document_and_type = information.default_encode();

try_again:
	switch (request())
	{
	Success():
		// Do nothing. The update has completed.
		break;

	Type Not Understood(weighted_acceptable_types_list):
		// Re-encode information in an acceptable form
		request.document_and_type =
			information.encode(
				weighted_acceptable_types_list
				);
		jump try_again;

	Fail(reason):
		log(reason);
		break;

	Resubmit(required_changes):
		if (policy(request, required_changes))
		{
			request.modify(required_changes);
			jump try_again;
		}
		else
			log("Policy forbids request modification");
		break;

	Response Lost():
		// Equivalent to a Resubmit with no required changes
		if (policy(request, no_required_changes))
			jump try_again;
		else
			log("Too many retries");
		break;
	}
}
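
For readers who prefer something executable, the retry loop in the pseudocode above can be approximated in Python as follows. This is a sketch under several assumptions rather than a reference implementation: `send` is a caller-supplied function returning an `(outcome, detail)` pair, the outcome strings are invented for the example, a Resubmit is simplified to "retry at this other URL", and the policy check is reduced to an attempt limit plus an `allow_change` callback.

```python
from typing import Any, Callable, Dict, Tuple


def put_with_retries(send: Callable[[str, str, bytes], Tuple[str, Any]],
                     url: str, information: Any,
                     encoders: Dict[str, Callable[[Any], bytes]],
                     default_type: str,
                     allow_change: Callable[[str], bool] = lambda new_url: False,
                     max_attempts: int = 5) -> bool:
    """Return True once the server has accepted the information."""
    media_type = default_type
    for _ in range(max_attempts):
        body = encoders[media_type](information)
        outcome, detail = send(url, media_type, body)
        if outcome == "success":
            return True
        if outcome == "type_not_understood":
            # detail: acceptable media types, most preferred first.
            usable = [t for t in detail if t in encoders]
            if not usable:
                return False
            media_type = usable[0]
        elif outcome == "resubmit":
            # detail: the URL to retry against (simplifying assumption).
            if not allow_change(detail):
                return False
            url = detail
        elif outcome == "response_lost":
            continue  # a Resubmit with no required changes
        else:
            return False  # explicit failure: give up
    return False  # too many retries
```

Because PUT is safe to repeat, the loop can resend after a lost response without first asking the server whether the earlier attempt arrived.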

Known Uses

PUT is used less widely than GET on the Web, and is primarily a feature of automation.

Related Patterns