Sound advice - patterns

Tales from the homeworld

Sat, 2008-Jul-05

Published: Sat Jul 5 21:43:39 EST 2008

Updated: Sat Jul 5 21:43:39 EST 2008

PUT

Intent

Request that a defined set of information be created or replaced based on client-provided content in a form the server understands.

Motivation

Servers often wish to expose the ability for clients to alter server-side information. This alteration might take the form of booking an airline flight, updating an existing journal entry, taking down a warning notice, or almost any other interaction. Machine communication can generally be expressed as the transfer of information, and that is the way that the PUT pattern fills a great range of possible application requirements.

The PUT pattern shares a number of features with the GET pattern. The server wants to expose its information by a method that does not reveal how information updates are processed, without having to keep track of the clients, and without introducing unnecessary coupling with the client.

The server wants to be able to support clients of different ages, some of whom will share the latest and greatest understanding of how to encode and parse particular kinds of data. Others will be left with legacy implementations. Likewise, the server may have been deployed for some time and may be running a legacy implementation while upgraded clients are present in the architecture.

The PUT pattern provides a clean client/server separation that survives independent upgrades of each over time, minimises traffic and processing waste, and deals with possible errors. Unlike GET, this is not a data synchronisation exercise for the client. The client is not trying to acquire a current snapshot of server state for processing. Instead, it is handing information to the server which it may thereafter quickly forget. PUT is a pattern for information hand-over.

Applicability

PUT is appropriate whenever a client wants to update the whole of the information behind a known URL, and can decide when it wants to issue the request. Here are some common means by which a URL is discovered:

  1. Direct entry allows a user to enter the URL through an input device
  2. Configuration allows a document to be prepared ahead of time with links that have particular meaning to the client
  3. Hyperlinking is a generalisation of configuration. The document that contains meaningful links may be acquired from anywhere, including an earlier completed GET request
  4. Construction is the assembly of information available to the client into a URL format agreed with the server. This may be achieved by populating a form supplied by the server in an earlier GET request. This is a particularly useful approach when combined with a globally-unique identifier for the reliable automated submission of new data to a server.
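As a rough illustration of the Construction approach, the sketch below assembles a URL from a client-generated globally-unique identifier. The base URL, the "collection/identifier" path layout and the function name `construct_url` are assumptions made for the example. In practice the format would have to be one agreed with the server.

```python
import uuid
from urllib.parse import quote, urljoin


def construct_url(base, collection, identifier=None):
    """Build a URL for a new item from a client-chosen unique identifier."""
    # A fresh UUID is generated only when none is supplied. The client
    # should keep the resulting URL and reuse it on every retry, so that
    # repeated PUTs address the same information.
    if identifier is None:
        identifier = str(uuid.uuid4())
    return urljoin(base, quote(collection) + "/" + quote(identifier, safe=""))


url = construct_url("http://example.com/", "bookings")
```

Because the identifier is chosen before the first request is sent, a resubmission after a lost response targets the same URL rather than creating a second item.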

Structure

PUT pattern structure

Participants

Client
  • Keeps a URL that lets it access the Server
  • Is capable of encoding its information in any form that it can legally and reasonably be encoded to.
  • Selects the most appropriate encoding based on an initial guess. If a Type Not Understood response is returned, subsequent requests are based on the supplied weighted acceptable types list.
  • Issues the PUT or DELETE request. DELETE is a special case of PUT that is equivalent to a PUT of the null state.
  • Is responsible for overall successful execution of the operation, including modifications to the request and resubmissions of the request
  • Treats a lost response as equivalent to a Resubmit response with no required changes
  • Aborts the operation on a failure response, on a resubmission response that cannot or will not be satisfied, or on a lost response after too many retries.
Server
  • Checks that the type of the document is understood before performing significant processing
  • Selects the information to update based on the supplied URL
  • (optional) Is configured with a mechanism to require the client to resubmit its request with or without modifications
  • Allows the PUT to operate initially
  • Guarantees that the PUT is safe to repeat. That is: Server updates its state to match a PUT initially. However, a PUT of the same state must not be interpreted as a request to modify state. It should return a Success response without any further action.
  • Is capable of parsing all forms that the data might be encoded in that are semantically rich enough to use
  • Selects the right parser implementation to use based on the supplied document type
  • Returns a Success response only once the information update can be considered permanent, allowing the client to forget it. The definition of permanent will depend on the possible consequences of client forgetfulness. It would typically range from "information updated on disk" to "information replicated to all sites, and stored to backup media" for important data.
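The repeat-safety guarantee in the Server list can be sketched as follows. This is a minimal in-memory illustration, not a real server: the `Store` class, its `writes` counter and the string return values are all assumptions for the example, and durability is only hinted at in a comment.

```python
class Store:
    """In-memory stand-in for server-side state, keyed by URL."""

    def __init__(self):
        self.state = {}
        self.writes = 0  # counts real state changes, for illustration only

    def put(self, url, document):
        # A PUT of the state we already hold is not a request to change
        # anything: report success without touching storage.
        if self.state.get(url) == document:
            return "Success"
        self.state[url] = document
        self.writes += 1
        # A real server would answer only once the update is permanent.
        return "Success"

    def delete(self, url):
        # DELETE is treated as a PUT of the null state, so deleting
        # something that is already gone still succeeds.
        if url in self.state:
            del self.state[url]
            self.writes += 1
        return "Success"
```

Submitting the same document twice leaves `writes` at 1, which is the behaviour a client relies on when it resubmits after a lost response.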

Collaboration

  • Client issues requests to Server via the Request Interface, modifying and resubmitting its request as needed until:
    1. A success response is elicited
    2. The client is unable to make changes required by a Resubmit response
    3. Client policy prevents either changes required by a Resubmit response, or further resubmissions in general

Consequences

The PUT pattern introduces a Uniform Interface for handing over identified sets of information from client to server. Clients and servers of different ages can communicate without impediment, and communication failures can be overcome.

The PUT pattern may require a change of thinking by developers with an imperative messaging background. PUT is not a request to make a particular change or particular kind of change to a set of data to reach an undefined end state. Instead, it is a request to make a set of data match a defined end state without specifying how the transition should be made. This is preferable to the imperative approach when it comes to network communications, because requests can be accidentally submitted multiple times without changing the meaning of the original request. Each subsequent request is interpreted as: "Please make no changes". This feature is called idempotency, and is necessary for dealing with any Response Lost message.

The use of an acceptable types list in a Type Not Understood response means that clients and servers built during different phases of the architecture will generally be able to communicate. Document-based communication has a degree of flexibility built in with must-ignore parameters. The acceptable types list fills a gap when incompatible changes occur to the set of document types, for example when a new type deprecates an old type, such as Atom deprecating RSS for news feed syndication.
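One way a client might act on a weighted acceptable types list is sketched below. It assumes the list uses the familiar HTTP "type;q=weight" syntax and ignores wildcards such as `*/*`. The function names `parse_weighted_types` and `choose_encoding` are made up for the example.

```python
def parse_weighted_types(header):
    """Parse 'type;q=0.8, other' into (type, weight) pairs, best first."""
    weighted = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        weight = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    weight = float(value)
                except ValueError:
                    weight = 0.0  # unreadable weight: treat as unacceptable
        weighted.append((parts[0].lower(), weight))
    # sorted() is stable, so equal weights keep the order they were sent in.
    return sorted(weighted, key=lambda pair: pair[1], reverse=True)


def choose_encoding(header, supported):
    """Return the best type the client can produce, or None."""
    for media_type, weight in parse_weighted_types(header):
        if weight > 0 and media_type in supported:
            return media_type
    return None
```

A legacy client that can only produce RSS would pick `application/rss+xml` from a list that prefers Atom but still accepts RSS, and would get `None` (a failure case) if RSS had been dropped altogether.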

An explicit failure response allows problems in the architecture to be reported and repaired as required. The resubmit feature allows temporary or permanent changes to the architecture to be accommodated by components without explicit reconfiguration, simplifying management. Note, however, the potential security implications of allowing one component to reconfigure others. A predefined policy for which modifications are permitted and which are to be treated as failure cases can be useful in security-sensitive environments. Incorrect reconfiguration of PUT requests can lead to the client issuing incorrect requests to other Servers, so policy should generally be tighter than that enforced for GET requests.

Note that it may be necessary to avoid allowing multiple PUT requests to be outstanding simultaneously to the same or related URLs. The potential exists in common transports such as HTTP for simultaneous or pipelined requests to be processed in a different order to that in which they were actually submitted by the Client.

Whatever mechanism the underlying communications system supports, there will come some limit at which a particular PUT request cannot be sent immediately and is queued. A subsequent PUT request from the same logical client may be issued before the first PUT clears the queue. If this happens, the correct behaviour is to discard the first PUT request.

The reason for this is that the second PUT request is intended to replace the effect of the first request. Instead of requiring the server to process old requests, it can be instructed by the newest request to transition its information directly from its starting state to its end state. In doing so, it can provide any internal optimisations available for efficient processing.
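The discard-the-older-request rule can be sketched as a queue that holds at most one document per URL. The `PendingPuts` class and its method names are assumptions for illustration. How the client learns that it is unblocked is left out.

```python
class PendingPuts:
    """Holds at most one queued PUT per URL: the most recent one."""

    def __init__(self):
        self._pending = {}

    def queue(self, url, document):
        # A newer PUT replaces the effect of an older one, so the older
        # document is simply discarded rather than sent later.
        self._pending[url] = document

    def drain(self):
        """Yield (url, document) pairs once sending is possible again."""
        while self._pending:
            url = next(iter(self._pending))
            yield url, self._pending.pop(url)
```

Queuing two documents for the same URL and then draining yields only the second, so the server moves directly from its starting state to the latest end state.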

Implementation

PUT can be implemented with HTTP using the following mappings:

PUT(url, document, type)
PUT url HTTP/1.1
Content-Type: type
Expect: 100-continue

document

DELETE (PUT null) is a DELETE request to the URL in HTTP

Success()
Before reading document:
HTTP/1.1 100 Continue
After reading and processing document:
HTTP/1.1 200 OK

Note that 100 Continue handling is optional

All 2xx series response codes can be treated as Success responses for PUT. If the request was a DELETE, then 404 Not Found and 410 Gone are also treated as Success codes.

Type Not Understood()
HTTP/1.1 415 Unsupported Media Type
Accept: weighted acceptable types list

Fail(reason)
HTTP/1.1 400 Bad Request

reason

Unknown 1xx series response codes can be treated as a Fail for PUT. 3xx series codes that are not understood should be treated as Fail. 4xx series response codes are Fail, except for 401 Unauthorized and 407 Proxy Authentication Required. These are Resubmit responses and should only be treated as failures if they are not understood. 404 Not Found and 410 Gone are excluded from the failed codes list for DELETE requests, as they may be returned as the result of a duplicate request that has already succeeded. 5xx series responses should be treated as Fail, except for 503 Service Unavailable and 504 Gateway Timeout. These are Resubmit and Response Lost responses, respectively.

Resubmit(required changes)

Any of: 301 Moved Permanently, 302 Found, 303 See Other, 305 Use Proxy, 307 Temporary Redirect, 401 Unauthorized, or 407 Proxy Authentication Required.

Response Lost()

Any loss of communication before a response is received. This may include application or TCP/IP level timeouts, or an explicitly terminated connection. The 504 Gateway Timeout response is also equivalent to Response Lost, and indicates a loss occurred somewhere past the TCP connection made directly by the client.
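The mappings above can be collected into a single classification function, sketched here. It assumes it is only handed final status codes (an HTTP library would normally deal with an interim 100 Continue itself) and uses `None` to stand for "no response arrived". The name `classify` is an assumption for the example.

```python
RESUBMIT = {301, 302, 303, 305, 307, 401, 407, 503}


def classify(status, method="PUT"):
    """Map an HTTP status code (or None for no response) to an outcome."""
    if status is None or status == 504:
        return "Response Lost"
    if 200 <= status < 300:
        return "Success"
    if method == "DELETE" and status in (404, 410):
        # A duplicate DELETE may find the information already gone.
        return "Success"
    if status == 415:
        return "Type Not Understood"
    if status in RESUBMIT:
        return "Resubmit"
    # Unknown 1xx, other 3xx, and the remaining 4xx and 5xx codes.
    return "Fail"
```

A Resubmit result still has to pass client policy before the request is modified and sent again. This function only names the outcome.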

Sample Code

Request request;
request.url = "http://example.com/publication-date/first-edition"
information = date("2008-07-05")
if (blocked())
{
	// Only queue the latest document
	// Overwrite any previous request
	request_pending(request.url) = information
}
else
{
	request.document_and_type = information.default_encode()

try_again:
	switch (request())
	{
	Success():
		// Do nothing. The update has completed.

	Type Not Understood(weighted acceptable types list):
		// Re-encode information in an acceptable form
		request.document_and_type =
			information.encode(
				weighted acceptable types list
				)
		jump try_again;

	Fail(reason):
		log(reason)

	Resubmit(required_changes):
		if policy(request, required_changes)
			request.modify(required_changes)
			jump try_again;
		else
			log("Policy forbids request modification");

	Response Lost():
		if policy(request, no required changes)
			jump try_again;
		else
			log("Too many retries");
	}
}
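For readers who prefer something executable, here is a loose Python rendering of the same loop. It is a sketch under several assumptions. `send` is a hypothetical transport returning an `(outcome, detail)` pair, `encodings` maps media types to bodies the client has already encoded, the detail of a Resubmit is taken to be a replacement URL, and the policy check is reduced to a simple attempt limit.

```python
def put_with_retries(send, url, encodings, max_attempts=3):
    """Drive one PUT to completion; return True on Success.

    send(url, body, media_type) -> (outcome, detail) is a hypothetical
    transport. encodings maps media types to already-encoded bodies.
    """
    media_type = next(iter(encodings))  # initial guess: first encoding
    for _ in range(max_attempts):
        outcome, detail = send(url, encodings[media_type], media_type)
        if outcome == "Success":
            return True
        if outcome == "Type Not Understood":
            # detail: acceptable types, most preferred first
            usable = [t for t in detail if t in encodings]
            if not usable:
                return False
            media_type = usable[0]
        elif outcome == "Resubmit":
            url = detail  # assumed redirect target; real policy checks omitted
        elif outcome == "Response Lost":
            continue  # safe to repeat because PUT is idempotent
        else:
            return False  # Fail
    return False  # too many attempts
```

A production client would replace the attempt limit with the kind of explicit policy discussed under Consequences, especially before following a Resubmit to a different server.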

Known Uses

PUT is used less widely than GET on the Web, and is primarily a feature of automation.

Related Patterns

Sat, 2008-Jul-05

Published: Wed Jul 2 20:38:52 EST 2008

Updated: Sat Jul 5 21:53:36 EST 2008

GET

Intent

Transfer a defined set of information from its owner to an anonymous client in a form the client understands.

Motivation

Servers often wish to expose information for general client consumption without exposing the method by which that information is produced, without having to keep track of the clients, and without introducing unnecessary coupling with the client.

The server wants to be able to support clients of different ages, some of whom will share the latest and greatest understanding of how to encode and parse particular kinds of data. Others will be left with legacy implementations. Likewise, the server may have been deployed for some time and may be running a legacy implementation while upgraded clients are present in the architecture.

Clients do not wish to create unnecessary load on the server, so look for ways in which they can minimise both traffic and processing. Clients also need to be able to deal with possible error conditions, including communication failures.

The GET pattern provides a clean client/server separation that survives independent upgrades of each over time, gives control over traffic and processing waste, and deals with possible errors.

Applicability

GET is appropriate whenever a client wants to acquire the whole of the information behind a known URL, and can decide when it wants to issue the request (subject to a cache miss). Here are some common means by which a URL is discovered:

  1. Direct entry allows a user to enter the URL through an input device
  2. Configuration allows a document to be prepared ahead of time with links that have particular meaning to the client
  3. Hyperlinking is a generalisation of configuration. The document that contains meaningful links may be acquired from anywhere, including an earlier completed GET request
  4. Construction is the assembly of information available to the client into a URL format agreed with the server. This may be achieved by populating a form supplied by the server in an earlier GET request.

Methods of determining when to issue a GET request include:

  1. One-shot, the issuing of a request at a predefined time, when the URL first becomes known, or when the client needs access to the information
  2. Cyclic, the issuing of a request at a predefined rate while the client is active
  3. On cache expiry, the issuing of a request whenever a cache entry expires. Note that this requires the server to set a maximum age on cache entries, something that is not always provided.
  4. Otherwise-triggered, the issuing of a request based on some form of back-channel that indicates information at a given URL may have changed
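The "on cache expiry" trigger might look like the sketch below. `CacheEntry`, its fields and `should_issue_get` are hypothetical names, and the maximum age is assumed to be a number of seconds supplied by the server (or `None` when the server supplied nothing).

```python
import time


class CacheEntry:
    def __init__(self, document, max_age, fetched_at=None):
        self.document = document
        self.max_age = max_age  # seconds, or None if the server gave none
        self.fetched_at = time.time() if fetched_at is None else fetched_at

    def fresh(self, now=None):
        # Without a server-supplied maximum age there is no expiry to
        # wait for, so the entry is never considered fresh.
        if self.max_age is None:
            return False
        now = time.time() if now is None else now
        return now - self.fetched_at < self.max_age


def should_issue_get(entry, now=None):
    """On-cache-expiry trigger: request only when no fresh entry exists."""
    return entry is None or not entry.fresh(now)
```

When the server sets no maximum age, this trigger degenerates into requesting every time, which is why one of the other triggers is usually needed alongside it.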

Structure

GET pattern structure

Participants

Client
  • Keeps a URL that lets it access the Server
  • Issues the GET request
  • Is capable of parsing all forms that the data might be encoded in that are semantically rich enough to use
  • Selects the right parser implementation to use based on the returned document type
  • (optional) Retains a cache of past successful GET responses and their related cache control information
  • Is responsible for overall successful execution of the operation, including modifications to the request and resubmissions of the request
  • Treats a lost response as equivalent to a Resubmit response with no required changes
  • Aborts the operation on a failure response, on a resubmission response that cannot or will not be satisfied, or on a lost response after too many retries.
Server
  • Evaluates any provided condition before performing significant processing
  • Selects the information to return based on the supplied URL
  • (optional) Is configured with a mechanism to require the client to resubmit its request with or without modifications
  • Guarantees that the GET request is safe initially (is not interpreted as a request to modify state), and safe to repeat.
  • Is capable of encoding its information in any form that it can legally and reasonably be encoded to.
  • Selects the most appropriate encoding based on the supplied weighted acceptable types list and any preference it may have itself, and returns the document in that format

Collaboration

  • Client issues requests to Server via the Request Interface, modifying and resubmitting its request as needed until:
    1. A success response is elicited
    2. The request condition is not met, meaning that the cached response is still valid
    3. A failure response is elicited
    4. The client is unable to make changes required by a Resubmit response
    5. Client policy prevents either changes required by a Resubmit response, or further resubmissions in general

Consequences

The GET pattern introduces a Uniform Interface for transferring identified sets of information from server to client. Clients and servers of different ages can communicate without impediment, and communication failures can be overcome.

The use of an acceptable types list in a GET request means that clients built during different phases of the architecture will generally be able to communicate. Document-based communication has a degree of flexibility built in with must-ignore parameters. The acceptable types list fills a gap when incompatible changes occur to the set of document types, for example when a new type deprecates an old type, such as Atom deprecating RSS for news feed syndication.

An explicit failure response allows problems in the architecture to be reported and repaired as required. The resubmit feature allows temporary or permanent changes to the architecture to be accommodated by components without explicit reconfiguration, simplifying management. Note, however, the potential security implications of allowing one component to reconfigure others. A predefined policy for which modifications are permitted and which are to be treated as failure cases can be useful in security-sensitive environments.

Note that it may be necessary to avoid allowing multiple GET requests to be outstanding simultaneously to the same or related URLs. The potential exists in common transports such as HTTP for responses to simultaneous or pipelined requests to return to the Client in a different order to that in which the requests were processed.

Correct behaviour for a client that wishes to issue a GET request but is not currently permitted to is to queue the first request for the identified URL. When the communications infrastructure supports another GET being sent, the queued GET is issued. There is no point queuing up multiple GET requests for the same URL. If the first request has not been issued by the time motivation to issue a second request comes around, a single request will fulfil the motivation behind both.
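The one-request-per-URL queuing rule can be sketched as follows. `PendingGets` and its method names are assumptions for the example. Unlike a queued PUT, a queued GET carries no document, so only the URL needs remembering.

```python
class PendingGets:
    """Remembers which URLs have a GET waiting to be sent."""

    def __init__(self):
        self._urls = []

    def queue(self, url):
        # One queued request satisfies every reason for wanting the
        # URL, so a second request for the same URL adds nothing.
        if url not in self._urls:
            self._urls.append(url)

    def next_url(self):
        """Return the oldest queued URL, or None when nothing waits."""
        return self._urls.pop(0) if self._urls else None
```

Queuing the same URL twice while blocked results in a single GET once the client is unblocked.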

Twin consequences of the GET pattern are that interim states at URLs may be missed, and that the architecture as a whole does not become overloaded as the architecture is put under stress. Each GET retrieves the current state of the resource, so rapid changes may see the next GET arrive several changes after an earlier GET. These states will be lost unless an additional buffering mechanism is employed. The client will read back the current state rather than the old transitional states.

The flip-side of this behaviour is that clients are never stuck reading old data. They come completely up to date quickly and process the latest information available. Many algorithms for real-time processing will behave better under this scenario than if they are fed through old changes. The GET pattern can be adapted to a buffering model for algorithms that suffer from losses of interim states.

Implementation

GET can be implemented with HTTP using the following mappings:

GET(url, condition, weighted acceptable types list)
GET url HTTP/1.1
Accept: weighted acceptable types list
If-condition

Success(document, type, cache)
HTTP/1.1 200 OK
Content-Type: type
Cache-Control: cache

document

All 2xx series response codes can be treated as Success responses for GET

Condition Not Met()
HTTP/1.1 304 Not Modified

Fail(reason)
HTTP/1.1 400 Bad Request

reason

Unknown 1xx series response codes can be treated as a Fail for GET. 300 Multiple Choices is a non-implementable Resubmit response for automated clients, so should also be treated as Fail alongside other 3xx series codes that are not understood. 4xx series response codes are Fail, except for 401 Unauthorized and 407 Proxy Authentication Required. These are Resubmit responses and should only be treated as failures if they are not understood. 5xx series responses should be treated as Fail, except for 503 Service Unavailable and 504 Gateway Timeout. These are Resubmit and Response Lost responses, respectively.

Resubmit(required changes)

Any of: 301 Moved Permanently, 302 Found, 303 See Other, 305 Use Proxy, 307 Temporary Redirect, 401 Unauthorized, or 407 Proxy Authentication Required.

Response Lost()

Any loss of communication before a response is received. This may include application or TCP/IP level timeouts, or an explicitly terminated connection. The 504 Gateway Timeout response is also equivalent to Response Lost, and indicates a loss occurred somewhere past the TCP connection made directly by the client.

Sample Code

Request request;
request.url="http://example.com/publication-dates"
if cache_manager.fresh(request.url)
{
	// Do nothing. Our cache entry is still fresh.
}
else if (blocked())
{
	// Only queue one request for the URL
	request_pending(request.url) = true
}
else
{
try_again:
	request.accept=parser.accept
	request.condition=cache_manager.condition(request.url)

	switch (request())
	{
	Success(document, type, cache):
		cache_manager.update(document, type, cache)
		process(parser(document, type))

	Condition Not Met():
		// Do nothing.
		// We have already processed the
		// latest data with our last request.

	Fail(reason):
		log(reason)

	Resubmit(required_changes):
		if policy(request, required_changes)
			request.modify(required_changes)
			jump try_again
		else
			log("Policy forbids request modification")

	Response Lost():
		if policy(request, no required changes)
			jump try_again
		else
			log("Too many retries")
	}
}

Known Uses

GET is widely used on the Web, both under direct human control and under automation. Various aspects of GET are not always used well.

Related Patterns