Sound advice - blog

My recent bookmarks

Tue, 2011-Dec-06

WS-REST 2012 Call for Papers

The Third International Workshop on RESTful Design (WS-REST 2012) aims to provide a forum for discussion and dissemination of research on the emerging resource-oriented style of Web service design.

Background

Over the past years, several discussions between advocates of the two major architectural styles for designing and implementing Web services (the RPC/ESB-oriented approach and the resource-oriented approach) have been mainly held outside of the traditional research and academic community. Mailing lists, forums and developer communities have seen long and fascinating debates around the assumptions, strengths, and weaknesses of these two approaches. The RESTful approach to Web services has also received a significant amount of attention from industry as indicated by the numerous technical books being published on the topic.

This third edition of WS-REST, co-located with the WWW2012 conference, aims at providing an academic forum for discussing current emerging research topics centered around the application of REST, as well as advanced application scenarios for building large scale distributed systems.

In addition to presentations on novel applications of RESTful Web services technologies, the workshop program will also include discussions on the limits of the applicability of the REST architectural style, as well as recent advances in research that aim at tackling new problems that may require to extend the basic REST architectural style. The organizers are seeking novel and original, high quality paper submissions on research contributions focusing on the following topics:

Applications of the REST architectural style to novel domains
Design Patterns and Anti-Patterns for RESTful services
RESTful service composition
Testing RESTful services (methods and frameworks)
Inverted REST (REST for push events)
Integration of Pub/Sub with REST
Performance and QoS Evaluations of RESTful services
REST compliant transaction models
Mashups
Frameworks and toolkits for RESTful service implementation
Frameworks and toolkits for RESTful service consumption
Modeling RESTful services
Resource Design and Granularity
Evolution of RESTful services
Versioning and Extension of REST APIs
HTTP extensions and replacements
REST compliant protocols beyond HTTP
Multi-Protocol REST (REST architectures across protocols)

All workshop papers are peer-reviewed and accepted papers will be published as part of the ACM Digital Library. Two kinds of contributions are sought: short position papers (not to exceed 4 pages in ACM style format) describing particular challenges or experiences relevant to the scope of the workshop, and full research papers (not to exceed 8 pages in the ACM style format) describing novel solutions to relevant problems. Technology demonstrations are particularly welcome, and we encourage authors to focus on lessons learned rather than describing an implementation.

Original papers, not undergoing review elsewhere, must be submitted electronically in PDF format. Templates are available here

Easychair page: https://www.easychair.org/conferences/?conf=wsrest2012

Important Dates

Abstract Submission: 3. February 2012
Paper Submission: 10. February 2012
Notification of Acceptance: 8. March 2012
WS-REST 2012 Workshop: 16. April 2012

Program Committee Chairs

Cesare Pautasso, Faculty of Informatics, USI Lugano, Switzerland
Erik Wilde, EMC, USA
Rosa Alarcon, Computer Science Department, Pontificia Universidad de Chile, Chile

Program Committee

Jan Algermissen, Nord Software Consulting, Germany
Subbu Allamaraju, Yahoo Inc., USA
Mike Amudsen, USA
Bill Burke, Red Hat, USA
Benjamin Carlyle, Australia
Stuart Charlton, Elastra, USA
Duncan Cragg, Thoughtworks, UK
Cornelia Davis, EMC, USA
Joe Gregorio, Google, USA
Michael Hausenblas, DERI, Ireland
Rohit Khare, 4K Associates, USA
Yves Lafon, W3C, USA
Frank Leymann, University of Stuttgart, Germany
Alexandros Marinos, Rulemotion, UK
Ian Robinson, Thoughtworks, UK
Sam Ruby, IBM, USA
Richard Taylor, UC Irvine, USA
Stefan Tilkov, innoQ, Germany
Steve Vinoski, Verivue, USA
Olaf Zimmermann, IBM Zurich Research Lab, Switzerland

Contact

WS-REST Web site: https://ws-rest.org/2012/

WS-REST Twitter: https://twitter.com/wsrest2012

WS-REST Email: ws-rest@lists.berkeley.edu

in links: google google blogsearch technorati delicious
[/blipverts] permanent link

Tue, 2011-Dec-06

Best Practices for HTTP API evolvability

REST is the architectural style of the Web, and closely related to REST is the concept of a HTTP API. A HTTP API is a programmer-oriented interface to a specific service, and is known by other names such as a RESTful service contract, resource-oriented architecture, or a URI Space.

I say closely related because most HTTP APIs do not comply with the uniform interface constraint in it's strictest sense, which would demand that the interface be "standard" - or in practice: Consistent enough between different services that clients and services can obtain significant network effects. I won't dwell on this!

One thing we know is that these APIs will change, so what can we do at a technical level to deal with these changes as they occur?

The Moving Parts

The main moving parts of a HTTP API are

The generic semantics of methods used in the API, including exceptional conditions and other metadata
The generic semantics of media types used in the API, including any and all schema information
The set of URIs that make up the API, including specific semantics each generic method and generic media types used in the API

These parts move at different rates. The set of methods in use tend to change the least. The standard HTTP GET, PUT, DELETE, and POST are sufficient to perform most patterns of interactions that may be required between clients and servers. The set of media types and associated schema change at a faster rate. These are less likely to be completely standard, so will often include local jargon that changes at a relatively high rate. The fastest changing component of the API is detailed definition of what each method and media type combination will do when invoked on the various URLs that make up the service contract itself.

Types of mismatches

For any particular interaction between client and server, the following combinations are possible:

The server and client are both built against a matching version of the API
The server is built against a newer version of the API than the client is
The client is built against a newer version of the API than the server is

In the first case of a match between the client and server versions, then there is no compatibility issue to deal with. The second case is a backwards- compatibility issue, where the new server must continue to work with old clients, at least until all of the old clients that matter are upgraded or retired.

Although the first two cases are the most common, the standard nature of methods and media types across multiple services means that the third combination is also possible. The client may be built against the latest version of the API, while an old service or an old server may end up processing the request. This is a forwards-compatibility issue, where the old server has to deal with a message that complies with a future version of the API.

Method Evolution

Adding Methods and Status

The addition of a new method may be needed under the uniform interface constraint to support new types of client/server interactions within the architecture. For HTTP these will likely be any type of interaction that inherently breaks one or more other REST constraints, such as the stateless constraint. However, new methods may be introduced for other reasons such as to improve the efficiency of an interaction.

Adding new methods does not impact backwards-compatibility, because old clients will not invoke the new method. It does impact forwards-compatibility because new clients will wish to invoke the new method on old servers. Additionally, changes to existing methods such as adding a new HTTP status code for a new exceptional condition can break backwards-compatibility by returning a message an old client does not understand.

Best Practice 1: Services should return 501 Not Implemented if they do not recognise the method name in a request

Best Practice 2: Clients that use a method that may not be understood by all services yet should handle 501 Not Implemented by choosing an alternative way of invoking the operation, or raising an exception towards their user in the case that no means of invoking the required operation now exists

Best Practice 3: A new method name should be chosen for a method that is not forwards-compatible with any existing method - i.e. a new method name should be chosen if the new features of the method must be understood for the method to be processed correctly (must understand semantics)

These best practice items deal with a new client that makes a request on and old server. If the server doesn't understand the new request method, it responds with a standard exception code that the client can use to switch to fallback logic or raise a specific error to their user. For example:

Client: SUBSCRIBE /foo
Server: 501 Not Implemented
Client: (falling back to a period poll) GET /foo
Server: 200 OK

Client: LOCK /foo
Server: 501 Not Implemented
Client: (unable to safely perform its operation, raises an exception)

Best Practice 4: Services should ignore headers they do not understand or the components of which they do not understand. Proxies should pass these headers on without modification or the components they do not understand without modification.

Best Practice 5: The existing method name should be retained and new headers or components of headers added when a new method is forwards-compatible with an existing method

These best practice items deal with a new client that makes a request on an old server, but the new features of the method are a refinement of the existing method such as a new efficiency improvement. If the server doesn't understand the new nuances of the request it will treat it as if it were the existing legacy request, and although it may perform suboptimally will still produce a correct result.

Best Practice 6: Clients should handle unknown exception codes based on the numeric range they fall within

Best Practice 7: A new status should be assigned a status code within a numeric range that identifies a coarse-grained understanding of the condition that already exists

Best Practice 8: Clients should ignore headers they do not understand or the components of which they do not understand. Proxies should pass these headers on without modification or the components they do not understand without modification

Best Practice 9: If a new status is a subset of an existing status other than 400 Bad Request or 500 Internal Server Error then refine the meaning of the existing status by adding information to response headers rather than assigning a new status code.

These best practice items deal with a new server sending a new status to the client, such as a new exception.

Removing Methods and Status

Removing an existing method introduces a backwards compatibility problem where clients will continue to request the old method. This has essentially the same behaviour as adding a new method to a client implementation that is not understood by an old service, with the special property that the client is less likely to have correct facilities for dealing with the 501 Not Implemented exception. Thus, methods should be removed with care and only after surveying the population of clients to ensure no ill effects will result.

Removing an existing status within a new client implementation before all server implementations have stopped using the code or variant has similar properties to adding a new status. The same best practice items apply.

Media Type Evolution

Adding Information

Adding information conveyed in media types and their related schemas has an impact on the relationship between the sender of the document and the recipient of the document. Unlike methods and status which are asymmetrical between client and server, media types are generally suitable to travel in either direction as the payload of a request or response. For this reason in this section we won't talk about client and server, but of sender and recipient.

Adding information to the universe of discourse between sender and recipient of documents means either modifying the schema of an existing media type, or introducing a new media type to carry the new information.

Best Practice 10: Document recipients should ignore document content that they do not understand. Proxies and databases should pass this content on without modification.

Best Practice 11: Validation of documents that might fail to meet Best Practice item 10 should only occur if the validation logic is written to the same version of the API as the sender of the document, or a later version of the API

Best Practice 12: If the new information can be added to the schema of an existing media type in alignment with the design objectives of that media type then it should so added

For XML media types this means that recipients processing a given document should treat unexpected elements and attributes in the document as if they were not present. This includes the validation stage, so an old recipient shouldn't discard a document just because it has new elements in it that were not present at the time its validation logic was designed. The validation logic needs to be:

Performed on the sender side, rather than the recipient side
Performed on the recipient side only if the document indicates a version number that the recipient knows is equal to or older than its validation logic, or
Performed on the recipient side only after it has checked to ensure its validation logic is up to date based on the latest version of the media type specification

With these best practice items in place, new information can be added to media type schemas and to corresponding documents. Old recipients will ignore the new information and new recipients are able to make use of it as appropriate. Note that information can still only be added to schemas in ways consistent with the "ignore" rules of existing recipients. If the ignore rule is to treat unknown attributes and elements as if they do not exist, then new extensions must be in the form of new attributes and elements. If they cannot be made in compliance with the existing ignore rules then the change becomes incompatible as per the next few Best Practice items.

Best Practice 13: Clients should support a range of current and recently-superseded media types in response messages, and should always state the media types they can accept as part of the "Accept" header in requests

Best Practice 14: Services should support returning a range of current and recently-superseded media types based on the Accept header supplied by its clients, and should state the actual returned media type in the Content-Type header

Best Practice 15: Clients should always state the media type they have included within any request message in the Content-Type header

Best Practice 16: Services that do not understand the media type supplied in a client request message should return 415 Unsupported Media Type and should include an Accept header stating the types they do support.

Best Practice 17: Clients that see a 415 Unsupported Media Type response should retry their request with a range of current and recently-superseded media types with due heed to the server-supplied Accept header if one is provided, before giving up and raising an exception towards their user.

Content negotiation is the mechanism that HTTP APIs use to make backwards-incompatible media type schema changes. The new media type with the backwards-incompatible changes in its schema is requested by or supplied by new clients. The old media type continues to be requested by and supplied by old clients. It is necessary for recent media types to be supported on the client and server sides until all important corresponding implementations have upgraded to the current set of media types.

Removing Information

Removing information from media types is generally a backwards-incompatible change. It can be done with care by deprecating the information over time until no important implementations continue to depend upon the information. Often the reason for a removal is that it has been superseded by a newer form of the information elsewhere, which will have resulted in information being added in the form of a new media type that supersedes one or more existing types.

URI Space Evolution

Adding Resources or Capabilities

Adding a resource is a service-specific thing to do. No longer are we dealing with a generic method or media type, but a specific URL with specific semantics when used with the various generic methods. Some people think of the URI space being something that is defined in a tree that is separate to the semantics of operations upon those resources. I tend to take a very server-centric view in thinking of it a service contract that looks something like:

GET /invoice/{invoice-id}, returns application/invoice+xml, Return the invoice denoted by invoice-id
GET /invoice/{invoice-id}/paid, returns text/plain (xsd:bool syntax), Return the invoice paid status for the invoice denoted by invoice-id
PUT /invoice/{invoice-id}/paid, accepts text/plain (xsd:bool syntax), Set the invoice paid status for the invoice denoted by invoice-id

Adding new URIs (or more generally, URI Templates) to a service, or adding new methods to be supported for an existing URI do not introduce any compatibility issues. This is because each service is free to structure it resource identifiers in any way it sees fit, so long as clients don't start embedding (too many) URI templates into their logic. Instead, they should use hyperlinks to feel their way around a particular service's URI space wherever possible.

However, this can still become a compatibility issue between instances of a service. If it takes 30 minutes to deploy the update to all servers worldwide then there may well be client out there that are flip flopping between an upgraded server and an old server from one request to the next. This could lead to the client directed to use the new resources, but having their request end up at a server that does not support the new request yet. The best way to deal with this is likely to be to split the client population between new users and old users, and migrate them incrementally from one pool to the next as more servers are upgraded and can cope with new increased new client pool membership. This can be done with specialised proxies or load balancers in front of the main application servers and can be signalled in a number of ways, such as by returning a cookie that indicates which pool the client is currently a member of. Each new request will continue to state which pool the client is a member of, allowing it to be pinned to the upgraded set of servers. Alternatively, the transition could be made based on other identifying data such as ranges of client IP addresses.

Best Practice 18: Clients should support cookies, or a similar mechanism

Best Practice 19: Services should keep track of whether a client should be pinned to old servers or new servers during an upgrade using cookies, or a similar mechanism

Replacing Resources or Capabilities

Often as a URI space grows to meet changing demands, it will need to be substantially redesigned. When this occurs we will want to tear up the old URLs and cleanly lay down the new ones. However, we're still stuck with those old clients bugging us to deal with their requests. We still have to support them or automatically migrate them. The most straightforward way to do this is with redirection.

Best Practice 20: Clients should follow redirection status responses from the server, even when they are not in response HEAD or GET requests

Best Practice 21: When redesigning a URL space, ensure that new URLs exist that have the same semantics as old URLs, and redirect from old to new.

RFC2616 has some unfortunate wording that says clients MUST NOT follow redirection responses unless the request was HEAD or GET. This is harmful and wrong. If the server redirects to a URL that doesn't have the same semantics as the old URL then you have the right to bash their door in an demand an apology, but this redirection feature is the only feature that exists for automated clients to continue working across reorganisations of the URI space. It it madness for the standard to try and step in and stop such a useful feature from working.

By supporting all of the 2616 redirection codes, clients ensure that they offer the server full support in migrating from old to new URI spaces.

Conclusion

I have outlined some of the key best practice items for dealing with API changes in a forwards-compatible and backwards-compatible way for methods, media types, and specific service contracts. I have not covered the actual content of these protocol elements, which depend on other abstraction principles to minimise coupling and avoid the need for interface change. If there is anything you feel I have missed at this technical level, please leave a comment. At some stage I'll probably get around to including any glaring omissions into the main article text.

Thanks guys!

Benjamin

in links: google google blogsearch technorati delicious
[/general] permanent link

Sat, 2011-Sep-24

Systems Design - Requirements as Design

When designing a system as a systems engineer, we are not drawing diagrams for parts or identifying software classes to implement. We are typically operating at a level that includes at least a composition of software and hardware components. Requirements are the problem space for many engineers. From these requirements we synthesise a design that is appropriate for the engineering discipline we work within. For systems engineers involved in producing a design from requirements the solution space is often remarkably similar to the problem space: It is formalised as a set of requirements specifications.

Requirements Analysis, Logical Design, and Physical Design

When beginning to develop a system we go through the following processes:

Requirements Analysis,
Logical Design, and
Physical Design

Requirements Analysis is the process of establishing a system requirements baseline. This process is worthy of its own discussion and can generally be separated from the system design processes. I include it for completeness because it is something that the systems engineering team for a particular system or subsystem will include in their list of responsibilities.

After the team has established requirements baseline (sometimes known as the Functional Baseline) the real design processes begin. Different systems engineering standards and manuals will draw this differently. I'll lay out what I was taught, what I practice, and what I believe to be the most effective approach:

Come up with a physical design of the system (this is referred to in the diagram above as synthesis)
Come up with a logical design of the system (this is referred to in the diagram above as functional analysis)
Perform trade studies, analyses, and other activities

The nature of the "physical" design of the system depends on what system is being designed and at what level. The objective is to build up a model of "things", and the connections between these things. Generally we talk about the "things" of a system as being subsystems. Some examples:

The subsystems of an aeroplane might include airframe, engines, fuel delivery, navigation, and so on and so forth
The subsystems of an enterprise IT system might include various business applications, services, a virtualised networking and virtual machine hosting infrastructure, and again so on and so forth

The identification of subsystems is important, and is not arbitrary. The normal kinds of engineering principles apply such as reuse, cohesion, coupling, information hiding, etc. The physical design usually cannot be derived purely from systems engineering practice or training. It is necessary to understand the type of system being built, and preferably to have some experience as to how the teams that will be responsible for delivering each subsystem are likely to relate to each other and how well the physical design facilitates those interactions.

The connections between subsystems are also critical to identify at this juncture. Firstly, they can be an effective measure of how effective the physical design is minimising coupling and maximising cohesion. Secondly, like the subsystems themselves these interfaces are not going to come out fully formed based on the design of the systems team. They will need to be refined and designed by the teams responsible for each side of the interface. Identifying them now and determining what they need to achieve will ensure that the design of interfaces (like the design of the subsystems themselves) is linked to actual customer need to avoid both gold plating and underengineering outcomes.

As well as connections between the subsystems, the system will have some interfaces that need to be allocated down to subsystems. These external system interfaces will each either need to be allocated down to a single subsystem to complete design of and implement or will need to be split into multiple interfaces and allocated down to multiple subsystems.

Once the physical design of subsystems and subsystem interfaces has reached at least a first sketch stage, the logical design will need to kick in. This process is straight from the systems engineering textbooks:

Choose a system requirement (starting with the most architecturally significant ones)
If this system requirement is met by only one subsystem: Copy the requirement down as a requirement on that subsystem
If this system requirement is met by a combination of subsystems: Come up with a design as to how the requirement is met by the subsystems. Common designs include:
- The requirement is met independently by each subsystem: Copy the requirement down as a requirement on each subsystem
- The requirement is met independently by each subsystem but they consume some kind of shared resource (cost, weight, tolerance, memory usage, CPU consumption, network bandwidth consumption, etc): Define how much of the resource each one gets. Copy the requirement down as a requirement on each subsystem with their allocation in place as part of the requirement. Note that some of the budget may be set aside for contingency and released incrementally over the course of the development.
- The requirement is met by one subsystem providing input to another subsystem: Allocate a new requirement to each subsystem that defines their particular role in the interaction. If no physical interface exists between the subsystems, create one. Add a requirement on the interface to communicate the input from source subsystem to destination subsystem.

Requirements may not be the only element of the logical design. The design may include information models, models of processes that need to be performed, and a variety of other features. In the end, however, requirements allocated to subsystems and subsystem interfaces are the primary output of systems design. These are assembled into the allocated baseline for handover to the engineering teams for each of the subsystems. The engineering teams themselves may be systems engineers who will perform further requirements analysis and systems design, or it may be a domain-specific engineering team who will take the design straight from these subsystem requirements to specific drawings or code.

It may seem strange to operate as a team for whom requirements are both the problem space and the solution space of the engineering effort. I know back in my early days as a software engineer we had great insulated arguments about avoiding including design detail in requirements, and of course that's true: We don't want to include design detail of a particular system in that system's requirements. That will often add risk to the project by limiting the design choices that can be made. What we do want to do is specify requirements on subsystems that reflect their role in an overall system requirement, in compliance with the system design.

To a systems engineer requirements are not simply a vehicle for stating what a particular system or subsystem must do. A list of requirements is a list of design decisions that have been made about the required properties of the system or subsystem in order for it do meet the need of the customer in the context of other subsystems and other systems. Requirements are complete to the extent that the recorded set of requirements results in a low risk that a solution developed to be compliant with these requirements fails to meet the underlying customer need.

Not all properties of a subsystem need to be identified as subsystem requirements. For example, the amount of memory provided by a hardware subsystem to a particular subsystem may not be defined at this level. Instead, we might specify the maximum cost of the hardware and the software as well as the level of performance required and other related factors. In doing so we can defer some of these decisions to the technical teams that are best placed to make the decisions. These deferred decisions that have an impact between subsystems will need to be specified as part of the Interface Design Descriptions agreed between the subsystem owners in compliance with subsystem and interface requirements.

As well as dealing with the traced requirements directly, a number of analyses may throw up new requirements. For example, a functional failure analysis may question what happens when a particular kind of input across a particular subsystem interface arrives late, out of order, corrupted, or not at all. New requirements may appear in the allocated baseline out of this process, either for the source subsystem and interface to ensure that these things do not occur or for the recipient subsystem to deal with the situation if it does occur. With these new requirements also come new verification requirements as to what testing or other steps need to occur on the subsystem before it is accepted for system integration testing.

After an initial design baseline has been developed that is believed to meet the system requirements, a series of trade studies can be performed to compare the properties of the baseline design to alternatives that may exist. After a number of iterations of this process the development of the subsystems themselves can begin in earnest. It is helpful during these trade studies or later when customer requests or subsystem development activities throw up questions or problems with requirements to maintain effective traceability between system and subsystem requirements across multiple levels of systems engineering activities. This allows the impacts of requirements changes to be quickly assessed across the project as a whole in order to manage change. This change process can also be assisted by ensuring that a specific individual is responsible for each system and subsystem requirements specification. When these individuals get together to sign off on the impact of a change, they are known as a Change Control Board.

Benjamin

in links: google google blogsearch technorati delicious
[/systems] permanent link

Sat, 2011-Sep-24

An Overview of Systems Engineering

Over the last few years I have made the transition from focusing on software architecture to systems engineering. It's a field that incorporates a number of different roles, processes, and technologies. No, it's not systems administrator with "engineer" patched on the end. This field does not have much in the way of overlap with qualifications such as Microsoft Certified Systems Engineer. To avoid confusion I often talk about being an INCOSE-style systems engineer. The International Council on Systems Engineering is the peak body for this kind of work.

There are two basic ways to look at what a systems engineer does. One is is top down while the other is bottom up. The bottom up perspective is roughly

When we build complex systems we quickly reach a level where one small, well-disciplined team is not sufficient to deliver it. A systems engineering team is one that sits above several nuts and bolts delivery teams. Their job is to coordinate between the teams by:

Instructing the teams as to what they each individually will need to build

Taking input from the teams as to what is feasible, and adjusting the overarching design as needed to deliver the system as a whole

Taking product from the individual teams and assembling it into a cohesive, verified whole in line with the design and end user requirements.

The top down perspective is a little more like this

Customers need complex systems built that no one team can deliver. Someone needs to sit as the customer representative ensuring that a customer delivery focus exists at every level of the design. That means,

Having someone who can connect low level design decisions to real customer requirements and need

Being able to eliminate gold plating in excess of the user need

Ensuring that the product at the end really does meet the user need

Systems engineering works across all engineering disciplines to coordinate their activities and to align them to customer needs. It adds a technical chain of command to a large project alongside the project management chain of command that maximises efficiency and minimises risk. While the core focus of project management is on controlling scope and budget, the core focus of technical management is on controlling quality, value, and delivery efficiency. Together project and systems disciplines work to control project risk.

INCOSE defines Systems Engineering as:

an interdisciplinary approach and means to enable the realization of successful systems. It focuses on defining customer needs and required functionality early in the development cycle, documenting requirements, then proceeding with design synthesis and system validation while considering the complete problem:

Operations

Performance

Test

Manufacturing

Cost & Schedule

Training & Support

Disposal

Systems Engineering integrates all the disciplines and specialty groups into a team effort forming a structured development process that proceeds from concept to production to operation. Systems Engineering considers both the business and the technical needs of all customers with the goal of providing a quality product that meets the user needs.

Systems engineering is a recursive approach to deliver large projects that meet stakeholder needs.

Benjamin

in links: google google blogsearch technorati delicious
[/systems] permanent link

Sun, 2011-May-22

Scanning data with HTTP

As part of my series on SCADA and REST I'll talk in this article a little about doing telemetry with REST. There are a couple of approaches, and most SCADA protocols accidentally incorporate at least some elements of REST theory. I'll take a very web-focused approach and talk about how HTTP can be used directly for telemetry purposes.

HTTP is by no means designed for telemetry. Compared to Modbus, DNP, or a variety of other contenders it is bandwidth-hungry and bloated. However, as we move towards higher available bandwidth with Ethernet communications and wider networks that already incorporate HTTP for various other purposes it becomes something of a contender. It exists. It works. It has seen off every other contender that has come its way. So, why reinvent the wheel?

HTTP actually has some fairly specific benefits when it comes to SCADA and DCS. As I have already mentioned it works well with commodity network components due to its popularity on the Web and within more confined network environments. In addition to that it brings with it benefits that made it the world's favourite protocol:

It is a mature standard with little interoperabilty risk
It is text-based, easy to intercept, and easy for a human to debug. I can't tell you how useful that is on the Web, but in SCADA environments that might be deployed for upwards of 15 to 20 years it's a potential godsend. How many protocols around today would you bet money on will still be supported by tooling that far into the future?
It is able to be passed through multiple layers of middleware such as routers, security devices, inspectors and loggers, policy enforcers, and pretty much anything else you can think of. Importantly, as with many SCADA protocols it follows a regular enough structure for these devices to make sense of the messages and do the right thing with them. The operation each request is trying to perform is always in the same place in the message. The subject of the operation is always available as the URL of the request. The data being transferred itself can be also be understood and modified as needed by the device. Not only that, but unlike most SCADA protocols every firewall on earth knows how to deal with it already. It's a doddle to be able to restrict the type of operation allowed through a particular access point or the specific addresses that should be allowed.
HTTP is strightforward to secure in ways that are compatible with modern network architecture.

So how do we bridge this gap between grabbing web pages from a server to grabbing analogue and digital values from a field device? Well, I'll walk down the naive path first.

Naive HTTP-based telemetry

The simplest way to use HTTP for telemetry is to:

Identify each data item you want to scan from the device, and give each one a URL
Have the master (aka the client) periodically request each data item
Have the slave (aka the server) respond using a media type that both client and server understand

So for example, if I want to scan the state of circuit breaker from a field device I might issue the following HTTP request:

GET https://rtu20.prc/CB15 HTTP/1.1
Expect: text/plain

The response could be:

HTTP/1.1 200 OK
Content-Type: text/plain

CLOSED

... which in this case we would take to mean circuit breaker 15 closed. Now this is a solution that has required us to do a fair bit of configuration and modelling within the device itself, but that is often reasonable. An interaction that moves some of that configuration back into the master might be:

GET https://rtu20.prc/0x13,2 HTTP/1.1
Expect: application/xsd-int

The response could be:

HTTP/1.1 200 OK
Content-Type: application/xsd-int

2

This could mean, "read 2 bits from protocol address 13 hex" with a response of "bit 0 is low and bit 1 is high" resulting in the same closed status for the breaker.

HTTP is not fussy about the exact format of URLs. Whatever appears in the path component is up to the server, and ends up acting as a kind of message from the server to itself to decide what the client actually wants to do. More context or less context could be included in order to ensure that the response message is what was expected. Different devices all using HTTP could have different URL structures and so long as the master knew which URL to look up for a given piece of data would continue to interoperate correctly with the master.

Scanning a whole device

So scanning individual inputs is fine if you don't have too many. When you use pipelined HTTP requests this can be a surprisingly effective way of performing queries to an ad hoc input set. However, in SCADA we usually do know ahead of time what we want the master to scan. Therefore it makes sense to return multiple objects in one go.

This can again be achieved simply in HTTP. You need one URL for every "class" of scan you want to do, and then the master can periodically scan each class as needed to meet its requirements. For example:

GET https://rtu20.prc/class/0 HTTP/1.1
Expect: application/xsd-int+xml

The response could be:

HTTP/1.1 200 OK
Content-Type: application/xsd-int+xml

<ol>
	<li>2</li>
	<li>1</li>
	<li>0</li>
</ol>

Now, I've thrown in a bit of xml there but HTTP can handle any media type that you would like to throw at it. That includes binary types, so you could even reuse elements of existing SCADA protocols as content for these kinds of requests. That said, the use of media types for even these simple interactions is probably the key weakness of the current state of standardisation for the use of HTTP in this kind of setting. This is not really HTTP's fault, as it is designed to be able to evolve independently of the set of media types in use. See my earlier article on how the facets of a REST uniform contract are designed to fit together and evolve. However, this is where standardisation does need to come into the mix to ensure long-term interoperability of relevant solutions.

The fundamental media type question is, how best to represent the entire contents of an I/O list in a response message. Now, the usual answer on the Web is XML and I would advocate a specific XML schema with a specific media type name to allow the client to select it and know what it has when the message is returned.

In this case once the client has scanned class 0, they are likely to want to scan class 1, 2, and 3 at a more rapid rate. To avoid needing to configure all of this information into the master, the content returned on the class 0 scan could even include this information. For example the response could have been:

HTTP/1.1 200 OK
Content-Type: application/io-list+xml

<ol>
	<link rel="class1" href="https://rtu20.prc/class/1"/>
	<link rel="class2" href="https://rtu20.prc/class/2"/>
	<link rel="class3" href="https://rtu20.prc/class/3"/>
	<li>2</li>
	<li>1</li>
	<li>0</li>
</ol>

The frequency of scans could also be included in these messages. However, I am a fan of using cache control directives to determine scan rates. Here is an example of how we can do that for the class 0 scan.

HTTP/1.1 200 OK
Date: Sat, 22 May 2011 07:31:08 GMT
Cache-Control: max-age=300
Content-Type: application/io-list+xml

<ol>
	<link rel="class1" href="https://rtu20.prc/class/1"/>
	<link rel="class2" href="https://rtu20.prc/class/2"/>
	<link rel="class3" href="https://rtu20.prc/class/3"/>
	<li>2</li>
	<li>1</li>
	<li>0</li>
</ol>

This particular response would indicate that the class 0 scan does not need to be repeated for five minutes. What's more, caching proxies along the way will recognise this information and are able to return it on behalf of the field device for this duration. If the device has many different master systems scanning it then the proxy can take some of the workload off the device itself in responding to requests. Master systems can still cut through any caches along the way for a specific integrity scan by specifying "Cache-Control: no-cache".

Delta Encoding

Although using multiple scan classes can be an effective way of keeping up to date with the latest important changes to the input of a device, a more general model can be adopted. This model provides a general protocol for saying "give me the whole lot", and "now, give me what's changed".

Delta encoding can be applied to general scanning, but is particularly appropriate to sequence of event (SOE) processing. For a sequence of events we want to see all of the changes since our last scan, and usually we also want these events to be timestamped. Some gas pipeline and other intermittently-connected systems have similar requirements to dump their data out onto the network and have the server quickly come up to date, but not lose its place for subsequent data fetches. I have my own favourite model for delta encoding, and I'll use that in the examples below.

GET https://rtu20.prc/soe HTTP/1.1
Expect: application/sequence-of-events+xml

The response could be:

HTTP/1.1 200 OK
Content-Type: application/sequence-of-events+xml
Link: <https://rtu20.prc/soe?from=2011-05-22T08:00:59Z>; rel="Delta"

<soe>
	<event
		source="https://rtu20.prc/cb20"
		updated="2011-05-22T08:00:59Z"
		type="application/xsd-int+xml"
		>2</event>
</soe>

The interesting part of this response is the link to the next delta. This response indicates that the device is maintaining a circular buffer of updates, and so long as the master fetches the deltas often enough it will be able to continue scanning through the updates without loss of data. The next request and response in this sequence are:

GET https://rtu20.prc/soe?from=2011-05-22T08:00:59Z HTTP/1.1
Expect: application/sequence-of-events+xml

The response could be:

HTTP/1.1 200 OK
Content-Type: application/sequence-of-events+xml
Link: <https://rtu20.prc/soe?from=2011-05-22T08:03:00Z>; rel="Next"

<soe>
	<event
		source="https://rtu20.prc/cb20"
		updated="2011-05-22T08:02:59.98Z"
		type="application/xsd-int+xml"
		>0</event>
	<event
		source="https://rtu20.prc/cb20"
		updated="2011-05-22T08:03:00Z"
		type="application/xsd-int+xml"
		>1</event>
</soe>

The master has therefore seen the circuit breaker transition from having the closed contact indicating true, through a state where neither the closed or open contact were firing and within 20ms to a state where the open contact is list up.

Conclusion

Although a great deal of effort has been expended in trying to bring SCADA protocols up to date with TCP and the modern Internet, we would perhaps have been better off spending our time leveraging the protocol that the Web has already produced for us and concentrating our efforts on standardising the media types need to convey telemetry data in the modern networking world.

There is still plenty of time for us to make our way down this path, and many benefits in doing so. It is clearly a feasible approach comparable to those of conventional SCADA protocols and is likely to be a fundamentally better solution due primarily it its deep acceptance across a range of industries.

Benjamin

in links: google google blogsearch technorati delicious
[/industrial-rest] permanent link

Tue, 2011-May-03

The REST Constraints (A SCADA perspective)

REST is an architectural style that lays down a predefined set of design decisions to achieve desirable properties. Its most substantial application is on the Web, and is commonly confused with the architecture of the Web. The Web consists of browsers and other clients, web servers, proxies, caches, the Hypertext Transfer Protocol (HTTP), the Hypertext Markup Language (HTML), and a variety of other elements. REST is a foundational set of design decisions that co-evolved with Web architecture to both explain the Web's success and to guide its ongoing development.

Many of the constraints of REST find parallels in the SCADA world. The formal constraints are:

Client-Server

The architecture consists of clients and servers that interact with each other via defined protocol mechanisms. Clients are generally anonymous and drive the communication, while servers have well-known addresses and process each request in an agreed fashion.

This constraint is pretty ubiquitous in modern computing and is in no way specific to REST. In service-orientation the terms client and server are usually replaced with "service consumer" and "service provider". In SCADA we often use terms such as "master" and "slave".

The client-server constraint allows clients and servers to be upgraded independently over time so long as the contract remains the same, and limits coupling between client and server to the information present in the agreed message exchanges.

Stateless

Servers are stateless between requests. This means that when a client makes a request the server side is allowed to keep track of that client until it is ready to return a response. Once the response has been returned the server must be allowed to forget about the client.

The point of this constraint is to scale well up to the size of the world wide web, and to improve overall reliability. Scalability is improved because servers only need to keep track of clients they are currently handling requests for, and once they have returned the most recent response they are clean and ready to take on another request from any client. Reliability is improved because the server side only has to be available when requests are being made, and does not need to ensure continuity of client state information from one request to another across restart or failover of the server.

Stateless is a key REST constraint, but is one that needs to be considered carefully before applying it to any given architecture. In terms of data acquisition it means that every interaction has to be a polling request/response message exchange as we would see in conventional Modbus telemetry. There would be no means to provide unsolicited notifications of change between periodic scans.

The benefits of stateless on the Web are also more limited within an industrial control system environment, where are more likely to see one concurrent client for a PLC or RTU's information rather than the millions we might expect on the Web. In these settings stateless is often applied in practice for reasons of reliability and scalability. It is much easier to implement stateless communications within an remote terminal unit than it is to support complex stateful interactions.

Cache

The cache constraint is designed to counter some of the negative impact that comes about through the stateless constraint. It requires that the protocol between client and server contain explicit cacheability information either in the protocol definition or within the request/response messages themselves. It means that multiple clients or the same polling client can reuse a previous response generated by the server under some circumstances.

The importance of the cache constraint depends on the adherence of an architecture to the stateless constraint. If clients are being explicitly notified about changes to the status of field equipment then there is little need for caching. The clients will simply accept the updates as they come in and perform integrity scans at a rate they are comfortable with.

Cache is not a common feature of SCADA systems. SCADA is generally built around the sampling of input that can change at any time, or at least can change very many times per second. In this environment the use of caching doesn't make a whole lot of sense, but we still see it in places such as data concentrators. In this setting a data concentrator device scans a collection of other devices for input. A master system can then scan the concentrator for its data rather than reaching out to individual servers. Cache can have significant benefits as systems get larger and as interactions between devices becomes more complex.

Layered System

The layered constraint is where we design in all those proxies that have become so troublesome, but much of the trouble has come from SCADA protocols not adhering well to this constraint. It says that when a client talks to a server, that client should not be able to tell whether it is talking to the "real" server or only to a proxy. Likewise, a server should not be able to tell whether it is talking to a "real" client or a proxy. Clients and servers should not be able to see past the layer they are directly interacting with.

This is a constraint that explicitly sets out to do a couple of things. First of all it is intended to let proxies at important locations get involved in the communication in ways that they otherwise could not. We could have a proxy that is aggregating data together for a particular section of a factory, railway line, power distribution network, etc. It could be acting as a transparent data concentrator, the sole device that is scanning the PLCs and RTUs in that area ensuring that each one only has to deal with the demands of a single client. However, that aggregator could itself answer to HMIs and other subsystems all over the place. In a REST architecture that aggregator would be speaking the same protocol to both the PLCs and to its own clients and clients would use the same protocol address to communicate with the proxy as it would the real device. This transparency allows the communications architecture to be modified in ways that were not anticipated in early system design. Proxies can easily be picked up, duplicated, reconfigured, and reused elsewhere and do a similar job without needing someone to reimplement it from scratch and without clients needing to explicitly modify their logic to make use of it.

The second thing it sets out to do is allow proxies to better scrutinise communication that passes through them based on policies that are important to the owner of the proxy. The proxy can be part of a firewall solution that allows some communication and blocks other communication with a high degree of understanding of the content of each message. Part of the success of HTTP can be put down to the importance of the Web itself, but one view of the success of HTTP in penetrating firewalls is that it gives just the right amount of information to network owners to allow them to make effective policy decisions. If a firewall wants to wall off a specific set of addresses it can easily do so. If it want to prevent certain types of interactions then this is straightforward to achieve.

Code on demand

There are really two variants of REST architecture. One that includes the code on demand constraint, and another that does not contain this constraint. The variant of REST that uses code on demand requires that clients include a virtual machine execution environment for server-provided logic as part of processing response messages.

On the Web you can read this constraint as directives like "support javascript" and "support flash" as well as more open-ended directives such as "allow me to deploy my application to you at runtime". The constraint is intended to allow more powerful and specific interactions between users and HMIs than the server would have otherwise been able to make happen. It also allows more permanent changes to be deployed over the network, such as upgrading the HMI software to the latest version.

Code on demand arguably has a place in SCADA environments for tasks like making HMI technology more general and reusable, as well as allowing servers of every kind to create more directed user interactions such as improving support for remotely reconfiguring PLCs or remotely deploying new configuration.

Uniform Interface

Uniform Interface is the big one. That's not only because it is the key constraint that differentiates REST from other styles of architecture, but because it is the feature so similar between REST and SCADA. I covered the uniform interface constraint previously from a SCADA perspective. It is central to REST and SCADA styles of architecture, but is a significant departure from conventional software engineering. It is what makes it possible to plug PLCs and RTUs together in ways that are not possible with conventional software systems. It is the core of the integration maturity of SCADA systems and of the Web that is missing from conventional component and services software.

Benjamin

in links: google google blogsearch technorati delicious
[/industrial-rest] permanent link

Sat, 2011-Apr-23

The REST Uniform Contract

One of the key design decisions of REST is the use of a uniform contract. Coming from a SCADA background it is hard to imagine a world without a uniform contract. A uniform contract a common protocol for accessing a variety of devices, software services, or other places where I/O, logic, or data storage happens. The whole point of SCADA is acquiring data from diverse sources and sending commands and information to the same without having to build custom protocol converters for each individual one. Surprisingly, this is a blind spot to most software engineering. It's a maturity hole that normally requires every service consumer to implement specific code to talk to each service in the architecture.

SOAP and WSDL are built on this style of software architecture, where every service in the system has a unique protocol to access the capabilities of the service. There is no common protocol mechanism to go and fetch information. There is no common mechanism to store information. What commonality exists between the protocols of different services exists at a lower level. Services define a variety of read and write operations. SOAP ensures these custom operation names are encoded into XML in a consistent way that can be encapsulated for transport across a variety of network architectures, and WSDL ensures there is a common way for the service to embed this protocol information into the integrated development environments for service consumers, as well as into the service consumers themselves.

The contract mechanism simplifies the task of processing messages sent between service and consumer, but still couples service and consumer together at the network level and at the software code level so that each consumer can only work with the one service that implements the contract.

OPC-UA and OPC are built on SOAP and COM, respectively. SOAP and COM both share this low level of protocol abstraction and both OPC and OPC-UA compensate for this by defining a service contract that not only one service implements but that every OPC DA Server or related server needs to implement in order for consumers to be able to communicate with them without custom per-service message processing logic and without custom per-service protocol. For this reason they are a good case study to contrast the features of SOAP and HTTP for industrial control purposes.

HTTP is the current standard protocol for one aspect of the REST uniform contract. In fact, there are two other key aspects. A REST uniform contract is a triangle of:

A standard syntax for resource identifiers
A standard protocol ("methods") for accessing resources
A standard set of types ("media types") that can be transferred

As all SCADA systems use some form of uniform contract, it is useful to understand the key design feature of a REST uniform contract compared to a conventional SCADA contract. In a conventional bandwidth-conservative SCADA protocol it is common to define fetch, store, and control operations that are each able to handle a defined set of types. These types might include a range of integer and floating point values, bit-fields, and other types. As I look back over the protocols I have used over my career I consider that some of the protocol churn we have seen over time has been because of the range of types available. Each time we need a new type we either have to change the protocol, start using a different protocol, or start to tunnel binary data through our existing protocol in ways that are custom or special to the particular interaction needed.

REST takes a different approach where the protocol "methods" are decoupled from the set of media types. This adds a little protocol overhead where we need to insert an identifier for the media type along with every message we send, and a long one at that. Examples of media type identifiers on the Web include text/plain, text/html, image/jpeg, image/svg+xml, and application/xhtml+xml. These type names are long, and they have to be to ensure uniqueness. We wouldn't normally tolerate type identifiers of this length in bandwidth-conservative SCADA protocols, but where we can assume the use of Ethernet comms and other fast communication bearers the massive inefficiency in these identifiers can be tolerated.

The reason we would want to tolerate identifiers like this is because they allow our main protocol to be independent of the types that are transferred across it. There is no need to change protocol just because you need to send new types of information. The set of types can evolve separately to the main protocol, and experience on the Web and in SCADA environment suggests that this is an excellent property for the application protocol to have. Types of data that need to be moved around need to be changed, extended, and customised far more often than the ways that the information needs to be moved around. You can essentially think of the REST uniform interface constraint as a decision to use a SCADA-like protocol but to explicitly separate out the types of information to ensure longevity of the protocol in use.

This brings us back to OPC and OPC-UA. Although they are layered on top of COM and SOAP they bring back some of the uniform contract constraint. They allow some variation of media type through the use VARIANT to convey custom types. However, they don't go all the way. In a REST environment we would not have a special protocol for data acquisition, another for alarms and events, and another for historical data. We would be looking to define one application protocol that could be used for all of these purposes in conjunction with specific media types. Perhaps not all of the features of that protocol would be used for all of these purposes, but they would be available and consistent across the architecture.

On the Web the application protocol is HTTP. It has features to GET, and to PUT, and to do all the basic things you would expect for a master/slave protocol. It is relatively efficient, especially when compared to a solution that tunnels SOAP messages over HTTP, and then OPC messages back over the SOAP. A simpler solution would see OPC make use of HTTP directly, and tie its future evolution to that of HTTP rather than to a three layer hierarchy.

It is conceivable that HTTP would require further work or some extension before it is completely suitable for use as a SCADA protocol, and I'll put together a few observations on this front in a later post. However, if HTTP can be adopted as the foundation for future SCADA systems that have reasonable bandwidth available to them then it will result both in a system that is more efficient than something like OPC-UA but is also more at home in a world of web proxies and firewalls. HTTP is the protocol of the Web, and REST is the foundation behind HTTP. HTTP is and will remain more at home in complex internetworking environments than COM, SOAP, or any other custom contract definition mechanism. I would predict that disciplined application of the REST uniform interface constraint in conjunction with HTTP will produce a consistently better and more robust technical solution to the problems of SCADA systems.

Benjamin

in links: google google blogsearch technorati delicious
[/industrial-rest] permanent link

Wed, 2011-Apr-13

Industrial REST

REST is the foundation of the Web, and is increasingly relevant to enterprise settings. I hail from a somewhat different context of industrial control systems. I have been thinking of putting together a series of articles about REST within this kind of setting to share a bit of experience and to contrast various approaches.

REST is an architectural style that lays down a set of design constraints that are intended to create various desirable properties in an architecture. It is geared towards the architecture of the Web, but has many other applications. REST makes and excellent starting point for the development of Supervisory Control and Data Acquisition systems (SCADA).

SCADA systems are usually built around SCADA protocols such as Modbus, OLE, or DNP. Exactly what protocol is used will depend on a variety of factors such as the particular control industry we happen to be working in, preferences of particular customers, and the existing installed base.

The SCADA protocol plays the same role in a SCADA system as HTTP plays on the Web. It is pitched at about the same level, and has many similar properties. If we are to reimagine the SCADA system as a REST-compliant architecture then the SCADA protocol would be the application protocol we would have in use.

SCADA protocols have been developed over a long period of time to be typically very bandwidth-efficient and to solve specific problems well. However, we have been seeing for a long time now across our industries the transition from slow serial connections to faster ethernet. We have been seeing the transition from modem communication to broadband between distant sites. Many of the benefits of existing protocols are being eaten away as they are shoehorned into internet-based environments and are needing to respond to new security challenges and the existence of more complex intermediary components such as firewalls and proxies. We see protocols such as OPC responding by adopting SOAP over HTTP as a foundation layer and then implementing a new SCADA protocol on top of this more complex stack.

I would like to make the case for a greater understanding of REST in the industrial communications world, a new vision of how industrial communications interacts with intranet environments, and to identify some of the areas where HTTP as the main REST protocol of today is not quite up to snuff for the needs of a modern control systems world.

Benjamin

in links: google google blogsearch technorati delicious
[/industrial-rest] permanent link

Wed, 2011-Feb-09

Jargon in REST

Merriam-Webster defines jargon as the technical terminology or characteristic idiom of a special activity or group. In REST or REST-style SOA there are really two different levels where jargon appears:

Jargon (within a service inventory): Methods, patterns of client/server interaction, media types, and elements thereof that are only used by a small number of services and consumers.
Jargon (between service inventories): Methods, patterns of client/server interaction, media types, and elements thereof that are only used by a small number of service inventories.

Jargon has both positive and negative connotations. By speaking jargon between a service and its consumers the service is able to offer specific information semantics that may be needed in particular contexts. The service may be able to offer more efficient or otherwise more effective interactions between itself and its consumers. These are positive features. In contrast there is the downside of jargon: It is no longer possible to reuse or dynamically recompose service consumers with other services over time. More development effort is required to deal with custom interactions and media types. Return on investment is reduced, and the cost of keeping the lights on is increased.

Agility is one property that can both be increased and reduced through use of jargon. An agile project can quickly come along and build the features they need without propagating these new features to the whole service inventory. In the short term this increases agility. However, the failure to reuse more general vocabulary between services and consumers means that generic logic that would normally be available to support communication between services and consumers is necessariy missing. Over the long term this reduces the agility of the business in delivering new functionality.

The REST uniform interface constraint is a specific guard against jargon. It sets the benchmark high: All services must express their unique capabilities in terms of a uniform contract composed of methods, media types, and resource identifier syntax. Service contracts in REST are transformed into tuples of (resource identifier template, method, media types, and supporting documentation). Service consumers take specific steps to decouple themselves from knowledge of the individual service contracts and instead increase their coupling on the uniform contract instead.

However, a uniform contract that contains significant amounts of jargon defeats the uniform interface constraint. At one level we could suggest that the world should look just like the HTML web, where everyone uses the same media types with the same low-level semantics of "something that can be rendered for a human to understand". I would suggest that a business IT environment demands a somewhat more subtle interpretation than that.

That the set of methods and interactions used in a service inventory should be standard and widely used across that service inventory is relatively easy to argue. Each such interaction describes a way of moving information around in the inventory, and there are really not that many ways that information needs to be able to move from one place to another. Once you have covered fetch, store, and destroy you can combine these interactions with the business context embodied in a given URL to communicate most information that you would want to communicate.

The set of media types adds more of a challenge, especially in a highly automated environment. It is important for all services to exchange information in a format that preserves sufficiently-precise machine-readable semantics for its recipients to use without guessing. There are far more necessary kinds of information in the world then there are necessary ways of moving information around, so we are always going to see a need for more media types than methods when machines get involved.

The challenges for architects when dealing with jargon in their uniform contracts are to:

Ensure that the most widely used and understood media type available is used to encode a particular kind of information, at least as an alternative supported by content negotiation. This significantly reduces coupling between services within an inventory and between service inventories as they each come to increase coupling on independently-defined standards instead of their own custom jargon.
Ensure that the semantics of jargon methods and media types are no more precise than required, to maximise reusability. In particular, if the required semantics for a field are "a human can read it" then no further special schema is required. This approach significantly reduces coupling between sender and recipient because the recipient does not have to do any custom decoding and formatting of data before presenting it to the human. Changes to the type of information presented to the user can be made without modifying the recipient's logic.
Every new method, interaction, media type, link relation, or any other facet of communication begins its life as jargon. Warnings against jargon should not amount to a ban on new features of communication. When jargon is required, set about on a strategy to promote the new jargon to maximise its acceptance and use both within a service inventory and between service inventories.
Feed back experience from discovered media types into information modelling and high level service design processes to maximise alignment between required and available semantics. For example, vcard data structures can be adopted as the basis for user information within data models used by services.

Only by increasing the quality of agreements and understanding between humans can our machines come to communicate more effectively and with reduced effort. It is the task of humans to reduce the jargon that exists in our agreements, to increase our coupling to independently-defined communication facets, and to reduce our coupling to service-specific or inventory-specific facets.

Benjamin

in links: google google blogsearch technorati delicious
[/general] permanent link

Wed, 2011-Jan-12

B2B Applications for REST's Uniform Contract constraint

REST's uniform interface constraint (or uniform contract constraint) requires service capabilities be expressed in a way that is "standard" or consistent across a given context such as a service inventory. Instead of defining a service contract in terms of special purpose methods and parameter lists only understood by that particular service, we want to build up a service contract that leverages methods and media types that are abstracted away from any specific business context. REST-compliant service contracts are defined as collections of lightweight unique "resource" endpoints that express the service's unique capabilities through these uniform methods and media types.

To take a very simple example, consider how many places in your service inventory demand that a service consumer fetch or store a simple type such as an integer. Of course the business context of that interaction is critical to understanding what the request is about, but there is a portion of this interaction that can be abstracted away from a specific business context in order to loosen coupling and increase reuse. Let's say that we had a technical contract that didn't specifically say "read the value of the temperature sensor in server room A", or "getServerRoomATemperature: Temperature" but instead was more specific to the type of interaction being performed and the kind of data being exchanged. Say: "read a temperature sensor value" or "GET: Temperature".

What this would allow us to do is to have a collection of lightweight sensor services that we could read temperature from using the same uniform contrct. The specific service we decided to send our requests to would provide the business context to determine exactly which sensor we intended to read from. Moreover, new sensors could be added over time and old ones retired without changing the uniform interface. After all, that particular business context has been abstracted out of the uniform contract.

This is very much how the REST uniform contract constraint works both in theory and in practice. We end up with a uniform contract composed of three individual elements: The syntax for "resource" or lightweight service endpoint identifiers, the set of methods or types of common interactions between services and their consumers, and the set of media types or schemas that are common types or information sets that are exchanged between services and their consumers. By building up a uniform contract that focuses on the what of the interaction, free from the business context "why" we are free to reuse the interface in multiple different business contexts. This in turn allows us to reuse service consumers and middleware just as effectively as we reuse services, and to compose and recompose service compositions at runtime without modification to message processing logic and without the need for adaptor logic.

On the web we see the uniform contract constraint working clearly with various debugging and mashup tools, as well as in the browser itself. A browser is able to navigate from service to service during the course of a single user session, is able to discover and exploit these services at runtime, and is able to dynamically build and rebuild different service compositions as its user sees fit. The browser does not have to be rebuilt or redeployed when new services come along. The uniform interface's focus on what interaction needs to occur and on what kind of information needs to be transferred ensures that the services the browser visits along the way are able to be interacted with correctly with the individual URLs providing all of the business context required by the browser and service alike.

When we talk about service-orientation and service-orientated architecture, we move into a world with a different set of optimisations than that of the Web. There will clearly be cases where the uniform interface constraint significantly reduces complexity. Maybe we have a generic dashboard application. Maybe we have a generic data mining application. By interacting with different services and different capabilities using the same types of intraction and the same types of data these kinds of service consumers are significantly simplified the robustness of the architecture as a whole can improve. However, we start to run into some questions about the appicability of the constraint we we reach entity services within a true service-oriented architecture.

One of the key properites of a well-run SOA is that service logic and data housing is normalised. We usually end up with a layer of services that capture different kinds of important business entities and the operations that are legal to perform on these entities. Along with many of these entities we can expect to find special schemas or media types that correspond to them: An invoice type for an invoice service, a customer type for a customer service, a timetable type for a timetable service, etc etc.

As each normalised service introduces its own new media types, the unifrom contract constraint can quickly retreat. If we are talking about invoices then we are probably talking to the invoice service. If we are talking to the invoice service, and this is the only service that knows about invoices, then what other services are we supposed to have a uniform interace with exactly?

To me there are two basic answers to this. The first is that entity services are not the whole story. There is generally a layer of task services that sit on top of these entity services that will also need to talk about invoices and other entity types. Sharing a common interface between these task services will significantly increase the runtime flexibility of service compositions in most businesses. The second answer is that the uniform contract constraint is particularly applicable when service denormalisation does occur. This may occur within businesses through various accidents of history, but almost certainly will occur between businesses or between significant sectors of a business that operate their own independent service inventories.

Service-orientation generally ends at a service inventory boundary. Sure we have patterns like domain inventory where we all try to get together and play nicely to ensure that service consumers can be written effectively against a collection of service inventories... but ownership becomes a major major issue when you start to get different businesses or parts of a business that compete with each other at one level or another. If I am in competition with you, there is no way that your services and my services can become normalised. They will forever overlap in the functionality that we compete against each other in or with. This is where a uniform contract approach can aid service consumer development significantly, especially where elements of the uniform contract of a given service inventory are common to related inventories or comply with broader standards.

Consider the case where we want to have a service consumer do automatic restocking of parts from a range of approved suppliers. Our service consumer will certainly be easier to write and easier to deal with if the interface to supplier A's service is the same as the interface to supplier B's service. Such an interface will be free of the business context of whether we are talking to supplier A or supplier B, and instead will focus on the type of interfaction we want to have with each service and the type of information we want to exchange with the service. Moreover, once this uniform interface is in place we can add supplier C at minimal cost to us so long as they comply with the same interface.

The unifrom contract and the marketplace build each other up in a virtuous cycle, and eventually we see a tipping point as we saw on the early Web where the cost of adding support for the interface to services and to consumers falls drastically compared to the value of participating in the marketplace. The more people use a HTTP "GET" request to fetch data, the easier and more valuable it becomes to add support for that request to services and consumers. The more people use a html format to exchange human-readable data, the easier and more valuable it becomes to add support for that type of data to services and consumers. The same is true for more special-purpose media types and even for more special purpose interaction types.

At another level, consider the problem of trying to keep customer records up to date. Rather than trying to maintain our own database of customer details, what if we could fetch the data directly from a service that the customer owned or operated whenever we needed it? Again, this sort of interaction would benefit from having a uniform contract in place. Our service consumer may itself be our customer service, doing a periodic scrape of relevant data, but whatever form that consumer takes it is valuable for us to be able to grab that data over a uniform interface to avoid needing to develop special message processing logic for each customer we wanted data from. Likewise, it could become valuable enough to have one of these services that the customer would provide it for all of their suppliers. Having one interface in this case benefits the customer as well in not having to support a different interface for each of various suppliers.

The REST uniform contract constraint sets the bar of interoperability high: It sets the bar right where it is needed to select which service to interact with at runtime based on the appropriate business context. This is the right level to start to build up marketplaces for valuable services. It is also careful to separate the interaction part of the uniform contract from the media type part of the contract. This allows the separate reuse of each, and significantly increases the evolvability and backwards-compatibility of these interfaces.

While classical service-orientation at least in theory puts a limit on how valuable the REST uniform contract constraint can be, the real world's denormalised inventories and business to business scenarios put a premimum on the use the uniform contract pattern and on related patterns. In turn the uniform contract constraint puts the burden on people to come to agree on the kinds of interaction they with to support, and the kinds of information they wish to exchange so that machines are able to exchange that information without the need for excessing transformation and adaptor logic.

Benjamin

in links: google google blogsearch technorati delicious
[/general] permanent link

Tue, 2011-Jan-11

WS-REST 2011

It's on again!

Do you have something to say about the advancement of the REST architectural style?

Come and present at the WS-REST 2011 Second International Workshop on RESTful Design.

As a member of the programme committee I would like to echo the call for papers and encourage high quality submissions. Take care! If it's not REST I'll call you on it! This workshop will deal with REST design topics, the use of REST in novel ways, novel patterns for integrating REST with non-REST architectural elements, and the bridging of cultural and technical divides between REST and non-REST crowds. Don't be put off by the cheeky title. I assure you, it is about the advancement of the REST architectural style.

Time is running out, so get your papers in by 2011-01-31.

Here is the offical text

WS-REST 2011

March 28, 2011 - Hyderabad, India

https://ws-rest.org/2011/

Call for Papers

The Second International Workshop on RESTful Design (WS-REST 2011) aims to provide a forum for discussion and dissemination of research on the emerging resource-oriented style of Web service design. Background
Over the past few years, several discussions between advocates of the two major architectural styles for designing and implementing Web services (the RPC/ESB-oriented approach and the resource-oriented approach) have been mainly held outside of the research and academic community, within dedicated mailing lists, forums and practitioner communities. The RESTful approach to Web services has also received a significant amount of attention from industry as indicated by the numerous technical books being published on the topic.
This second edition of WS-REST, co-located with the WWW2011 conference, aims at providing an academic forum for discussing current emerging research topics centered around the application of REST, as well as advanced application scenarios for building large scale distributed systems.
In addition to presentations on novel applications of RESTful Web services technologies, the workshop program will also include discussions on the limits of the applicability of the REST architectural style, as well as recent advances in research that aim at tackling new problems that may require to extend the basic REST architectural style. The organizers are seeking novel and original, high quality paper submissions on research contributions focusing on the following topics:

Applications of the REST architectural style to novel domains

Design Patterns and Anti-Patterns for RESTful services

RESTful service composition

Inverted REST (REST for push events)

Integration of Pub/Sub with REST

Performance and QoS Evaluations of RESTful services

REST compliant transaction models

Mashups

Frameworks and toolkits for RESTful service implementations

Frameworks and toolkits for RESTful service consumption

Modeling RESTful services

Resource Design and Granularity

Evolution of RESTful services

Versioning and Extension of REST APIs

HTTP extensions and replacements

REST compliant protocols beyond HTTP

Multi-Protocol REST (REST architectures across protocols)

All workshop papers are peer-reviewed and accepted papers will be published as part of the ACM Digital Library. Two kinds of contributions are sought: short position papers (not to exceed 4 pages in ACM style format) describing particular challenges or experiences relevant to the scope of the workshop, and full research papers (not to exceed 8 pages in the ACM style format) describing novel solutions to relevant problems. Technology demonstrations are particularly welcome, and we encourage authors to focus on "lessons learned" rather than describing an implementation.
Papers must be submitted electronically in PDF format.
Easychair page: https://www.easychair.org/conferences/?conf=wsrest2011

Important Dates

Submission deadline: January 31, 2011, 23:59 local time in San Francisco, CA

Notification of acceptance: February 15, 2011

Camera-ready versions of accepted papers: February 28, 2011

WS-REST 2011 Workshop: March 28, 2011

Program Committee Chairs

Cesare Pautasso, Faculty of Informatics, USI Lugano, Switzerland

Erik Wilde, School of Information, UC Berkeley, USA

Rosa Alarcon, Computer Science Department, Pontificia Universidad de Chile, Chile

Program Committee

Jan Algermissen, Nord Software Consulting, Germany

Subbu Allamaraju, Yahoo Inc., USA

Mike Amudsen, USA

Benjamin Carlyle, Australia

Stuart Charlton, Elastra, USA

Duncan Cragg, USA

Joe Gregorio, Google, USA

Michael Hausenblas, DERI, Ireland

Ralph Johnson, University of Illinois, USA

Rohit Khare, 4K Associates, USA

Yves Lafon, W3C, USA

Frank Leymann, University of Stuttgart, Germany

Ian Robinson, Thoughtworks, UK

Stefan Tilkov, innoQ, Germany

Steve Vinoski, Verivue, USA

Jim Webber, NEO4J

Olaf Zimmermann, IBM Zurich Research Lab, Switzerland

Contact

WS-REST Web site: https://ws-rest.org/2011/

WS-REST Twitter: https://twitter.com/wsrest2011

WS-REST Email: chairs@ws-rest.org

Benjamin

in links: google google blogsearch technorati delicious
[/blipverts] permanent link

Lifesigns

Subscribe

RDF

Feedback and Social Software

Support Software

Site Statistics

License

My current feeds

My recent bookmarks

Background

Important Dates

Program Committee Chairs

Program Committee

Contact

The Moving Parts

Types of mismatches

Method Evolution

Adding Methods and Status

Removing Methods and Status

Media Type Evolution

Adding Information

Removing Information

URI Space Evolution

Adding Resources or Capabilities

Replacing Resources or Capabilities

Conclusion

Naive HTTP-based telemetry

Scanning a whole device

Delta Encoding

Conclusion