Sound advice - blog

Tales from the homeworld

Sun, 2011-May-22

Scanning data with HTTP

As part of my series on REST and SCADA, I'll talk in this article a little about doing telemetry with REST. There are a couple of approaches, and most SCADA protocols accidentally incorporate at least some elements of REST theory. I'll take a very web-focused approach and talk about how HTTP can be used directly for telemetry purposes.

HTTP is by no means designed for telemetry. Compared to Modbus, DNP, or a variety of other contenders it is bandwidth-hungry and bloated. However, as we move towards higher available bandwidth with Ethernet communications and wider networks that already incorporate HTTP for various other purposes it becomes something of a contender. It exists. It works. It has seen off every other contender that has come its way. So, why reinvent the wheel?

HTTP actually has some fairly specific benefits when it comes to SCADA and DCS. As I have already mentioned, it works well with commodity network components due to its popularity on the Web and within more confined network environments. In addition to that, it brings with it the benefits that made it the world's favourite protocol.

So how do we bridge this gap between grabbing web pages from a server to grabbing analogue and digital values from a field device? Well, I'll walk down the naive path first.

Naive HTTP-based telemetry

The simplest way to use HTTP for telemetry is to give each input its own URL, and have the master issue a GET request to that URL whenever it wants a fresh value.

So for example, if I want to scan the state of a circuit breaker from a field device I might issue the following HTTP request:

GET https://rtu20.prc/CB15 HTTP/1.1
Accept: text/plain

The response could be:

HTTP/1.1 200 OK
Content-Type: text/plain

CLOSED

... which in this case we would take to mean that circuit breaker 15 is closed. Now this is a solution that has required us to do a fair bit of configuration and modelling within the device itself, but that is often reasonable. An interaction that moves some of that configuration back into the master might be:

GET https://rtu20.prc/0x13,2 HTTP/1.1
Accept: application/xsd-int

The response could be:

HTTP/1.1 200 OK
Content-Type: application/xsd-int

2

This could mean, "read 2 bits from protocol address 13 hex", with the response of 2 (binary 10) meaning "bit 0 is low and bit 1 is high", resulting in the same closed status for the breaker.
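The two decodings above are simple enough to sketch in a few lines. The following Python is illustrative only; the function names are my own, and the encodings are the hypothetical ones from the examples rather than any standard:

```python
# Sketch: decoding the two naive response styles shown above.

def decode_breaker_state(body: str) -> bool:
    """Decode a text/plain breaker status ("CLOSED"/"OPEN") into a boolean."""
    return body.strip().upper() == "CLOSED"

def decode_packed_bits(value: int, count: int) -> list:
    """Unpack `count` bits from an integer response, least significant bit first."""
    return [(value >> i) & 1 for i in range(count)]

print(decode_breaker_state("CLOSED"))  # True
print(decode_packed_bits(2, 2))        # [0, 1] - bit 0 low, bit 1 high
```

Either way, the interpretation logic lives in the master; the device only has to serve bytes at a URL.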

HTTP is not fussy about the exact format of URLs. Whatever appears in the path component is up to the server, and it ends up acting as a kind of message from the server to itself about what the client actually wants to do. More or less context could be included in order to ensure that the response message is what was expected. Different devices all using HTTP could have different URL structures and, so long as the master knew which URL to look up for a given piece of data, they would continue to interoperate correctly with the master.

Scanning a whole device

So scanning individual inputs is fine if you don't have too many of them. When you use pipelined HTTP requests this can be a surprisingly effective way of querying an ad hoc input set. However, in SCADA we usually do know ahead of time what we want the master to scan. Therefore it makes sense to return multiple objects in one go.

This can again be achieved simply in HTTP. You need one URL for every "class" of scan you want to do, and then the master can periodically scan each class as needed to meet its requirements. For example:

GET https://rtu20.prc/class/0 HTTP/1.1
Accept: application/xsd-int+xml

The response could be:

HTTP/1.1 200 OK
Content-Type: application/xsd-int+xml

<ol>
	<li>2</li>
	<li>1</li>
	<li>0</li>
</ol>

Now, I've thrown in a bit of XML there, but HTTP can handle any media type that you would like to throw at it. That includes binary types, so you could even reuse elements of existing SCADA protocols as content for these kinds of requests. That said, the use of media types for even these simple interactions is probably the key weakness of the current state of standardisation for the use of HTTP in this kind of setting. This is not really HTTP's fault, as HTTP is designed to evolve independently of the set of media types in use. See my earlier article on how the facets of a REST uniform contract are designed to fit together and evolve. However, this is where standardisation does need to come into the mix to ensure long-term interoperability of relevant solutions.

The fundamental media type question is how best to represent the entire contents of an I/O list in a response message. The usual answer on the Web is XML, and I would advocate a specific XML schema with a specific media type name so that the client can select it and know what it has when the message is returned.
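As a rough sketch, the class 0 list response above could be consumed with the standard library's XML parser. The application/xsd-int+xml media type and the ol/li structure are the hypothetical ones used in this article, not an agreed standard:

```python
import xml.etree.ElementTree as ET

# Sketch: parse the hypothetical application/xsd-int+xml list response.
body = """<ol>
	<li>2</li>
	<li>1</li>
	<li>0</li>
</ol>"""

# Each <li> carries one integer input value, in scan order.
values = [int(li.text) for li in ET.fromstring(body).findall("li")]
print(values)  # [2, 1, 0]
```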

In this case once the client has scanned class 0, they are likely to want to scan class 1, 2, and 3 at a more rapid rate. To avoid needing to configure all of this information into the master, the content returned on the class 0 scan could even include this information. For example the response could have been:

HTTP/1.1 200 OK
Content-Type: application/io-list+xml

<ol>
	<link rel="class1" href="https://rtu20.prc/class/1"/>
	<link rel="class2" href="https://rtu20.prc/class/2"/>
	<link rel="class3" href="https://rtu20.prc/class/3"/>
	<li>2</li>
	<li>1</li>
	<li>0</li>
</ol>

The frequency of scans could also be included in these messages. However, I am a fan of using cache control directives to determine scan rates. Here is an example of how we can do that for the class 0 scan.

HTTP/1.1 200 OK
Date: Sat, 22 May 2011 07:31:08 GMT
Cache-Control: max-age=300
Content-Type: application/io-list+xml

<ol>
	<link rel="class1" href="https://rtu20.prc/class/1"/>
	<link rel="class2" href="https://rtu20.prc/class/2"/>
	<link rel="class3" href="https://rtu20.prc/class/3"/>
	<li>2</li>
	<li>1</li>
	<li>0</li>
</ol>

This particular response would indicate that the class 0 scan does not need to be repeated for five minutes. What's more, caching proxies along the way will recognise this information and are able to return it on behalf of the field device for this duration. If the device has many different master systems scanning it then the proxy can take some of the workload off the device itself in responding to requests. Master systems can still cut through any caches along the way for a specific integrity scan by specifying "Cache-Control: no-cache".
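A master that derives its scan rate from cache control directives only needs a little header parsing. This is a minimal sketch, assuming headers arrive as a simple dictionary; a real master would lean on a full HTTP client library:

```python
# Sketch: derive a scan period from the Cache-Control max-age directive,
# as suggested above. Falls back to a default when no directive is present.

def scan_period_from_headers(headers: dict, default: float = 1.0) -> float:
    """Return the scan interval in seconds implied by Cache-Control: max-age."""
    cache_control = headers.get("Cache-Control", "")
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            try:
                return float(directive.split("=", 1)[1])
            except ValueError:
                break
    return default

headers = {"Cache-Control": "max-age=300",
           "Content-Type": "application/io-list+xml"}
print(scan_period_from_headers(headers))  # 300.0 - scan again in five minutes
```

The nice property is that the device, not the master, decides how often it is willing to be scanned, and intermediaries get the same information for free.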

Delta Encoding

Although using multiple scan classes can be an effective way of keeping up to date with the latest important changes to the input of a device, a more general model can be adopted. This model provides a general protocol for saying "give me the whole lot", and "now, give me what's changed".

Delta encoding can be applied to general scanning, but is particularly appropriate to sequence of events (SOE) processing. For a sequence of events we want to see all of the changes since our last scan, and usually we also want these events to be timestamped. Some gas pipeline and other intermittently-connected systems have similar requirements: they dump their data out onto the network so that the master can quickly come up to date without losing its place for subsequent data fetches. I have my own favourite model for delta encoding, and I'll use that in the examples below.

GET https://rtu20.prc/soe HTTP/1.1
Accept: application/sequence-of-events+xml

The response could be:

HTTP/1.1 200 OK
Content-Type: application/sequence-of-events+xml
Link: <https://rtu20.prc/soe?from=2011-05-22T08:00:59Z>; rel="Delta"

<soe>
	<event
		source="https://rtu20.prc/cb20"
		updated="2011-05-22T08:00:59Z"
		type="application/xsd-int+xml"
		>2</event>
</soe>

The interesting part of this response is the link to the next delta. This response indicates that the device is maintaining a circular buffer of updates, and so long as the master fetches the deltas often enough it will be able to continue scanning through the updates without loss of data. The next request and response in this sequence are:

GET https://rtu20.prc/soe?from=2011-05-22T08:00:59Z HTTP/1.1
Accept: application/sequence-of-events+xml

The response could be:

HTTP/1.1 200 OK
Content-Type: application/sequence-of-events+xml
Link: <https://rtu20.prc/soe?from=2011-05-22T08:03:00Z>; rel="Delta"

<soe>
	<event
		source="https://rtu20.prc/cb20"
		updated="2011-05-22T08:02:59.98Z"
		type="application/xsd-int+xml"
		>0</event>
	<event
		source="https://rtu20.prc/cb20"
		updated="2011-05-22T08:03:00Z"
		type="application/xsd-int+xml"
		>1</event>
</soe>

The master has therefore seen the circuit breaker transition from a state where the closed contact indicated true, through a state where neither the closed nor the open contact was asserted, to a state some 20ms later where the open contact was lit.
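The master's side of this exchange amounts to following the delta link from each response to form the next request. Here is a minimal sketch of the Link header handling; the parsing is deliberately simplistic and the rel names are the ones from my hypothetical model:

```python
import re

# Sketch: extract the next delta URL from a Link response header,
# accepting either rel="Delta" or rel="Next" case-insensitively.

def next_delta_url(link_header: str):
    """Return the target URL of a delta Link header, or None if absent."""
    match = re.match(r'\s*<([^>]*)>\s*;\s*rel="?(\w+)"?', link_header)
    if match and match.group(2).lower() in ("delta", "next"):
        return match.group(1)
    return None

link = '<https://rtu20.prc/soe?from=2011-05-22T08:00:59Z>; rel="Delta"'
print(next_delta_url(link))
# https://rtu20.prc/soe?from=2011-05-22T08:00:59Z
```

So long as the master keeps requesting the URL each response hands it, it walks the device's circular buffer without losing events.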

Conclusion

Although a great deal of effort has been expended in trying to bring SCADA protocols up to date with TCP and the modern Internet, we would perhaps have been better off leveraging the protocol that the Web has already produced for us, and concentrating our efforts on standardising the media types needed to convey telemetry data in the modern networking world.

There is still plenty of time for us to make our way down this path, and many benefits in doing so. It is clearly a feasible approach, comparable to those of conventional SCADA protocols, and is likely to be a fundamentally better solution due primarily to its deep acceptance across a range of industries.

Benjamin

Tue, 2011-May-03

The REST Constraints (A SCADA perspective)

REST is an architectural style that lays down a predefined set of design decisions to achieve desirable properties. Its most substantial application is on the Web, and it is commonly confused with the architecture of the Web. The Web consists of browsers and other clients, web servers, proxies, caches, the Hypertext Transfer Protocol (HTTP), the Hypertext Markup Language (HTML), and a variety of other elements. REST is a foundational set of design decisions that co-evolved with Web architecture to both explain the Web's success and to guide its ongoing development.

Many of the constraints of REST find parallels in the SCADA world. The formal constraints are:

Client-Server

The architecture consists of clients and servers that interact with each other via defined protocol mechanisms. Clients are generally anonymous and drive the communication, while servers have well-known addresses and process each request in an agreed fashion.

This constraint is pretty ubiquitous in modern computing and is in no way specific to REST. In service-oriented architecture the terms client and server are usually replaced with "service consumer" and "service provider". In SCADA we often use terms such as "master" and "slave".

The client-server constraint allows clients and servers to be upgraded independently over time so long as the contract remains the same, and limits coupling between client and server to the information present in the agreed message exchanges.

Stateless

Servers are stateless between requests. This means that when a client makes a request, the server side is allowed to keep track of that client until it is ready to return a response. Once the response has been returned, the server must be able to forget about the client.

The point of this constraint is to scale well up to the size of the world wide web, and to improve overall reliability. Scalability is improved because servers only need to keep track of clients they are currently handling requests for, and once they have returned the most recent response they are clean and ready to take on another request from any client. Reliability is improved because the server side only has to be available when requests are being made, and does not need to ensure continuity of client state information from one request to another across restart or failover of the server.

Stateless is a key REST constraint, but is one that needs to be considered carefully before applying it to any given architecture. In terms of data acquisition it means that every interaction has to be a polling request/response message exchange as we would see in conventional telemetry. There would be no means to provide unsolicited notifications of change between periodic scans.

The benefits of stateless on the Web are also more limited within an industrial control system environment, where we are more likely to see one concurrent client for a PLC or RTU's information rather than the millions we might expect on the Web. Even so, stateless is often applied in these settings for reasons of reliability and scalability. It is much easier to implement stateless communications within a remote terminal unit than it is to support complex stateful interactions.

Cache

The cache constraint is designed to counter some of the negative impact that comes about through the stateless constraint. It requires that the protocol between client and server contain explicit cacheability information either in the protocol definition or within the request/response messages themselves. It means that multiple clients or the same polling client can reuse a previous response generated by the server under some circumstances.

The importance of the cache constraint depends on the adherence of an architecture to the stateless constraint. If clients are being explicitly notified about changes to the status of field equipment then there is little need for caching. The clients will simply accept the updates as they come in and perform integrity scans at a rate they are comfortable with.

Cache is not a common feature of SCADA systems. SCADA is generally built around the sampling of input that can change at any time, or at least can change very many times per second. In this environment the use of caching doesn't make a whole lot of sense, but we still see it in places such as data concentrators. In this setting a data concentrator device scans a collection of other devices for input. A master system can then scan the concentrator for its data rather than reaching out to individual devices. Cache can have significant benefits as systems get larger and as interactions between devices become more complex.
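The caching behaviour a data concentrator relies on can be boiled down to a small expiring store. This is a toy sketch under my own naming, keyed by URL and honouring a max-age, not a production cache:

```python
import time

# Sketch: an expiring response cache of the kind a data concentrator
# implies. Responses are reused until their max-age elapses, so repeated
# scans need not reach the field device at all.

class ScanCache:
    def __init__(self):
        self._entries = {}  # url -> (value, expiry timestamp)

    def put(self, url, value, max_age):
        """Store a scanned value, valid for max_age seconds."""
        self._entries[url] = (value, time.monotonic() + max_age)

    def get(self, url):
        """Return the cached value, or None if expired or never scanned."""
        entry = self._entries.get(url)
        if entry and time.monotonic() < entry[1]:
            return entry[0]
        return None  # caller must go to the device itself

cache = ScanCache()
cache.put("https://rtu20.prc/class/0", [2, 1, 0], max_age=300)
print(cache.get("https://rtu20.prc/class/0"))  # [2, 1, 0]
```

An HTTP caching proxy does exactly this, driven by the Cache-Control directives in each response rather than by out-of-band configuration.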

Layered System

The layered constraint is where we design in all those proxies that have become so troublesome, but much of the trouble has come from SCADA protocols not adhering well to this constraint. It says that when a client talks to a server, that client should not be able to tell whether it is talking to the "real" server or only to a proxy. Likewise, a server should not be able to tell whether it is talking to a "real" client or a proxy. Clients and servers should not be able to see past the layer they are directly interacting with.

This is a constraint that explicitly sets out to do a couple of things. First of all it is intended to let proxies at important locations get involved in the communication in ways that they otherwise could not. We could have a proxy that is aggregating data together for a particular section of a factory, railway line, power distribution network, etc. It could be acting as a transparent data concentrator, the sole device that is scanning the PLCs and RTUs in that area, ensuring that each one only has to deal with the demands of a single client. However, that aggregator could itself answer to HMIs and other subsystems all over the place. In a REST architecture that aggregator would be speaking the same protocol to both the PLCs and to its own clients, and clients would use the same protocol address to communicate with the proxy as they would with the real device. This transparency allows the communications architecture to be modified in ways that were not anticipated in early system design. Proxies can easily be picked up, duplicated, reconfigured, and reused elsewhere to do a similar job, without needing to be reimplemented from scratch and without clients needing to explicitly modify their logic to make use of them.

The second thing it sets out to do is allow proxies to better scrutinise communication that passes through them, based on policies that are important to the owner of the proxy. The proxy can be part of a firewall solution that allows some communication and blocks other communication with a high degree of understanding of the content of each message. Part of the success of HTTP can be put down to the importance of the Web itself, but one view of HTTP's success in penetrating firewalls is that it gives network owners just the right amount of information to make effective policy decisions. If a firewall wants to wall off a specific set of addresses it can easily do so. If it wants to prevent certain types of interactions then this is straightforward to achieve.

Code on demand

There are really two variants of REST architecture. One that includes the code on demand constraint, and another that does not contain this constraint. The variant of REST that uses code on demand requires that clients include a virtual machine execution environment for server-provided logic as part of processing response messages.

On the Web you can read this constraint as directives like "support JavaScript" and "support Flash", as well as more open-ended directives such as "allow me to deploy my application to you at runtime". The constraint is intended to allow more powerful and specific interactions between users and HMIs than the server would otherwise have been able to make happen. It also allows more permanent changes to be deployed over the network, such as upgrading the HMI software to the latest version.

Code on demand arguably has a place in SCADA environments for tasks like making HMI technology more general and reusable, as well as allowing servers of every kind to create more directed user interactions such as improving support for remotely reconfiguring PLCs or remotely deploying new configuration.

Uniform Interface

Uniform Interface is the big one. That's not only because it is the key constraint that differentiates REST from other styles of architecture, but because it is the feature that REST and SCADA have most closely in common. I covered the uniform interface constraint previously from a SCADA perspective. It is central to both the REST and SCADA styles of architecture, but is a significant departure from conventional software engineering. It is what makes it possible to plug PLCs and RTUs together in ways that are not possible with conventional software systems. It is the core of the integration maturity of SCADA systems and of the Web that is missing from conventional component and services software.

Benjamin