Sound advice - blog

Tales from the homeworld


Tue, 2007-Oct-30

XML Semantic Web

Quick link to the main RESTwiki page

I'm having a go at defining some conventions suitable for developing a Semantic Web. The existing "non-semantic" web is already a first-stage Semantic Web. We are able to exchange semantics about document structure, and enough other basic things for humans to be able to drive it. The next stage of a Semantic Web is to produce a small set of widely-accepted schemas for conveying generally-applicable information, along the lines of what Microformats are doing in HTML. Perhaps the microformats level is a genuine next step in and of itself: people taking machines along for the ride on the Web. After that, we want to start to see machines operating more autonomously over time. I see RDF as having failed to deliver in this area, with no prospect of succeeding.

I see the failure of RDF as two-fold:

  1. The number of XML namespaces in a typical RDF document adds complexity disproportionate to its benefits, and limits independent evolution and extension of schemas.
  2. The fluid document structure (especially in the XML representation) makes understanding, transformation, copy-and-paste, and a number of other beneficial activities significantly more difficult than with plain XML document types.

For some time I have felt that well-defined XML document types are superior to similarly-defined RDF schemas. I have started writing up a set of conventions for well-structured XML documents. I think these conventions yield many of the benefits that RDF is designed to bring about, but also respect the lessons learned from existing Web document types.
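As a rough illustration of why a fixed document structure is easier to work with, here is a minimal sketch in Python. The `portfolio` document type, its element names and the sample data are all hypothetical; they are not taken from the conventions themselves. The point is only that a consumer can reach every value by a simple, fixed path, with no namespaces to resolve.

```python
import xml.etree.ElementTree as ET

# Hypothetical document type (an assumption for illustration):
# one root element, no namespaces, and a fixed place for each value.
SAMPLE = """\
<portfolio>
  <owner>alice</owner>
  <holding>
    <symbol>ABC</symbol>
    <units>100</units>
  </holding>
  <holding>
    <symbol>XYZ</symbol>
    <units>25</units>
  </holding>
</portfolio>
"""


def read_holdings(xml_text):
    """Return (symbol, units) pairs using fixed paths into the document."""
    root = ET.fromstring(xml_text)
    return [
        (holding.findtext("symbol"), int(holding.findtext("units")))
        for holding in root.findall("holding")
    ]


holdings = read_holdings(SAMPLE)
print(holdings)
```

Because the structure is fixed, copying a `holding` element from one document into another leaves it meaningful without any further context.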

Have at it, let me know what you think. I'll try to get back to it soon to endorse a (very) few types that I find useful in day-to-day operations.

Benjamin

Thu, 2007-Oct-18

4+1 View Software Architecture Description

A key finding of IEEE-Std-1471-2000 "Recommended Practice for Architectural Description of Software-Intensive Systems" is that architectural descriptions should be broken into views. One approach to defining which views a typical project should use is Philippe Kruchten's 4+1 View Model of Architecture.

For a description approach that has been around for 12 years and has been promoted by Rational, there is surprisingly little material available on the public Internet. Most of it is vendor-specific, and attempts to bend the description to what can be achieved in a particular tool. I am using StarUML and attempting to apply the approach with UML diagrams to a largely RESTful architecture. This article documents my developing approach and understanding.

The example architecture I'll be describing includes a browser client that sends requests through a proxy to a server. This server itself accesses data from another server.

The Logical View

This view is the main one impacted by the REST style. The diagram would be accurate but somewhat unhelpful if we were literally to describe the uniform interface as a single interface class. Instead, I have been preferring to name specific URL templates as separate interfaces:

[Diagram: Logical View]

In this view the browser attempts to access one of the http://reports.example.com/{user}/{portfolio} URLs. These URLs all support a GET operation that returns xhtml content, which the browser understands. Ultimately, the source of data is accounts.example.com. The format at this end of the architecture is application/accounts+xml, a format that the browser doesn't necessarily understand.
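To make the idea of a URL template acting as an interface a little more concrete, here is a minimal Python sketch that expands the reports template into a concrete URL. The `expand` helper and the sample values are my own assumptions for illustration; only the template itself comes from the diagram.

```python
from urllib.parse import quote

# The template named as an interface in the logical view.
REPORTS_TEMPLATE = "http://reports.example.com/{user}/{portfolio}"


def expand(template, **variables):
    """Fill a URL template, percent-encoding each value as one path segment."""
    encoded = {name: quote(value, safe="") for name, value in variables.items()}
    return template.format(**encoded)


# Hypothetical user and portfolio names.
url = expand(REPORTS_TEMPLATE, user="alice", portfolio="retirement fund")
print(url)
```

Every URL produced this way supports the same GET operation and returns the same kind of content, which is why the template is a useful unit to draw in the diagram.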

Moving down from the browser through the URLs it accesses we see the proxy server that initially handles the request. In order to do so it accesses the same URL, but this time directs its requests to the resource's origin server: reports.example.com.

Reports.example.com isn't the ultimate source of data in answering this request. It sends its own request for http://accounts.example.com/{user}/{portfolio} to accounts.example.com, and translates the result into the xhtml it returns.
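The chain of requests described above can be sketched with plain Python functions standing in for the servers. This is only an illustration under stated assumptions: the element names inside application/accounts+xml, the balance value and the use of application/xhtml+xml as the media type label are all hypothetical, and no real network requests are made.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def accounts_server(user, portfolio):
    """Stand-in for accounts.example.com: returns (media type, body)."""
    body = (
        "<accounts>"  # hypothetical document structure
        f"<user>{escape(user)}</user>"
        f"<portfolio>{escape(portfolio)}</portfolio>"
        "<balance>1250.00</balance>"
        "</accounts>"
    )
    return "application/accounts+xml", body


def reports_server(user, portfolio):
    """Stand-in for reports.example.com: fetches accounts data, returns xhtml."""
    media_type, body = accounts_server(user, portfolio)
    if media_type != "application/accounts+xml":
        raise ValueError(f"unexpected media type: {media_type}")
    balance = ET.fromstring(body).findtext("balance")
    page = (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        f"<head><title>{escape(portfolio)}</title></head>"
        f"<body><p>{escape(user)}: {escape(balance)}</p></body>"
        "</html>"
    )
    return "application/xhtml+xml", page


def proxy(user, portfolio):
    """Stand-in for the proxy: forwards the same request to the origin server."""
    return reports_server(user, portfolio)


media_type, page = proxy("alice", "retirement")
print(media_type)
print(page)
```

The browser only ever sees the xhtml; the accounts format stays at the far end of the architecture.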

I have drawn reports.example.com and accounts.example.com separately from the web server software that supports them. The reason for this appears in the trace to the development view.

Development View

This view shows Software Configuration Items as they appear in the factory, in other words as they appear in the development environment's Configuration Management system. Each Software Module identified in the logical view appears as a resident of one of these Configuration Items.

Configuration Items are separately-versioned entities that identify specific versions of a source tree. They also map into the deployment view: Each CI should map directly to a single installation package:

[Diagram: Development View]

There are no dependencies shown in this example of a development view. I have been drawing dependencies in this view when the dependency is a build-time one. However, different kinds of dependencies could be added to show package dependencies. Half the fun of using tools to model these different views is to allow those tools to automatically validate consistency and tracing between the different views.
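The kind of consistency check a tool could run over the logical-to-development trace might look like the following Python sketch. The module and Configuration Item names are hypothetical placeholders of my own, not the contents of the actual model, and the rule checked is simply that every Software Module is resident in exactly one Configuration Item.

```python
# Hypothetical model data; the module and CI names are illustrative only.
MODULES = {
    "Browser",
    "Proxy",
    "Web Server",
    "reports.example.com",
    "accounts.example.com",
}

CI_CONTENTS = {
    "browser-ci": {"Browser"},
    "proxy-ci": {"Proxy"},
    "webserver-ci": {"Web Server"},
    "reports-ci": {"reports.example.com"},
    "accounts-ci": {"accounts.example.com"},
}


def check_trace(modules, ci_contents):
    """Return a list of problems found in the module-to-CI trace."""
    owners = {}
    for ci, members in ci_contents.items():
        for module in members:
            owners.setdefault(module, []).append(ci)

    problems = []
    for module in sorted(modules):
        cis = owners.get(module, [])
        if not cis:
            problems.append(f"{module}: not resident in any Configuration Item")
        elif len(cis) > 1:
            problems.append(f"{module}: resident in several CIs: {sorted(cis)}")
    for module in sorted(set(owners) - set(modules)):
        problems.append(f"{module}: in a CI but missing from the logical view")
    return problems


print(check_trace(MODULES, CI_CONTENTS))
```

An empty list means the two views agree; anything else points at a module or Configuration Item that needs attention.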

Each Software Configuration Item is deployed as part of a Hardware Configuration Item in the final system.

Physical or Deployment View

In this view we can see the Software Configuration Items deployed as part of Hardware Configuration Items. Physical connections are shown to an appropriate level, which will differ depending on whether or not the detailed hardware architecture is being maintained elsewhere.

A full trace to this view can be useful in identifying missing Software Configuration Items and Software Modules, especially those relating to configuration of specific network components.
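One way such a trace can expose a gap is sketched below in Python: list the Hardware Configuration Items that have no Software Configuration Item deployed to them. All host and CI names here are hypothetical, including the firewall, which I have added purely to show a network component whose configuration has not been captured as software.

```python
# Hypothetical deployment data; host and CI names are illustrative only.
HARDWARE_ITEMS = [
    "client-pc",
    "proxy-host",
    "reports-host",
    "accounts-host-1",
    "accounts-host-2",
    "firewall",
]

# Which hardware items each Software Configuration Item is deployed to.
DEPLOYMENT = {
    "browser-ci": ["client-pc"],
    "proxy-ci": ["proxy-host"],
    "reports-ci": ["reports-host"],
    "accounts-ci": ["accounts-host-1", "accounts-host-2"],
}


def hardware_without_software(hardware_items, deployment):
    """Return hardware items that no Software Configuration Item traces to."""
    deployed_to = {host for hosts in deployment.values() for host in hosts}
    return [item for item in hardware_items if item not in deployed_to]


gaps = hardware_without_software(HARDWARE_ITEMS, DEPLOYMENT)
print(gaps)
```

Each item reported is a prompt to ask whether a Software Configuration Item, such as a configuration package for that component, is missing from the development view.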

[Diagram: Deployment View]

An interesting feature of this physical view is that the accounts server is duplicated as a cluster. This kind of duplication is theoretically captured in the process view, but this is trickier than it may appear.

Process View

The trace through different Configuration Item views is fairly easy to capture. Configuration Items are pretty close to physical. The process view is more logical. In Philippe's original paper he shows software modules mapping onto processing threads. These threads trace to the deployment view, just as the development view does.

UML doesn't really have a way of capturing this kind of mapping. Most UML tool vendors will tell you to use sequence diagrams and the like:

[Diagram: View Accounts]

While this helps describe a sequence of events through the logical view, it doesn't describe the redundant nature of accounts.example.com. It also fails to capture answers to a number of other questions: How do you handle flow control? How do you handle blocking, threading, connection pooling, and any number of other issues? Sometimes these spaces will be constrained by a software package you are using, or a uniform interface you are dealing with. Other times it will be important to specify these details to developers.

Scenarios or Use Case View

Scenarios capture the motivation for the architecture. In this case our motivation is pretty simple:

[Diagram: Scenarios View]

This is the +1 view, redundant once other design decisions have been made. This might be a full use case specification, or just a bunch of bubbles. Again, this depends on whether or not you are maintaining a separate documentation set to cover these details. I haven't yet settled on a good way to relate this view to the other views, though connecting to sequence diagrams in the process view may be a reasonable approach.

Conclusion

I think the 4+1 approach has merit, especially through the logical->development->deployment trace. However, this trace isn't unique to 4+1. The approach might carry its weight better if we had a better way of dealing with the process view than those provided by current tooling and theory. Including the scenarios view is an interesting approach, but normally we would want to version requirements and architecture documents separately. It might be better to leave the scenarios out of this UML model.

Benjamin