Sound advice - blog

Tales from the homeworld

Sat, 2006-Dec-30

The Role of Resources in REST

The REST architectural style is not well named, and is often not clearly understood. Of course it has the features of statelessness and layering that lead to high-performance and scalable technical solutions, but most people can take these features for granted. Most people just want to know how to design their software according to the REST style. REST is about transferring documents between resources using a limited vocabulary of methods and a limited number of document types. Each resource demarcates a subset of an application's state, and becomes a handle by which other applications can interact with that state.

REST stands for REpresentational State Transfer. Unfortunately, this doesn't mean much to anyone not already caught up in the jargon of REST. Let me translate: REST stands for information exchange through documents. Representation is the REST term for what we commonly call a document. It encodes some useful information into a widely-understood document format such as HTML. The rest of the REST acronym is just saying that the exchange of these documents using a limited set of methods is the sole means of communication between clients and servers in the architecture. The acronym doesn't really spell out many of the REST architectural constraints, and doesn't highlight the importance of resources in the architecture at all. Perhaps this has contributed to the popularity of the Resource Oriented Architecture TLA.

I think that the REST moniker came about at a time when developers thought they could just make distributed software like regular software, with transparent remote procedure calls governed by interface definition files. The rise of service-orientation as a notion separate to the Web Services stack seems to be helping developers see their IDLs as protocols that need careful attention as systems evolve. Developers are more comfortable with the concept that messages are the way systems interact with each other. In this environment, the focus of the REST acronym no longer cuts as deeply. Now, REST's revolutionary facet is in the transition from object-oriented interface design to resource-oriented interface design.

REST's underlying assumption in this area is that it must conceivably be possible for a single entity in an internet-sized network to understand every message exchange. This means that it understands what a client is requesting every time a request is made, and understands what the server's response means every time a response is returned. It understands all other forms of message exchange also. This concept is REST's uniform interface, and much of practical REST implementation is bound up in understanding how REST allows this uniform interface to evolve and change. The uniform interface is central to REST's ability to usefully introduce intermediaries, to provide web browsers, and in general to keep complexity at a level below the size of the network instead of greater than the size of the network. By keeping complexity low, REST aims to allow network effects to flourish rather than be cancelled out.

At first it seems like we are just talking about semantics. For instance, isn't it possible to understand every exchange over a Web-services stack? All we would have to do is understand all of the IDL files governing the system, and we would be able to understand all message exchanges.

The fault in this thinking is in assuming that the number of IDL files will grow at a rate slower than the size of the network. We may be able to define a few IDL files that are useful to a large number of users, but we will always have special needs. The popular IDL files are likely to need a number of "insert some data here" slots in order to be widely applicable, so an architectural style will need to be built up around any popular IDL to ensure that the information it allows through evolves in a way that maintains the interface's usefulness.

REST contends that one IDL file should be enough to rule them all. It contends that a family is not needed, just one definition that is allowed to evolve in a way consistent with the REST style. It says that the file should be allowed to define methods that everyone understands, and applies a consistent meaning to. It then says that most methods should have an "insert data here" slot, with a label indicating what kind of data is being transferred.

REST anchors the interface in a globally defined exchange of documents, the formats of which are defined outside of the main interface definition. Objects that implement the generalised IDL file are called resources. It's as simple as that. These resources are typically a facade behind which more application-specific objects live. A single object may be referred to by several resources, or a particular resource operation may map to several object method invocations.

In REST, a URL is like an SQL select statement. It chooses a subset of the total application's data to operate on. Consider a URL like this: <http://example.com/SELECT%20*%20FROM%20MyTable>. A HTTP GET to that resource might retrieve the MyTable contents, and some sort of CSV document format might be a minimal mapping to retrieve data. However, the data could be returned in other formats such as HTML or something more structured such as atom. It would depend on the data being offered as to which formats the data could be encoded into, but each available format should return essentially the same data. Different document formats will exist to preserve varying levels of semantics associated with the source data.
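To make that concrete, here is a rough sketch in Python (standard library only) of a GET against the example URL above, using an Accept header to negotiate between CSV and HTML. The URL comes from the example; the server behaviour is assumed for illustration only.

    import http.client

    conn = http.client.HTTPConnection("example.com")
    # Ask for CSV first, but accept HTML as a fallback representation.
    conn.request("GET", "/SELECT%20*%20FROM%20MyTable",
                 headers={"Accept": "text/csv, text/html;q=0.5"})
    response = conn.getresponse()
    print(response.status, response.getheader("Content-Type"))
    print(response.read().decode(errors="replace"))
    conn.close()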

A HTTP PUT request is where the old CRUD analogy falls down. In classic CRUD you would have a separate UPDATE statement with its own select. In REST, the same resource selects the same state. A client would send a HTTP PUT request to that same SELECT URL to replace the MyTable content with different content, or a HTTP POST request to add new records to MyTable. HTTP DELETE would purge the table.

If the resource had only selected a subset of the table, the operations would only affect the subset. If the resource had joined multiple tables, the operations would affect the data selected by the join. Analogues in the object-oriented world are similar. A resource may demarcate or select part of an object, a whole object, a set of objects, or parts of a set of objects. Whichever way you slice it, the operations mean the same thing. GET me the resource's data in a particular document format. PUT this data in a compatible document format, replacing the data you had previously. POST this data in a compatible document format, adding it to the data you had previously. DELETE the data your resource demarcates.
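A corresponding sketch of the write-side verbs against the same resource. The CSV payloads and their server-side interpretation are assumptions for illustration; the point is that the one URL takes PUT, POST, and DELETE with consistent meanings.

    import http.client

    def send(conn, method, path, body=None):
        # Issue one request and drain the response so the connection can be reused.
        headers = {"Content-Type": "text/csv"} if body is not None else {}
        conn.request(method, path, body=body, headers=headers)
        response = conn.getresponse()
        response.read()
        print(method, response.status)

    TABLE = "/SELECT%20*%20FROM%20MyTable"
    conn = http.client.HTTPConnection("example.com")
    # PUT: replace the data the resource demarcates with this CSV document.
    send(conn, "PUT", TABLE, "id,name\n1,apple\n2,orange\n")
    # POST: add these records to the data the resource demarcates.
    send(conn, "POST", TABLE, "id,name\n3,banana\n")
    # DELETE: purge the data the resource demarcates.
    send(conn, "DELETE", TABLE)
    conn.close()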

Object-Orientation focuses on variation in method and parameter lists in exposing objects to the network. REST focuses on variation in data in exposing resources with consistent methods and document types to the network.

REST is designed to reduce the artificial complexity in communication. However there is still natural complexity to conquer. The IDL file with its limited set of methods must be dealt with cleanly, and allowed to evolve to suit different environments and requirements. There may still be areas which the uniform interface doesn't address due to performance problems, and specialised interfaces may still be required for fringe applications. Probably the biggest source of natural complexity, however, is the standardisation of document types.

Document types are closely related to ontology and to language. Wherever there is someone with a concept that no one else has, there will be a name to match. Wherever there is a name, there is a document that contains that name. What I am talking about is the foundation of the semantic web. That foundation is formed of people, things, convention, agreement, and all of the social things that programmers have felt they could get away from in the past. The rise of internet-scale programming environments changes the playing field. We must agree more than we ever did, and yet we will still be left with local conventions for cultural and social reasons. Wherever there is a sub-group of people there will be documents specific to that group.

In this semantic web environment, REST cannot hope to achieve true uniformity. The best that can be achieved is a kind of uniformity rating. Ok, we are both using XML. That's good. We are both speaking atom. That's great. We still have conventions specific to our blogs, and no generic browser can hope to understand them... only render them. Ok, we are both using XML. We are both talking about trains. However, we can't agree on what the train's braking curve looks like.

REST will always break down at the point where local conventions cannot be overcome by global agreement. This is a natural limitation of society. In this environment the goal of REST document definition is to allow those areas which we do agree on to be communicated while local conventions can fail to be understood without confusing either side as to what is happening.

REST is an ideal that can be applied to specific problem domains in a way that reduces the overall communication complexity in a large network. The goal of REST and post-REST styles should be to ensure that agreement is forged on important things, while variation in local conventions is tolerated. Generic software should be able to understand as much as is required to make business processes work, and where local variation is essential, specific software should be introduced. The goal is to create an environment under which technology is not what holds us back from a true read/write semantic web.

Benjamin

Sun, 2006-Nov-05

REST, SOA, and Interface Generality

Mark Baker recently wrote about the difference between the best SOA practices of today, and REST. I find it difficult to pin down authoritative statements as to how the SOA architectural style is defined or best practised, however Mark sees a gradual conformance to REST principles. He described the main outstanding issue as "generality". I think my considered answer would be "decoupling for evolvability".

I think the answer is a little more sophisticated and involved than "generality". As Stu Charlton notes:

Even old crusty component software had this concept (operations constrained & governed to maintain relevance across more than one agency) - Microsoft COM specified a slew of well-known interfaces in a variety of technical domains that you had to implement to be useful to others.

I made a similar note recently:

Uniform interfaces reduce the cost of client software by ensuring it is only written once, rather than once per application it has to deal with. Both REST and RPC designs may try to maximise the uniformity of the interface they expose by conforming to industry or global standards. In the RPC model these standards are primarily in the form of standard type definitions and standard choreography. In REST it is primarily the choice of standard content types and verbs that controls uniformity.

I think the real answer is held in how the REST Triangle of nouns, verbs, and content types allows different components of a protocol to evolve separately towards uniformity. We think of HTTP as a protocol, but it clearly does not capture the whole semantics of communication over the web. The full protocol includes a naming scheme, a set of verbs, a set of content types, and a transport protocol such as HTTP.

The REST Triangle of nouns, verbs, and content types

In current REST practice the accepted naming scheme is the URL, and the accepted verbs are basically GET, PUT, POST, and DELETE. The current content types include html, svg, etc. However, Fielding's dissertation defines none of these things. They are meant to each evolve independently over time, never reaching a time when the evolving architecture is "done".

If we contrast the REST view to classical base-class, arbitrary method, arbitrary parameter list object-orientation we can see a few differences. I think the main difference is that a classical base-class defines both the verbs of communication and the content types in one coupled interface. The REST approach is to determine separately which verbs are needed for universal communication and which content types are needed. This decoupling gives someone who needs new semantics transferred between machines a significant head start in defining the whole necessary protocol:

The naming system is well-defined and well-understood. No need to do anything there. The standard verbs are sufficient to operate on any virtualised state. Easy. So the only problem left is the definition of a content type to carry the semantics as a virtualisation of the state of your objects.

In REST, the protocol for delivering HTML documents to a user is closely related to the protocol for delivering images. The different protocols for delivering images (uri+verbs+svg, uri+verbs+png) are also closely related. Retaining the meaning of a verb between different content types and retaining the meaning of a content type across different verbs is a big leg up when it comes to innovation in the communications space.

Our software cannot communicate semantics unless our programmers agree with each other. This is an intensely social process involving politics of money, power, greed, personality, and even occasionally some technical issues. However, REST is about not having to start from scratch.

REST is about starting from a shared understanding of the noun-space, a shared understanding of the verb space, and a shared understanding of the content space. Whichever one of those has to evolve to support the new communication can be targeted specifically. It is easier to agree on what a particular document format should look like within the constraints of the classical verbs. It is easier to agree on how publish-subscribe semantics should be introduced within the constraints of the classical content types.

Stu also writes:

[U]niformity, as a technological constraint, is only possible in the context of social, political, and economic circumstances. It's an ideal that so far is only achievable in technological domains. HTTP, while applicable to a broad set of use cases, does not cover a significant number of other use cases that are critical to businesses. And HTTP over-relies on POST, which really pushes operational semantics into the content type, a requirement that is not in anybody's interest. In practice, we must identify business-relevant operations, constrain and govern interfaces to the extent that's possible in the current business, industry, and social circumstances, and attempt to map them to governed operations & identifiers -- whatever identifiers & operations are "standard enough" for your intended audience and scale.

And perhaps this is a point of confusion at the borderline of REST practice. REST advocates seem almost maniacal about uniformity and the use of standard methods and content types. I have highlighted the lack of standard content-type usage in otherwise very precisely RESTful designs. REST has a hard borderline defined by its architectural constraints, and anything that falls outside of that borderline can be seen in black and white as simply "unRESTful". However it would be a fallacy to suggest that all software SHOULD be RESTful.

Most software on the ground will use a fair number of non-standard content types. There are simply too many local conventions that are not shared by the whole worldwide community for every exchanged document to become a standard. Applying as many RESTful principles as makes sense for your application is the right way to do it. As Roy pointed out in a recent rest-discuss post:

All of the REST constraints exist for a reason. The first thing you should do is determine what that reason is, and then see if it still applies to the system being designed. Use the old noggin -- that is the point.

Strictly speaking, REST is an architectural style for hypermedia systems that cross the scale of the Internet on both a technical and a social level, and that allow for the construction of generic software components, in particular the generic browser. If your application does not involve developing a browser, or your architecture is smaller than the scale of the Internet, there are bound to be constraints that do not apply. You will still benefit by following the constraints that it makes sense to follow.

I find that following the constraints of a controlled set of verbs and content types over a uniform interface is extremely valuable even in a small software development group. Naturally, our set of controlled content types and even verbs does not exactly match up to those common on the web. Local conventions rule. However applying REST principles does make it easier to agree on what communication between our applications should look like.

It does make it possible to write a generic data browser. As local content types become strategically important we tend to look towards the wider organisation for standardisation, and we sometimes aspire to standardisation outside of the company and corporation. You could argue that until we reach that end-point we are only being nearly-RESTful rather than RESTful, but the benefits are there all the way along the curve. Perhaps the correct interpretation of Roy's use of the word "standard" in his thesis is that the verbs and content types are understood as widely as they need to be. This scope would typically differ between the verbs and content types. Agreement on the use and semantics of GET is required by a wider audience than is agreement on the use and semantics of svg.

Benjamin

Fri, 2006-Nov-03

Introducing SENA

I am currently in the process of authoring an internet draft relating to an internet-scale subscription mechanism. The protocol is HTTP-based, and REST-like. I am currently denoting it as the Scalable Event Notification Architecture (SENA), a play on the old General Event Notification Architecture (GENA). It is intended for time-sensitive applications like SCADA and the relaying of stock market quotes, rather than for general use across the Internet. The current draft can be found here and remains for the present time a work in progress.

Feedback Welcome

I have been soliciting targeted feedback on the document for the last few weeks, with some good input from a number of sources. Please consider providing feedback yourself by emailing me at benjamincarlyle at optusnet.com.au with the subject line "HTTP Subscription".

Architecture

The architecture consists of an originating resource, a set of clients, and of intermediaries. Clients use a new SUBSCRIBE HTTP verb to request a subscription. This may be passed through without inspection by intermediaries, in which case the server will answer the request directly. A subscription resource is created by the origin server which may be DELETEd to terminate the subscription. A notify client associated with the subscription resource sends HTTP requests back to a specified client Call-Back URL whenever the originating resource changes.

Instead of passing the requests through directly, intermediaries can participate in the subscription. They can intercept identical subscriptions made by several clients, and subscribe only once to the originating resource. The intermediary can specify its own notify resource which will in turn notify its clients. This has a similar scalability effect to caching proxies on the Web of today.
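As a sketch of the general shape of such an exchange (this is not the normative syntax of the draft; the "Call-Back" header name, URLs, and response details are assumptions drawn loosely from the prose above):

    import http.client

    conn = http.client.HTTPConnection("origin.example.com")
    # Subscribe to a resource, nominating where notifications should be sent.
    conn.request("SUBSCRIBE", "/alarms/currentList",
                 headers={"Call-Back": "http://client.example.com/notify/42"})
    response = conn.getresponse()
    response.read()
    # Assume the response points at the subscription resource as a path.
    subscription = response.getheader("Location")
    print(response.status, subscription)

    if subscription:
        # DELETE the subscription resource to terminate the subscription.
        conn.request("DELETE", subscription)
        done = conn.getresponse()
        done.read()
        print(done.status)
    conn.close()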

Notification Verbs

I currently have a number of notification request verbs listed in the draft. The simplest one that could possibly work is EXPIRE. Based on the EXPIRE semantics, a NOTIFY resource could be told that it needs to revalidate its cache entry for the originating resource. If it is a client, the likely response will be to issue an immediate GET request. This request will have a Cache-Control: max-age=0 header to ensure revalidation occurs through non-subscription-aware intermediaries.
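A minimal sketch of that client-side reaction, with a hypothetical subscribed URL:

    import http.client
    import urllib.parse

    def on_expire(subscribed_url):
        # EXPIRE carries weak semantics: it only says our copy may be stale,
        # so revalidate with an ordinary GET forced through any caches.
        parts = urllib.parse.urlsplit(subscribed_url)
        conn = http.client.HTTPConnection(parts.netloc)
        conn.request("GET", parts.path or "/",
                     headers={"Cache-Control": "max-age=0"})
        response = conn.getresponse()
        fresh_state = response.read()
        conn.close()
        return response.status, fresh_state

    print(on_expire("http://origin.example.com/alarms/currentList"))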

If the notify resource is operating on behalf of an intermediary, it may choose to fetch the data immediately given that clients are likely to ask for it again very soon. Alternatively, it may wait for the first GET request from its clients to come in. Because it has a subscription to the originating resource, the intermediary can safely ignore the max-age header. This allows the intermediary to perform one revalidation for each received EXPIRE request, regardless of the number of clients it has.

The good thing about EXPIRE is that its semantics are so weak it is almost completely unnecessary to validate that it is a genuine request from the real notify source. The worst thing an attacker could do is chew up a little extra bandwidth, and that could be detected when the originating resource consistently validated the old content rather than having new content available. EXPIRE also allows all normal GET semantics to apply, including content negotiation. The main bad things about EXPIRE are that it takes an extra two network traversals to get the subscribed data after receiving the request (GET request, GET response), and that you really have to GET the whole resource. There is no means of observing just the changes to a list of 100,000 alarms.

The alternative system is a combination of NOTIFY and PATCHNOTIFY requests. These requests carry the state of the originating resource and changes to that state respectively. The big problems with these requests are in their increased semantic importance. You must be able to trust the sender of the data, which means you need digital signatures guaranteed by a shared certificate authority. This introduces a significantly higher processing cost to communications. Useful semantics of GET such as content negotiation also disappear. I am almost resigning myself to these methods not being useful. They aren't the simplest thing that could possibly work.

Summarisation

One of the explicit features of the specification is summarisation of data. Most subscription models don't seem to have a way of dealing with notifications through intermediaries that have fixed buffer sizes to a set of clients with different connection characteristics. If an intermediary has a buffer size of one and receives an update that can only be delivered to one of two clients, then receives another update... what does it do?

The intermediary usually either has to block waiting for the slowest client or kick the slowest client off so that it can deliver messages to the faster clients. The third option is to discard old messages that have not been transmitted to the slow client. In SCADA-like situations this is an obvious winner. The newer message will have newer state, so the slow client is always getting fresh data despite not getting all the data. The fast client is not stopped from receiving messages, so gets any advantages their faster connectivity is designed to bring. Most message delivery mechanisms don't know and can't take the semantics of the message into account. SENA is explicitly a state-transfer protocol, thus these semantics of "old, unimportant state" and "new, important state" can be taken into consideration. PATCHNOTIFY requests can even be merged into each other to form a single coherent update.

The EXPIRE request can also be summarised. New EXPIRE requests trump old requests. There is no point in delivering multiple queued EXPIREs. Likewise, the data fetches triggered by EXPIRE requests implicitly summarise the actual sequence of state changes by virtue of multiple changes occurring between GET requests.
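As a rough illustration of the discard-old policy described above, here is a toy coalescing buffer in Python. A real intermediary would do this per client connection with real delivery; the sketch only shows the summarisation rule itself.

    class SummarisingBuffer:
        """Hold at most one undelivered state per subscription."""

        def __init__(self):
            self._pending = {}   # subscription id -> latest undelivered state

        def push(self, subscription_id, state):
            # Newer state trumps older, undelivered state; the old state is
            # simply discarded rather than queued behind a slow client.
            self._pending[subscription_id] = state

        def pop_next(self):
            # Deliver whatever is pending, one item at a time.
            if not self._pending:
                return None
            return self._pending.popitem()

    buf = SummarisingBuffer()
    buf.push("alarms", "state v1")
    buf.push("alarms", "state v2")   # v1 is never delivered
    print(buf.pop_next())            # ('alarms', 'state v2')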

Keep-alive

Most subscription mechanisms include a keep-alive function. This typically exists for a number of reasons:

  1. To ensure that server-side resources are not consumed long after a client has forgotten about the subscription
  2. To allow a client to determine that the server has forgotten about the subscription and needs a reminder
  3. To allow a client to detect the death of its server

SENA addresses the first point with a long server-driven keep-alive. I have deliberately pointed to a default period consistent with the default required for TCP/IP keep-alive: Two hours. It should be long enough not to significantly increase the base-load of the Internet associated with ping data, while still allowing servers the opportunity to clean up eventually.

The second point is dealt with in SENA by a prohibition against the server-side losing subscriptions. Subscriptions should be persistent across server failures and failovers. Client-side query of subscription state is permitted via a GET to the subscription resource, however this should not be used as a regular keepalive especially over a short period. Essentially, a server should never lose a subscription and user intervention will be required whenever it does happen.

Death detection is not dealt with in SENA. It is an assumption of SENA that the cost of generic death detection outweighs the benefits. The cost of death detection is at least one message pair exchange per detection period. Over the scale of the internet that sort of base load just doesn't compute. Subscriptions should become active again after the server comes online, therefore server downtime is just another time when the basic condition of the subscription holds: That the client has the freshest possible data. Monitoring of the state of the server is left as a specialised service capability wherever it is required.

Benjamin

Sun, 2006-Oct-22

SOA does not simplify communication

I had the opportunity to attend Sun Developer Day last week, held in Brisbane, Australia. The speakers were fairly good. My manager and I attended together. It was funny, viewing it as a RESTafarian, how on one hand the speakers were talking about Web 2.0, the power of community, and how it was developers driving demand for internet services that would help Sun sell hardware and reap their profits. On the other hand they seemed to equate the success of the web with their vision of the future.

The presentation that brought this in the sharpest focus was that of Ashwin Rao, "SOA, JBI, BPEL : Strategy, Design and Best Practices". Ashwin walked us through the SOA Architectural Big rules. I made notes, crossing out the errors and replacing them with the relevant principles. I have left my notepad at work this weekend so can't bring you the exact list that Ashwin presented. In fact, it is interesting to look over the web and see how different the big rules are between presentations by the same company over the course of only a few years. I'll pick up from a similar list:

Coarse-grained → Resource-grained Services
Summary: Services should be objects with lots of methods and represent effectively a whole application.
Discussion: Look at resources, instead. They are at the granularity they need to be based on the content types they support and the application state they demarcate. The granularity of a service doesn't matter, and can evolve. Resources, on the other hand, are the unit of communication and remain stable.
Mostly Asynchronous Interactions
Summary: Everything goes through an intermediary that performs the actual interaction for you.
Discussion: I can see some value in this, however the complexity is significantly increased. Centering an architecture around this concept seems fraught with problems of restricted evolution. Instead, this capability should be a service in its own right that can be replaced.
Conversational → Stateless Services
Summary: Conversation state is maintained by a coordinator
Discussion:Stateless services scale better. Conversations should always be short (one request, one response) in order to combat internet-scale latency.
Reliable → Observable Messaging
Summary: You can tell your coordinator to deliver your message at most once, at least once, etc.
Discussion: This kind of reliable messaging is often not necessary. When it is necessary it is probably better to put it into the HTTP layer rather than the SOAP layer. Most HTTP requests are idempotent, so the problem is not as big as it might initially seem. There are also reasonable techniques already in place for the web to avoid duplicate submissions. At-least-once semantics are straightforward for a bit of code in the client to do, rather than needing to push things through a coordinator: just keep trying until you get a result (see the retry sketch after this list). This also allows the client more freedom as to when it might want to give up. If the client wants to exit early it could still pass this on to another service. Again, putting the message bus in the middle of the architecture seems like a mistake. It should be at the edge for both performance and evolvability reasons.
Orchestrated
Summary: You can use BPEL to invoke methods, handle exceptions, etc.
Discussion:Everything is orchestrated. Whether you use BPEL or some other programming language would seem to matter little.
Registered and Discovered → Uniform
Summary: Service descriptions are available alongside services.
Discussion: In SOA you write client code every time someone develops a new WSDL file that you want to use in interactions. In REST you write client code every time someone develops a new content type that you want to use in interactions. Either way, the final contract is held in code: not in the specification. In REST we have a uniform interface. All of the methods mean essentially the same thing when applied to any resource. Resources themselves are discovered through hyperlinks. Content types are where the real specifications are needed. So far there is little evidence that specifications structured so stringently that machines can read them offer any special advantage over human-oriented text.
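To illustrate the point in the reliable-messaging discussion above that at-least-once semantics are a small piece of client code, here is a rough sketch of a retry loop around an idempotent PUT. The host, path, and payload are hypothetical.

    import http.client
    import time

    def put_with_retry(host, path, body, attempts=5, backoff=2.0):
        for attempt in range(attempts):
            try:
                conn = http.client.HTTPConnection(host, timeout=10)
                # PUT is idempotent, so repeating it after a failure is safe.
                conn.request("PUT", path, body=body,
                             headers={"Content-Type": "text/plain"})
                response = conn.getresponse()
                data = response.read()
                conn.close()
                return response.status, data
            except OSError:
                # No response arrived: we don't know whether the request got
                # through, but idempotency means we can simply try again.
                time.sleep(backoff * (attempt + 1))
        raise RuntimeError("gave up after %d attempts" % attempts)

    print(put_with_retry("example.com", "/switch/17/position", "closed"))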

SOA seems to be fundamentally about a message bus. This is supposed to ease the burden of communications by moving functionality from the client into the bus. While this is not necessarily a bad thing, it does nothing to solve the two big issues: Scalability and Communication. The message bus does nothing for scalability, and real communication is just as distant in this model as in any earlier RPC-based model.

REST presents a view of communication almost completely distinct from the SOA big rules. You could use one or both or neither. They barely cross paths. REST is about solving the scalability and communications problems. Scalability is dealt with by statelessness and caching. That is one thing, but communications is where REST really makes inroads when compared to RPC.

It separates the concerns of communication into nouns, verbs, and content. It provides a uniform namespace for all resources. It requires a limited set of verbs be used in the architecture that everyone can agree on. Finally, it requires a limited set of content types in the architecture that everyone who has a reason to understand does understand. It reduces the set of protocols on the wire, rather than providing tools and encouragement to increase the set.

Tim Bray is wrong when he talks about HTTP Verbs being a red herring. REST's separation then constraining of verbs and content types is what makes it a foray into the post-RPC world.

SOA has no equivalent concept. Instead, it concentrates on the transfer of arbitrary messages belonging to arbitrary protocols. It promotes the idea that object-orientation and RPC with their arbitrary methods on arbitrary objects with arbitrary parameter lists are a suitable means of addressing the communications issue. It seems to accept that there will be a linear growth in the set of WSDL files in line with the size of the network, cancelling the value of participation down to at best a linear curve.

Object-Orientation is known to work within a single version of a design controlled by a single agency, but across versions and across agencies it quickly breaks down. REST addresses the fundamental question of how programmers agree and evolve their agreements over time. It breaks down the areas of disagreement, solving each one in as generic a way as possible. Each problem is solved independently of the other two. It is based around a roughly constant number of specifications compared to the size of the network, maximising the value of participation. By restricting the freedom of programmers in defining new protocols we move towards a world where communication itself is uniform and consistent.

We know this works. Despite all of the competing interests involved in building the web of HTML, it has become stronger and less controversial over time. It has evolved to deal with the changing demands of its user base and will continue to evolve. RESTafarians predict the introduction of higher-level semantics to this world following the same principles with similarly successful results. SOA still has no internet-scale case study to work from, and I predict it will continue to fail beyond the boundaries of a single agency.

Benjamin

Sat, 2006-Oct-07

RESTful Moving and Swapping

Last week I posted my responses to a number of common REST questions. Today I realised that one of my responses needs some clarification. I wrote:

On swapping: This is something of an edge case, and this sort of thing comes up less often than you think when you are designing RESTfully from the start. The canonical approach would be to include the position of the resource as part of its content. PUTting over the top of that position would move it. This is messy because it crosses between noun and content spaces. Introducing a SWAP operation is also a problem. HTTP operates on a single resource, so there is no unmunged way to issue a SWAP request. Any such SWAP request would have to assume both of the resources of the unordered list are held by the same server, or that the server of one of these resources was able to operate on the ordered list.

I think a useful data point on this question can be found in this email I sent to the rest-discuss list today:

Here, I think the answer is subtly wrong due to the question itself containing a subtle bug. The question assumes that it is meaningful to move a resource from one url to another: that the resource has one canonical name at one time and another at another time. However, cool urls don't change.

The question reveals a bug in the underlying URL-space. If it is possible for a vehicle to move from one fleet to another, then its name should not include the fleet it belongs to. The fleet it belongs to should instead be part of the content. That way, changing the fleet is the same kind of operation as changing any other attribute of the vehicle.

The long and the short of it is that anything not forming the identity of a resource should not be part of that resource's URL. The URL's structure should not be used to imply relationships between resources. That is what hyperlinking is for. Whenever you think you have to move a resource from one part of the uri-space to another, you should reconsider your uri-space. It contains a bug.

Ordered lists can exist, but these are resources in their own right and either contain urls in their representations or include the representation of the list contents. A PUT is the correct way to update such a list, replacing its content. This should not change the name of any resource. For optimisation or collision-avoidance reasons it may be appropriate to perform your PUT to a resource that represents only the subset of the list you intend to modify. Alternatively, it may also be appropriate to consider reviving the PATCH http method as a form of optimised PUT.

The fact that PATCH was effectively never used does tell us that this kind of issue comes up rarely in existing REST practice. Perhaps as REST moves beyond the simple HTML web of today the question of content restructuring will become more important. I don't know. What is important, I think, is not to forget the cool URI lesson. Don't include information in a URL, no matter how structural it seems, unless you can convince yourself that changing or removing that information would cause the URL to refer to a different resource.
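To make the vehicle and fleet example above concrete, here is a rough sketch with hypothetical URLs and a hypothetical representation format: the vehicle keeps its cool URL and the fleet is just part of its content.

    import http.client

    VEHICLE = "/vehicles/bus-1234"   # never changes when the fleet changes
    conn = http.client.HTTPConnection("fleet.example.com")
    # Reassigning the vehicle is an ordinary update of its state, not a move.
    new_state = "registration=bus-1234&fleet=/fleets/north"
    conn.request("PUT", VEHICLE, body=new_state,
                 headers={"Content-Type": "application/x-www-form-urlencoded"})
    response = conn.getresponse()
    response.read()
    print(response.status)
    conn.close()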

Benjamin

Sun, 2006-Oct-01

The Preconditions of Content Type definition (and of the semantic web)

The microformats crowd have an admirable mechanism for defining standards. It relies on existing innovation. New innovations and the ego that goes with them are kept to a minimum. The process essentially amounts to this:

Find the closest match, and use that as your starting point. Cover the 80% case and evolve rather than trying to dot every "i" all at once. Sort out any namespace clashes with previously-ratified formats to ensure that terms line up as much as possible and that namespaces do not have to be used. Allow extensions to occur in the wild without forcing them into their own namespaces.

I have already blogged about REST being the underlying model of the semantic web. Programs exchange data using a standard set of verbs and content types (i.e. ontologies). All program state is demarcated into resources that are represented in one of the standard forms and operated on using standard methods.

This is a new layer in modern software practice. It is the information layer. Below it is typically an object layer, then a modular or functional layer for implementation of methods within an object. The information layer is crucial because while those layers below work well within a particular product and particular version, they do not work well between versions of a particular product or between products produced by different vendors. The information layer described by the REST principles is known to scale across agency boundaries. It is known to support forwards- and backwards-compatible evolution of interaction over time and space.

I think that the microformats model sets the basic preconditions under which standardisation of content type can be achieved, and thus the preconditions under which the semantic web can be established:

  1. There must be sufficient examples of content available, produced without reference to any standard. These must be based on application need only, and must imply a common schema.
  2. There must be sufficient examples of existing attempts to standardise within the problem space. No one is smart enough to get it right the first time, and relying on experience with the earlier attempts is a necessary facet of getting it right next time.

I think there need to be in the order of a dozen diverse examples from which an implied schema is extracted, and I think in the order of half a dozen existing formats. The source documents are likely to be extracted from thousands in order to achieve an appropriately diverse set. This means that there is a fixed minimum scale to machine-to-machine information transfer on the inter-agency Internet scale that can't be forced or worked around. Need is not sufficient to produce a viable standard.

My predictions about the semantic web:

  1. The semantic web will be about network effects relating to data which is already published with an implied schema
  2. Information that is of an obscure and ad hoc nature or structure will continue to be excluded from machine understanding
  3. The semantic web will spawn from the microformats effort rather than any RDF-related effort.
  4. The nature of machine understanding will have to be simplified in order for the semantic web to be accepted for what it is, at least for the first twenty years or so

RDF really isn't the cornerstone of the semantic web. RDF is too closely aligned to artificial intelligence and high ideals as to how information can be reasoned with generically to be really useful as an information exchange mechanism. Machine understanding will have to be accepted as something which relies primarily on human understanding in the future. It will be more about which widget a program puts a particular data element into than what other data it can infer automatically from the information at hand. One is simply useful. The other is a dead end.

The semantic web is here today, with or without RDF. Even when simple HTML is exchanged, clients and servers understand each other's notations about paragraph marks and other information. The level of semantics that can be exchanged fundamentally relies on a critical mass of people and machines implicitly exchanging those semantics before standardisation and shared understanding begin. The microformat community is right: Chase identification, time and date, and location. Those semantics are huge and enough formats exist already to pick from. The next round of semantics may have to wait another ten or twenty years, until more examples with their implied schemas have been built up.

Benjamin

Sat, 2006-Sep-30

Publish/Subscribe and XMPP

I have a long-standing interest in publish/subscribe protocols and technologies. In the proprietary system I work with professionally, publish/subscribe is the cornerstone of realtime data collection. Client machines are capable of displaying updates from monitored field equipment in latencies measured according to the speed of light, plus processing delays.

My implementation is proprietary, so I have long been keeping an eye out for promising standards and research that may emerge into something positive. The solution must be architecturally sound. In particular, it should be scalable to the size of the Internet. I have some thoughts about this which mainly stem back to the GENA protocol, Rohit Khare's dissertation Extending the REpresentational State Transfer Architectural Style for Decentralized Systems, and my responses to it: Consensus on the Internet Scale, The Estimated Web, Routed REST, REST Trust Relationships, Infinite Buffering, and Use of HTTP verbs in ARREST architectural style.

I like the direct client to server nature of HTTP. You figure out who to connect to using DNS, then make a direct TCP/IP connection. Or indirect. For scalability purposes you can introduce intermediaries. These intermediaries are not confused about their role. It is to direct traffic on to the origin server. Sometimes this involves additional intermediaries, however these proxies are not expected to explicitly route data. That is a job for the network.

XMPP takes an instant-messenger approach to communications. JEP-0060 specifies a publish/subscribe mechanism for the XMPP protocol that apparently is seeing use as a transport for atom to notify interested parties when news feeds are updated. I don't mind saying that the fundamental architecture irks me. Instead of talking directly to an end server or being transparently pushed through layers that improve network performance, we start out with the assumption that we are talking to a XMPP server. This server could be anywhere. Chances are that unlike your web proxy, it is not being hosted by your ISP. Instead of measuring the request in terms of the speed of light between source and destination plus processing delays, we need to consider the speed of light and processing delays across a disorganised mishmash of servers from here to Antarctica. XMPP itself also appears to be a poor match to the REST architectural style. On the face of it, XMPP appears to have confusing identifier schemes, nouns, content types, and a mish-mash of associated standards and extensions that remind me more of the WS-* stack than of specifications or software stacks that are still used by the generation that follows their specifiers.

Nevertheless, GENA is dead outside of UPnP. The internet drafts submitted by Microsoft to the IETF don't match up with the specification that forms part of UPnP. Neither specification matches up to GENA implementations I have seen in the wild. I think that the fundamental reason for this is not that HTTP forms a poor transport for subscription at a base technological level, but that firewalls are generally set up to make requests back from HTTP servers impossible as part of a subscription mechanism. As such, a protocol that already supports bidirectional communication and is acceptable to firewalls yields a better chance of ongoing success. For the moment, it is a technology that works on the small scale and in the wild Internet today. Perhaps from that seed the organisational issue between servers will simply work itself out as the technology and associated traffic volume becomes more substantial and more important. After all, the web itself did not start out as the well-oiled, reliable, and high-performance machine it is today.

So, it seems reasonable that when it comes to rolling out a standards-based subscription mechanism today that JEP-0060 should be the preferred option ahead of trying to define and promote a HTTP-based specification. That said, there are a number of principles that must be transferrable to this XMPP-based solution:

In good RESTful style, subscriptions transfer a summarised sequence of the states of a resource. The first such state is the resource's state at the time the subscription request was received. This allows the state of the resource to be mirrored within a client and for the client to respond to changes in the resource's state. However it is reasonable to also consider subscription to transient data that is never retained as application state in any resource. This data has a null initial state, no matter when it is subscribed to.

Working through the XMPP protocol adds a great deal of complexity to the subscription relationship. Intermediaries handle the subscription, so they must also handle authorisation and other issues normally left out of the protocol to be handled within the origin server. In XMPP, the subscription effectively becomes a channel that certain users have a voice in and that other users can receive messages from. My expertise is very thin about XMPP, but on the face of things it appears that subscription data is routed through a server that manages the particular channel, the pubsub service. Perhaps this service could be replaced with an origin server if that was desired.

In terms of matching up with my expectations of a subscription service, well... localised resynchronisation and patch updates can both be supported, but not at the same time. The pubsub service can forward the last message to a new subscriber. If that message contains the entire state of the resource, the client is synchronised. If it is a patch update, the client cannot synchronise. There does not appear to be a way to negotiate or inform the client of the nature of the update. "Message" appears to be the only recognised semantic. This is understandable, I suppose, and fits at least a niche of what a pubsub system can be expected to do.

Summarisation seems to be on the cards only at the edge of the network (i.e. the origin server). This is probably the best place for summarisation, however the lack of differential flow control is a concern. The server appears to simply send messages to the pubsub service at the rate that service can accept them. What happens from there is not clearly cemented in my mind. Either the rate is slowed to meet the slowest client, messages are buffered infinitely (until the pubsub service crashes), or messages are buffered to a set limit and messages or clients are dropped past that point. There doesn't seem to be any way of reporting flow control back to the origin server in order to shape the summarisation activity at that point. If message dropping is occurring in the pubsub service then this should be more explicit. Other forms of summarisation may be preferable to the wholesale discard of arbitrary messages.

JEP-0060 is long (really long) and full of inane examples. It is difficult to get a feel for what problems it does and does not solve. It doesn't contain text like "flow control", "loss", "missed", "sequence", "drop"... anything recognisable as how the subscription model relates to the underlying transport's guarantees. Every time I look through it I feel like crying. Perhaps I am just missing the point, but when it comes to internet-scale subscription I don't think this document puts a standards-based solution in play.

I need to be able to synchronise the state of a resource. I need the subscription mechanism to handle exceptional load or high latency situations effectively. I need it to be able to deal with thousands of changes per second across a disparate client base even in my small example. On the Internet I expect it to deal with millions or billions of changes per second. Will a jabber-style network handle that kind of load without breaking client service guarantees? How are overflow conditions handled? Can messages be lost, reordered, or summarised? Are messages self-descriptive enough to allow summarisation by the pubsub server?

Perhaps I should go and pen an internet draft after all. GENA isn't that far off the mark, and really does work effectively when no firewalls are in the way. Perhaps it would be a useful mechanism to reliably and safely transfer data between jabber pubsub islands.

Benjamin

Sat, 2006-Sep-30

Using TCP Keepalive for Client Failover

I covered recently my foray into using mechanisms that are as standard as possible between client and server to facilitate a fixed-period failover time. A client may have a request outstanding and may be waiting for a response. A client may have subscriptions outstanding to the server. Even a server that transfers its IP or MAC address to its backup during failover does not completely isolate its clients from the failover process. Failover and server restart both cause a loss of the state of the server's TCP/IP stack. When that happens, clients must detect it in order to successfully move their processing to the new server instance.

I had originally pooh-poohed TCP/IP keepalive as a limited option. Most (all?) operating systems that support keepalive use system-wide timeout settings, so values can't be tuned based on who you are talking to. I think this might be overcome by Solaris zones, however. Also, the failover characteristics of a particular host with respect to the services it talks to are often similar enough that this is not a problem.

I want to keep end-to-end pinging to a minimum, so I only want keepalive to be turned on while a client has requests outstanding. An idle connection should not generate traffic. Interestingly, this seems to be possible by using the SO_KEEPALIVE socket option. It should be possible to turn keepalive on when a request is sent, and turn it back off again when the last outstanding response is received. In the meantime the active TCP/IP connection will often be sending data, so keepalives will most often be sent during network lull times while the server is taking time processing.
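A rough sketch of that per-connection toggle using Python's socket module; the probe timing itself remains a system-wide kernel setting, as noted above, and the host and request are hypothetical.

    import socket

    def set_keepalive(sock, enabled):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE,
                        1 if enabled else 0)

    sock = socket.create_connection(("server.example.com", 8080))

    set_keepalive(sock, True)    # a request is outstanding: watch the peer
    sock.sendall(b"GET /status HTTP/1.1\r\nHost: server.example.com\r\n\r\n")
    reply = sock.recv(65536)     # last outstanding response has arrived
    set_keepalive(sock, False)   # connection is idle again: stop probing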

If I want my four second failover, it should just be a matter of setting the appropriate kernel variables so that keepalive probes are sent every second or so and the connection is given up after a corresponding number of failures. Combined with IP-level server failover, and subscriptions that are persistent across the failover, this provides a consistent failover experience with a minimum of network load.

Benjamin

Sat, 2006-Sep-30

Common REST Questions

I just came across a blog entry that includes a number of common misconceptions and questions about REST.

I posted a response in comments, but I thought I might repeat it here also:

RESTwiki contains some useful information on how REST models things differently to Object-Orientation. Also, see the REST Wikipedia article, which sums some aspects of REST up nicely.

The core of prevailing REST philosophy is the REST triangle, where the naming of resources is separated from the set of operations that can be performed on resources, and again from the kinds of information representations at those resources. Verbs and content types must be standard if messages are to be self-descriptive, and the requirements of the REST style met. Also, there should be no crossover between the corners of the REST triangle. Names should not be found in verbs or content types, except as hyperlinks. Content should not be found in names or verbs. Verbs should not be found in names or content.

REST can be seen as a document-oriented subset of Object-Orientation. It deliberately reduces the expressiveness of Objects down to the capabilities of resources to ensure compatibility and interoperability between components of the architecture. Object-Orientation allows too great a scope of variation for internet-scale software systems such as the world-wide-web to develop, and doesn't evolve well as demands on the feature set change. REST is Object-Orientation that works between agencies, between opposing interests. For that you need to make compromises rather than doing things your own way.

Now, to address your example:

Verbs should not be part of the noun-space, so your "void" and "reverse" urls should not be things you POST to. They should demarcate the "void" state and the "reverse" state of your journal entry. When you GET the void URL it should return the text/plain "true" if the transaction is void and "false" if the transaction is not void. A PUT of the text/plain "true" will void the transaction, possibly impacting the state demarcated by other resources. Reverse is similar. The URL should be "reversal" rather than "reverse". It should return the url of the reversing transaction, or indicate 404 Not Found to show no reversal. A PUT to the reversal resource would return 201 Created and further GETs would show the reversal transaction.
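As a rough sketch of that void flow, with hypothetical journal-entry URLs:

    import http.client

    VOID = "/journal/entry/12/void"
    conn = http.client.HTTPConnection("accounts.example.com")

    conn.request("GET", VOID)
    response = conn.getresponse()
    print(response.read().decode().strip())   # "false": not yet void

    # Voiding the entry is a state transfer, not a verb in the URL.
    conn.request("PUT", VOID, body="true",
                 headers={"Content-Type": "text/plain"})
    response = conn.getresponse()
    response.read()
    print(response.status)
    conn.close()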

Creation in REST is simple. Either the client knows the URL of the resource they want to create and PUTs the resource's state to that URL, or the client requests that a factory resource add the state it provides to itself. The factory is designed to either append the state provided or create a new resource to demarcate the new state. POST is more common. The PUT approach requires clients to know something about the namespace that they often shouldn't know outside of some kind of test environment.

On swapping: This is something of an edge case, and this sort of thing comes up less often than you think when you are designing RESTfully from the start. The canonical approach would be to include the position of the resource as part of its content. PUTting over the top of that position would move it. This is messy because it crosses between noun and content spaces. Introducing a SWAP operation is also a problem. HTTP operates on a single resource, so there is no unmunged way to issue a SWAP request. Any such SWAP request would have to assume both of the resources of the unordered list are held by the same server, or that the server of one of these resources was able to operate on the ordered list.

On transactions: The CRUD verb analogy is something of a bane for REST. I prefer cut-and-paste. Interestingly, cut-and-paste on the desktop is quite RESTful. A small number of verbs are able to transfer information in a small number of widely-understood formats from one application to another. The cursor identifies and demarcates the information that will be COPIED (GET) or CUT (GET + DELETE) and the position where the information or state will be PASTED to (PUT to paste over, POST to paste after). The CRUD analogy leaves us wondering how to do transactions, but with the cut-and-paste analogy the answer is obvious: Don't.

In REST, updates are almost universally atomic. You do everything you need to do atomically in a single request, rather than trying to spread it out over several requests and having to add transaction semantics. If you can't see how to do without transactions you are probably applying REST at a lower level than it is typically applied. In this example, whenever you post a new journal entry you do so as a single operation: POST a complete representation of the journal entry to a factory resource.

That is not to say that REST can't do transactions. Just POST to a transaction factory resource, perform several POSTS to the transaction that was created, then DELETE (roll-back) or POST a commit marker to the transaction.
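A rough sketch of that pattern, with hypothetical URLs and body formats:

    import http.client

    conn = http.client.HTTPConnection("accounts.example.com")

    # Create the transaction via a factory resource (assume Location is a path).
    conn.request("POST", "/transactions", body="")
    response = conn.getresponse()
    response.read()
    txn = response.getheader("Location") or "/transactions/123"

    # Add work to the transaction.
    conn.request("POST", txn, body="debit=50&account=cash",
                 headers={"Content-Type": "application/x-www-form-urlencoded"})
    conn.getresponse().read()

    # Commit by POSTing a commit marker to the transaction
    # (a DELETE of the transaction would roll it back instead).
    conn.request("POST", txn, body="commit",
                 headers={"Content-Type": "text/plain"})
    print(conn.getresponse().status)
    conn.close()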

How REST maps to objects is up to the implementation. You can evolve your objects independently of the namespace, which is expected to remain stable forever once clients start to use it. The URI space is not a map of your objects, it is a virtual view of the state of your application. Resources are not required or even expected to map directly onto objects. One method of a resource may operate on one object but another may operate on a different object. This is especially the case when state is being created or destroyed.

REST is about modelling the state of your application as resources, then operating on that virtualised state using state transfer methods rather than arbitrary methods with arbitrary parameter lists. REST advocates such as myself will claim this has significant benefits, but I'll refer you to the literature (especially the wikipedia page) rather than list them here.

Benjamin

http://soundadvice.id.au/blog/

Mon, 2006-Sep-18

High Availability at the Network Level

I have been reading up over the last week on an area of my knowledge that is sorely lacking. Despite being deeply involved in the architecture of a high-availability distributed software system, I don't have a good understanding of how high availability can be provided at the network level. Given the cost of these solutions in the past and the relatively small scale of the systems I have worked with, we have generally considered the network to be a dumb transport. Deciding which network interface of which server to talk to has been a problem for the application layer. A typical allowable failover time in this architecture would be four seconds (4s). This is achieved with end-to-end pinging between the servers to ensure they are all up, and from clients to their selected server to ensure it is still handling their requests and subscriptions.

The book I have been reading is a Cisco title, Building Resilient IP Networks. It has improved my understanding of layer 2 switching, an area which has changed significantly since I last really looked at networking. Back then, the hub was still king. It also delved into features for supporting host redundancy, including NIC teaming, clustering, and the combination of Server Load Balancing and a Global Site Selector.

It may be just my lack of imagination, but the book seems to get tantalisingly close to a complete solution for the kinds of systems I build. Just not quite there. It talks about clustering and NIC teaming within a single access module, and that offers at least a half-way solution. It seems you could add a VLAN out to another site (thus another access module) for disaster recovery, but without offering a clear alternative the book repeatedly warns against such an architecture.

So, I have three servers. Two are at my main site. One is at my disaster recovery site. I can issue pings using layer three protocols, so I don't strictly need my servers to be on the same subnet. However, I need my clients to fail over from one to the other within a fixed period after any single point failure. It looks like I need IP address takeover between the sites to solve my failover problem at the network level.

The DNS-based Global Site Selector option discussed in the book is fine if we want the failover to affect only new clients. Old clients will retain cached DNS records, and may not issue another DNS query for requests that are still pending. Issuing a mass DNS cache expiry multicast or using very short DNS cache periods both seem like poor options. Ideally we would contain the failover event within the cluster and its immediate network somehow.

A routing-based failover solution might allow a floating IP address to be taken over by a different node within the cluster as failover occurs. For this to occur we would need a fast-converging OSPF network that allowed a single IP to be served from multiple sites. Failover of connections would be handled at the OSPF level. This solution (if implementable) would have similar characteristics to any multiple-site VLAN solution based on RSTP. The problem remains in either case of clients that are already in particular communication states with the failed server.

A current client may be either part-way through issuing a request, or may be holding a subscription to resources at a server. If the client is to reissue its request to the new server after a failover, the client can wait only as long as the failover time before declaring the original request failed and in an unknown state of completion. The maximum request processing time on the server is therefore bounded by the failover time, less the network latency between client and server.

An alternative to timing out when the failover time is reached would be to sample the state of the connection or request at a rate faster than that of the failover time. If your failover time is four seconds (4s), you could sample the state every three seconds, or two, or one. A timeout would not be necessary if the sampling indicated that the request was still being processed.
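
As a rough illustration of the arithmetic, assuming a 4s failover budget: the client probes the state of its outstanding request on a shorter period and only declares failure when the probe says the request is unknown. The connection and probe objects below are hypothetical placeholders, not any real API.

FAILOVER_TIME = 4.0                  # seconds allowed for failover in this architecture
SAMPLE_PERIOD = FAILOVER_TIME / 2    # probe comfortably faster than the failover time

def await_response(connection, probe_request_state):
    # connection and probe_request_state are hypothetical stand-ins for
    # whatever client library is actually in use.
    while True:
        response = connection.wait_for_response(timeout=SAMPLE_PERIOD)
        if response is not None:
            return response
        if probe_request_state() == "unknown":
            # The server (or its replacement) has no record of the request.
            raise TimeoutError("request lost; reissue it or report failure")
        # Otherwise the request is still being processed: keep waiting.
        # No fixed request timeout is needed while the probe keeps succeeding.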

The sampling itself could come in the form of a pipelined "PING" request queued up behind the main request. Whenever the transmit queue is nonempty on the client side, TCP will retransmit packets on an exponential back-off strategy. So long as routes to the new server of the cluster IP address are established before too many packets are lost, the new server should respond indicating that it doesn't know about the connection. Another option would be to employ a dedicated request-state sampling protocol or to craft specially-designed TCP packets for transmission to sample the state.

Subscription is a problem in its own right. The server has agreed to report changes to a particular input as they come in; however, the server may fail. The client must therefore sample the state of its subscriptions at a rate faster than the failover time if it is to detect failure, and issue requests to the new server to reestablish the subscription. This again is an intensive process that we would rather do without. One solution is to persist subscription state across failover. Clients should not receive an acknowledgement to their subscription requests until news of the subscription request has spread sufficiently far and wide.

Both the outstanding request and outstanding subscription client states can be resolved through these mechanisms when the server is behaving itself. However, there is the possibility that an outstanding request will never return. Likewise there is the possibility that a client's subscription state will be lost. For this reason, outstanding requests do demand an eventual timeout. Accordingly, outstanding subscriptions do need to be periodically sampled and renewed. These periods can be much longer than the failover period.

Clustering and network-based solutions can be expensive, but they can also provide scalable failover solutions for the service of new clients. Existing clients still need some belts and straps to ensure they fail over safely to the new server.

My high-availability operating-system support wishlist:

Benjamin

Sat, 2006-Sep-02

REST Triangle, URLConstruction, RESTfulDesign

I have put a few more draft documents up on restwiki:

The article names perhaps don't quite do them justice. The REST Triangle article is about how REST decouples various problem domains from each other to be solved separately. It lays out what those problem domains are, what the purpose of each problem domain is, and why you should avoid crossover between the problem domains. URL Construction is something of a splinter discussion about why URLs shouldn't be used to convey information to clients, and why clients shouldn't construct URLs.

RESTful design is my summary of how to design a RESTful interface. It is based on Object-Oriented design, at least until you get past the resource definition into the definition of hyperlinks and content schemas. I think the information there is good, although I haven't included a specific case study as yet. Also, restwiki doesn't look like it supports images. It is a bit hard to convey some of the diagramming that should go on in this kind of design without image support.

All documents are subject to future change, and your mileage may vary. Good luck, and I hope they mean something to you.

Benjamin

Sun, 2006-Aug-20

Minimum Methods

I'm working to increase the amount of good information available for REST proponents to point to and use, and to help those new to REST understand it. There is a lot of half-way information on REST. Much of this information ignores the REST that Fielding describes, and replaces it with one that the author considers workable from their particular persuasion. I'm more of an idealist. I figure that when you are breaking REST tenets, you should at least know how far you are from those tenets.

As such, I have written up a new RESTwiki article as part of my tutorial series: MinimumMethods. This covers the minimal and RESTful use of the classic four REST verbs. I base my analogies around cut-and-paste verbs, rather than CRUD verbs, and I even have ascii art to demonstrate flow of state.

My intention is to round out this series of articles with one solely on RESTful design, but that probably won't happen this weekend.

Benjamin

Sun, 2006-Aug-20

Which Content Type

In my continuing effort to communicate what REST is about, I have written my first draft of the restwiki article, WhichContentType. Based on the lessons of my REST Tutorial, this is step number four: getting content types right is the last stage of getting REST right.

It is interesting to view REST through the list of lessons, because it soon becomes apparent that REST is a way of trying to run out of things to disagree on. In protocol development you must agree on a lot before you can exchange useful data. REST says that before you develop any new interface between two pieces of software you should already have:

  1. A uniform identifier scheme,
  2. A uniform abstraction layer that your software will use to access the interface, and
  3. A uniform understanding of a limited set of methods

Combined with a good basic REST protocol such as HTTP, you end up at a place where everything but the set of urls and content types can be defined solely on generic technical merits.

The REST approach to content types is that there should be a constrained set of those, too. Every important content type you have should either be a standard, or you should be in the process of developing it into a standard. When enough standard types exist to do 'most everything that you would like to do, your only point of potential disagreement becomes the set of urls you provide. The structure of URLs doesn't matter, so long as they are stable. That means that we have nothing left to disagree on!

REST pushes all of the social aspects of protocol development into the content-type space, and then challenges you to solve the content-type problem once and for all for your industry or your application type. REST claims that it is easier to come to a social agreement about content types (especially text-based content-types), because they are so easy to extend and to ignore extensions on. Consumers just look for what they understand. Producers insert as much information as they think will be understood. Meeting at that functionality sweet spot in the middle is the social problem you are solving.

Benjamin

Sun, 2006-Aug-13

REST Tutorial

I have written up my first draft of a REST Tutorial.

This provides a semi-practical description of how you can get from a non-RESTful distributed object environment to a RESTful one in four easy or not so easy lessons. It is based on my experiences so far doing this for my employer. It isn't about the politics of the migration; that is a whole 'nother issue. It isn't about the specific technical issues, like how you would set up a web server or what software to use. It is about how to set up uniform identifiers, a uniform resource abstraction, standard methods, and standard content types.

I have a little blurb about how I have been doing REST design, but I haven't added diagrams as of yet. The article should be considered live, and I intend to adjust and expand it from time to time as feedback emerges.

Benjamin

Sat, 2006-Aug-12

Experiments with DNS SRV records

I thought I would dedicate a portion of my weekend to testing the capabilities that DNS SRV records could provide for IPC systems. I am running Debian Linux with version 8.4.6 of bind.

Step 1: Install BIND

# apt-get install bind

Easy enough

Step 2: Manually add an SRV record

# vi /etc/bind/db.local
;
; BIND data file for local loopback interface
;
$TTL    604800
@       IN      SOA     localhost. root.localhost. (
                              1         ; Serial
                         604800         ; Refresh
                          86400         ; Retry
                        2419200         ; Expire
                         604800 )       ; Negative Cache TTL
;
@       IN      NS      localhost.
@       IN      A       127.0.0.1
_http._tcp.fuzzy.localhost. IN  SRV     10 0 8080 localhost.

That gives me an http SRV record with the domain name "fuzzy.localhost.". It points to 127.0.0.1:8080. I can now run my own little web server at that location, and it won't conflict with the web server running at port 80, nor with any other user's web server. Theoretically, that means I can run any number of little http-speaking applications on behalf of any number of users on this machine.
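
Here is roughly what an SRV-aware client could do with that record, using the third-party dnspython library pointed at the local BIND; no stock HTTP client does this for you today.

import http.client
import dns.resolver                 # third-party: dnspython

resolver = dns.resolver.Resolver()
resolver.nameservers = ["127.0.0.1"]

# resolve() in recent dnspython; older versions call this query().
answers = resolver.resolve("_http._tcp.fuzzy.localhost", "SRV")
record = sorted(answers, key=lambda r: (r.priority, -r.weight))[0]
host, port = str(record.target).rstrip("."), record.port

connection = http.client.HTTPConnection(host, port)
connection.request("GET", "/")
print(connection.getresponse().status)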

There are two problems with this. The first is that you want to dynamically assign ports, rather than manage them centrally. The second is that the use of SRV records has not been defined for http, though attempts have been made to do so. Firefox does not currently support SRV records. Someone will have to work on that :)

Hopefully the dynamic assignment issue can be sorted out, though.

Step 3: Dynamic assignment

I used Painless DDNS as my guide, but had to do a few version-specific tweaks.

# vi /etc/bind/named.conf.local
//
// Add local zone definitions here.

include "/etc/bind/keys.conf";

zone "fuzzy.localhost" {
        type master;
        file "/etc/bind/db.local.fuzzy";
        allow-update {
                key fuzzy.localhost.;
        };
};

$ dnskeygen -H 512 -u -n fuzzy.localhost.
# vi /etc/bind/keys.conf
key fuzzy.localhost. {
        algorithm HMAC-MD5;
        secret "svi6dhhSrwpcsfTivW67ruC9itm3DeGutpp0uNj1HTJGHVWl/Y/BUqwVEM0NE/S2gq8DENAXFaT7RSh3D4Fvxg==";
};
# vi /etc/bind/db.local.fuzzy
;
; BIND data file for user fuzzy on localhost
;
$TTL    604800
@       IN      SOA     fuzzy.localhost. fuzzy.localhost. (
                              1         ; Serial
                         604800         ; Refresh
                          86400         ; Retry
                        2419200         ; Expire
                         604800 )       ; Negative Cache TTL
;
@       IN      NS      fuzzy.localhost.
@       IN      A       127.0.0.1

Now the server is ready to go. We have set up a single user who can assign services to their sub-domain of localhost. In a real IPC setup we would probably have this done automatically or implicitly for the set of users that should be permitted to offer services to themselves, to the machine, and to the world.

The last step is to actually perform the updates:

$ nsupdate -k Kfuzzy.localhost.+157+00000.private
> server localhost
> zone fuzzy.localhost
> update add _http._tcp.fuzzy.localhost. 86400 SRV     10 0 8080 fuzzy.localhost.
> show
Outgoing update query:
;; ->>HEADER<<- opcode: UPDATE, status: NOERROR, id:      0
;; flags: ; ZONE: 0, PREREQ: 0, UPDATE: 0, ADDITIONAL: 0
;; UPDATE SECTION:
_http._tcp.fuzzy.localhost. 86400 IN    SRV     10 0 8080 fuzzy.localhost.
> send
> ^D

And to prove it works:

$ dig @localhost _http._tcp.fuzzy.localhost -t srv
; <<>> DiG 9.3.2 <<>> @localhost _http._tcp.fuzzy.localhost -t srv
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22814
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1

;; QUESTION SECTION:
;_http._tcp.fuzzy.localhost.    IN      SRV

;; ANSWER SECTION:
_http._tcp.fuzzy.localhost. 86400 IN    SRV     10 0 8080 fuzzy.localhost.

;; AUTHORITY SECTION:
fuzzy.localhost.        604800  IN      NS      fuzzy.localhost.

;; ADDITIONAL SECTION:
fuzzy.localhost.        604800  IN      A       127.0.0.1

;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sat Aug 12 09:44:46 2006
;; MSG SIZE  rcvd: 109

Benjamin

Sat, 2006-Jul-22

Defining Object-Orientation (and REST)

Andrae muses about what object-oriented programming is, and comes to a language theorist's conclusion:

Object Oriented Programming is any programming based on a combination of subtype polymorphism and open recursion

I'll take a RESTafarian stab at it:

Object-Oriented Programming divides application state into objects. Each object understands a set of functions and corresponding parameter lists (its interface type) that can be used to access and manipulate the subset of application state it selects.

Objects with similar functions can be accessed without knowing the precise object type, either through knowledge of an inherited interface type or by direct sampling of the set of functions the object understands.

Here is my corresponding definition of REST Programming

REST Programming divides application state into resources. Each resource understands a set of representations of the state it selects, and a standard set of methods that can be used to access and manipulate its state using those representations. The representation types themselves are selected from a constrained set.

All resources have similar functions, and in pure REST all have similar representation types (content types). This means that all resources can be accessed without knowing the precise resource type.

Object-Orientation defines different types for different objects, and must then consider mechanisms such as introspection to discover type information and interact with an unknown object. REST defines one type for all objects. The one type is used regardless of application, industry or industry sector, or other differentiating factor. The goal of the uniform interface is that no client-side code needs to be written to support any particular new application or application type. New applications are accessed through the existing uniform interface using existing tools.

In this pure model, a single browser program can access banking sites or sports results. It can access search engines, or browse message group archives. The principle is that information is pushed around in forms that everyone understands. No new methods are required to access or manipulate the information. No new content types are introduced to deal with new data.

Benjamin

Sun, 2006-Jul-09

URI Reference Templating

I think Mark has got it partly wrong. I think that his Link-Template header needs to be collapsed back into the HTTP Link header, and I think that his URI templating should be collapsed back into the URI Reference. Let me explain:

It is rarely OK for a client to infer the existence of one resource from the fact that another exists. A client should only look up resources it is configured with, resources it has seen hyperlinks to, and resources that have come out of some form of server-provided forms or templating facility. At the moment, our templating facilities centre around HTML forms and XForms. Is this in need of some tweaking?

We already have a separation between a URI and a URI reference. A URI is what you send over the wire to your HTTP server. A URI reference tells you how to construct the URI, and what to do with the result returned from the server. Consider the URI reference <http://example.com/mypage#heading1>. This tells us to construct and send a request with url <http://example.com/mypage> to the server, and look for the "heading1" tag in the returned document. The exact processing of the "heading1" fragment depends on the kind of data returned from the server, and the kind of application the client is. A browser will show the whole document, but navigate to the identified tag. Another client might snip out the "heading1" tag and its descendants for further processing.
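
That split is mechanical, and most URI libraries already perform it. In Python, for example:

from urllib.parse import urldefrag

uri, fragment = urldefrag("http://example.com/mypage#heading1")
print(uri)        # http://example.com/mypage -- what is actually sent to the server
print(fragment)   # heading1 -- interpreted client-side against the returned document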

Mark Nottingham has proposed a reintroduction of and enhancement to HTTP's link header. In his draft, he suggests a templating mechanism for relationships between documents. He proposes a Link header for URI references, and a separate Link-Template header that allows for templating. He defers definition of how the template is expanded out to the definition of particular link types.

Danny is unsure of the use cases for link templating. I'm not sure about templating of the HTTP link header, although I'm sure Mark has some specific cases in mind. I have at least one use case for a broader definition of URI templating, and I am not sure that the link header is the right place to specify it or the link type the place to specify how it is expanded. I wrote to Mark last week commenting on how useful it would be if his hinclude effort could be coupled with a templating mechanism:

Consider a commerce website built around the shopping cart model. If I am a user of that site, I may have urls that only relate to me. I may have my shopping cart at <http://example.com/shoppingcart/benjamin>. My name would form part of the url for the purpose of context. Urls specific to me are not the only ones that need to return me personalised content, however. Consider a product url such as <http://example.com/products/toaster>. That page may contain a link to my shopping cart or even include a mini-checkout facility as part of the page, and may include useful customer-specific information for my convenience.

The server could return a static product page to the user. The client side could rewrite the hinclude src attribute before issuing the include request. This rewrite could take into account contextual information such as a username to ensure that the static page ended up being rendered with my own customisations included.

I think that perhaps the right technical place to specify such a mechanism is as part of the URI reference. It would remain illegal to include "{" or "}" characters in a URL; however, a URI reference would allow their inclusion. All substitutions must be completed as part of the transformation from a URI reference to a URI. This process would use context-specific information to be defined by individual specifications, such as HTML or HTTP. HTML would likely have a means of transferring information from javascript or directly from cookies into the templating facility. Other contexts may have other means of providing the templating. If no specification is available for a particular context on how to perform the transformation, the use of curly braces in a URI reference would effectively be illegal.
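
A toy sketch of that transformation, assuming a simple {name} syntax and a context dictionary supplied by whatever specification governs the reference; both the syntax and the context mechanism are speculative.

import re
from urllib.parse import quote

def expand(uri_reference, context):
    # Turn a templated URI reference into a URI, or fail if context is missing.
    def substitute(match):
        key = match.group(1)
        if key not in context:
            raise ValueError("no context available for {%s}" % key)
        return quote(str(context[key]), safe="")
    return re.sub(r"\{([^}]+)\}", substitute, uri_reference)

print(expand("http://example.com/shoppingcart/{username}", {"username": "benjamin"}))
# http://example.com/shoppingcart/benjamin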

As for actually including the curlies into a URI rfc, I understand that might be taking the bull very firmly by the horns. Perhaps the notion of a templated URI reference to eventually be merged with the general URI reference would be the right track. One thing I don't think will be ideal is the specification of a separate Link-Template header to deal with these special URI references. There may be technical reasons beyond my current comprehension, but I think it would lead to a long term maintainability problem with respect to URIs and their references.

Benjamin

Sat, 2006-Jul-08

Platform vs Architecture

One of the challenges of working with older software is obsolescence. It is a challenge that I face in my professional life, and it appears to be something that is affecting the open source world. I write software in C++. Core GNOME developers write their software in C. We would all love to offer our platform in languages programmers commonly work with today. GNOME offers some of its libraries to languages other than C through bindings. This can be useful, but for some technically good reasons developers in Java like "pure Java" code. This can be true of other languages as well. Language bindings themselves can be a problem. Maintaining an interface to a C library in a way that makes sense in python is a full-time job in itself.

So what do we do with ourselves when the software we write doesn't fit into how people want to use it? What options do we have, and how do we maintain a useful software base as languages and technologies come and go?

To get the game into full swing, I would like to separate the notions of platform and architecture. For the purposes of this entry I'll define platform as the software you link into your process to make your process do what it should. I'll define architecture as the way different processes interact to form a cohesive whole. Within a process you need the platform to integrate pretty naturally with the developer's code. Defined protocols can be used between processes to reduce coupling, and reduce the need for direct language bindings. From those base assumptions and definitions, whatever software we can keep out of the process is not going to have to sway with the breeze of how that process is implemented. This extricated software is a part of the architecture we can keep, no matter what platform we use to implement it.

The closest link I can find for the moment is this one, but discussion has cropped up from time to time over the last few years. It centres on whether Gnome is a software platform, or simply a set of specifications. Are the parts of Gnome going to be reimplemented in various languages, each with their own bugs and quirks? Will that be good for the platform or bad? Should these implementations be considered competing products, or parts of the same product?

The simplest answer for now is to sidestep the question. There are two approaches that allow us to do this, which I would characterise as model-driven or lesscode approaches. The model-driven approach involves taking what is common between the various implementations, and defining it in a platform neutral way. This can often and easily be done with the reading of configuration files or of network input. You define this model once, and provide individual mappings into the different platforms. These mappings may still be expensive to maintain, but it would allow developers to keep working on "common code" when it comes to real applications. A working example of this in the gnome platform is glade. Various implementations or language bindings for "libglade" can be created, or the widget hierarchy model can be transformed into platform-specific code directly.

Lesscode is an approach where we make architectural decisions that reduce the amount of platform-specific code we need to implement. Instead of trying to map a library that implements a particular feature into your process, split it out into another process. Do it in a way that is easy to interact with without having to write a lot of code on your side of the fence. The goal is to write less code overall, include less platform code, and implement more functions while we are at it.

While lesscode is something of an ideal, the tools are already with us. Instead of using an object-oriented interfacing technology, consider using REST. Map every object in the architecture to a URI. Now you only have to implement identifier handling once. Access every object in the architecture using a resource abstraction, such as a pure virtual C++ class or Java Interface. Find these resources through a resource finder abstraction.
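
The abstraction can be tiny. Here is one possible shape, sketched in Python for brevity (the same shape works as a pure virtual C++ class or a Java interface); the names are mine, not any particular library's.

from abc import ABC, abstractmethod

class Resource(ABC):
    """The one interface every object in the architecture is reached through."""

    @abstractmethod
    def get(self):
        """Return (content_type, document) for the state this resource demarcates."""

    @abstractmethod
    def put(self, content_type, document):
        """Replace the demarcated state with the supplied document."""

class ResourceFinder(ABC):
    """Maps a URL onto a Resource, hiding which protocol is spoken underneath."""

    @abstractmethod
    def find(self, url) -> Resource:
        ...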

What this does is put everyone on a common simple playing field. You no longer have to worry about which protocol is spoken at the application-level. Your platform reads the url scheme and maps your requests appropriately. The uniform interface means you only have to interface to one baseclass, not multiple libraries and baseclasses. The platform concept is transformed from an implementation technology into an interfacing technology.

Implementing REST in your system is not sufficient. GNOME is composed of a number of important libraries, not the least of which is gtk+ itself. Perhaps it is time to rearchitect, taking a leaf out of the web browser's book. Perhaps we should have a separate program dealing with the actual user interface. That handling could be based on a model just a little more expressive than that of glade's widget hierarchy. Desired widget content and attributes could be derived from back-end processes written in whatever way is most appropriate at the time. Widget interactions could be transmitted back to back-end processes over a defined protocol. Perhaps Model-View-Controller isn't enough when expressed as three objects. Perhaps what is needed is two or more processes.

If a special interface is developed for speaking to this front-end process, nothing has been gained. It would be equivalent to providing the language bindings of today. What would be required is a general interfacing approach based around REST. The widget hierarchy model would specify where to get information from as URIs, and where to send notifications to as URIs. Alternatively, the model could simply leave its data open for subscription and leave it up to the other side to query and react to. The same RESTful signals and slots implementation could be used for interaction between all processes in the architecture.

My architectural vision is that each process incorporates a featherweight platform defined around RESTful communications. Which platform is chosen is irrelevant to the architecture. The fact that each platform implementation would be specific to the language or environment most suitable at the time would not be considered a problem. The features the platform implements are simply the essentials of writing all software. Specialty behaviours such as user interaction should be directed through processes that are designed to perform those functions. Linking in libraries to perform those interactions is something only a small number of processes in the system should be doing.

Web browsing is built around exactly this combination of lesscode and model-driven approaches. I think it is a template for the desktop as well.

Benjamin

Sat, 2006-Jun-17

REST XML Protocols

I would like to offer more than a "me too" in response to Kimbro Staken's 10 things to change In your thinking when building REST XML Protocols. The thrust of the article includes some useful points, but is not quite the set I would put forward. I'll crib from his list, expand, and rework.

  1. You are using REST and XML for forward and backward compatibility of communications. Don't think in terms of Object-Orientation or Relational Database Management Systems. These technologies are built around consistent modelling within a particular program or schema. REST and XML are there to give you loose coupling that will keep working even as cogs are pulled out of the overall machine and replaced. The interface between your applications is not an abstract model which can be serialised to xml. The interface is the XML and URI space. Abstract models need to be decoupled from the protocol on the wire. They can change and differ between different processes in the architecture, and they do.
  2. Schemas are not evil, but validation of input based on the schemas of today may be. From the perspective of compatibility it is important not to validate input from other systems in the way current schema languages permit. Instead, consider a schema that defines how to parse or process your XML. You can generate your parser from this little schema language so your "hard" code only needs to deal with the fully-parsed view of the input.

    Define a schema which says which xml elements and attributes you expect. Define default values for this data wherever possible. Only declare elements as mandatory when your processing literally cannot proceed without them. Don't expect your schema to match the schemas of other programs or other processing. They will have different priorities as to what is essential and which defaults they can safely assume. (There is a small sketch of this style of lenient processing after this list.)

    Validation of your own program's output is a worthwhile process, but don't expect to be able to use the same schema as is used on the other side of the connection. Your schema should describe what you expect to output. It should be used as part of your testing to pick up typos and other erroneous data output. You should expect the receiver of your data to process it based on whatever part it understands. Speaking gibberish may cause the remote to silently do much less than you expect.

  3. Consider backwards-compatibility as soon as you begin developing your XML language. Borrow terminology and syntax heavily from existing formats so that your lexicon is as widely-understood as possible. Do this even to the point of embracing and extending. It is better to introduce a small number of terms to an existing vocabulary and namespace than it is to define your own format. Be sure to push your extensions upstream.

    XML is a dangerous technology for REST thinking. The use of XML is sometimes seen as part of the definition of REST. REST is actually more at home with html, png, mp3 and other standard formats. REST requires the XML format to be understood by everyone. You are actually only approximating REST if your XML format isn't understood by every relevant computer program in the world. Consider not inventing any new XML language at all.

    New software should still be able to process requests from old software: Once you have version "1.0", you are never permitted to remove structures or terminology.

    Old software should be able to process requests from new software: Once you have version "1.0", you are never permitted to add structures you depend on being understood.

    Corollary: Never encode a language version number into your XML file format. If someone has to key off the version number to decide how to parse an XML instance document, it is no longer the same XML language. Give it another name instead, or live with the limitations of the existing format.

  4. Always think of your XML instance document as the state of a resource (i.e. an object) somewhere. GET its state from the server. PUT a new value for its state back to the server. On the server side: Define as many resources as you need to allow for needed atomic state updates. These resources can overlap, so there may be one for "sequence of commands, including whether or not they are configured to skip" and another one for "whether or not command one is configured to be skipped". Use POST to create new objects with the state clients provide.
  5. Use hyperlinks to refine down from the whole state of an object to a small part of that state. Use hyperlinks to move from one object to another. Never show a client the whole map of your URI path and query structure. That leads to coupling. You can have such a map on the server side, but clients should be focused on viewing one resource at a time. Give them a resource big enough that this works.

    Letting clients know ahead of time the set of paths they can navigate through increases coupling, especially if that information is available at compile-time. Don't fall into the trap on the client side of saying "This is a resource of type sequence of commands, so I can just add `/command1` to the end of the path and I'll be looking at the resource for the first one." Servers can and do change their minds about how this stuff is organised.

  6. Use URIs for every identifier you care about. Make sure they are usefully dereferenceable as URLs. Even if you think that an id is local to your application, one day you'll want to break up your app and put the objects and resources that form parts of your app under different authorities.
  7. Use common representations for all simple types. Don't get funky with dates and times. Don't use seconds or days since an epoch. Use xsd as your guide to how things should be represented within elements and within attributes.
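
Here is the kind of lenient processing point 2 is driving at, sketched in Python with invented element names and defaults: pull out only what you expect, apply defaults, and ignore the rest.

import xml.etree.ElementTree as ET

EXPECTED = {"amount": "0.00", "memo": ""}           # element name -> default value

def process(document):
    # Parse only the elements we expect; unknown elements are silently ignored.
    root = ET.fromstring(document)
    values = dict(EXPECTED)
    for name in EXPECTED:
        element = root.find(name)
        if element is not None and element.text is not None:
            values[name] = element.text
    return values

print(process("<entry><amount>10.00</amount><newfangled>ignored</newfangled></entry>"))
# {'amount': '10.00', 'memo': ''}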

Benjamin

Mon, 2006-Jun-12

Service Publication

On the internet, services are long-lived and live at stable addresses. In the world of IPC on a single machine, processes come and go. They also often compete for address space such as the TCP/IP or UDP/IP port ranges. We have to consider a number of special issues on the small scale, but the scale of a single machine and the scale of a small network are not that different. Ideally, a solution would be built to support the latter and thus automatically support the former. Solutions built for a single machine often don't scale up in the way solutions built for the network scale down.

So where is this tipping point between stable internet services and ad hoc IPC services? The main difference seems to be that IPC is typically run as a particular user on a host, in competition or cooperation with other users. Larger-scale systems resemble carefully constructed fortresses against intrusion and dodgy dealings. Smaller-scale systems resemble the family home, where social rather than technical measures protect one individual's services from another. These measures sometimes break down, so technical measures are still important to consider. The problem is that the same sorts of solutions that work on the large fortified scale don't work when paper-thin walls are drawn within the host. Where do you put the firewall when you are arranging protection from a user who shares your keyboard? How do you protect from spoofing when the same physical machine is the one running the service you don't trust?

For the moment let's keep things simple: I have a service. You want the service. My service registers with a well-known authority. Your client process queries the authority to find my service, then connects directly. The authority should provide a best-effort mapping from a service name to a specific IP address and port (or multiple IP address and port pairs if redundancy is desirable).

  1. The authority should allow for dynamic secure updates to track topology changes
  2. The authority should not seek to guarantee freshness of data (there is always some race condition)
  3. The service should balance things out by trying to remain stable itself

The local security problems emerge when different lifetimes are attached to the client and the service. Consider the case where a service terminates before the client software that uses it. An attacker on the local machine can attempt to snatch the just-closed port to listen for requests that may be sent to it. Those requests could then be inspected for secret information or nefariously mishandled. A client that holds stale name resolution data is at risk. Possible solutions:

  1. Never let a service that has clients terminate
  2. Use kernel-level mechanisms to reserve ports for specific non-root users
  3. Authenticate the service separately to name resolution

Despite advances in the notion of secure DNS it is the last option that is used on the internet for operations requiring any sort of trust relationship. In practice the internet is usually pretty wide open when it comes to trust. Most query operations are not authenticated. Does it matter that I might be getting information from a dodgy source? Probably not, in the context of a small network or single host's services. The chances that I could make damaging decisions based on data from a source that I can see is not secure will often be low enough not to really consider the issue further. Where real risks exist it should be straightforward to provide the information over a secure protocol that provides for two-way authentication.

So, let us for the moment assume that name resolution for small-network or ad hoc services is not a vector for direct attacks. We still need to consider denial of service. If another user is permitted to diddle the resolution tables while our services are operating normally, they can still make our life difficult. On shared hosting arrangements where we can't rule this sort of thing out, we still should ensure that only our processes are registering with our names. For this, we need to provide each of our processes a key. That key must match the key within the authority service for service additions or updates.

Applications themselves can take steps to reduce the amount of stale data floating around in the naming authority. When malice is not a problem, services should be able to look up their own name on startup. If they find records indicating they are already registered, they can attempt to listen on the port already assigned. No name update is required.

A dbus-style message router can be shown to solve secure update and stale data issues well enough for the desktop; however, DNS also fits my criteria. DNS provides the appropriate mapping of service name to IP address and port through SRV records. These records can also be used to manage client access to a machine cluster distributed across any network topology you like. Some clients will have to be upgraded to fit the SRV model. That is somewhat chicken and egg, I am afraid. DNS also supports secure Dynamic DNS Updates to keep track of changing network topologies. This feature is often coupled with DHCP servers, but it is general and standardised. If a DNS server were set up for each domain in which services can register themselves, clients should be able to refer to that DNS server to locate services.
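
As a concrete sketch of that secure dynamic update path from application code, using the third-party dnspython library; the zone, key name, secret, and server address are all placeholders, and the exact keyword arguments vary a little between dnspython versions.

import dns.query
import dns.rcode
import dns.tsig
import dns.tsigkeyring
import dns.update

# The TSIG key shared between the service and its DNS server (placeholder secret).
keyring = dns.tsigkeyring.from_text({"myservice.example.": "<base64 key shared with the DNS server>"})

update = dns.update.Update("example.", keyring=keyring,
                           keyalgorithm=dns.tsig.HMAC_MD5)
update.replace("_http._tcp.myservice", 300, "SRV", "10 0 8080 myhost.example.")

response = dns.query.tcp(update, "192.0.2.53")      # the authoritative server
print(dns.rcode.to_text(response.rcode()))          # NOERROR on success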

DNS scales up from a single host to multiple hosts, and to the size of the internet. Using the same underlying technology, it is possible to scale your system up incrementally to meet changing demands. The router-based solution is unable to achieve this, and also ends up coupling name resolution to message protocol. Ultimately, the router-based solution is neat and tidy within a particular technology sweet spot but doesn't meet the needs of more complex systems. I believe that DNS can meet those needs.

One problem that DNS in and of itself doesn't solve is activation. Activation is the concept of starting a service only when its clients start to use it. DBUS supports this, as do some related technologies. That problem can be solved in a different way, however. Consider a service whose only role is to start other services. It is run in place of any actual service with an activation requirement. When connections start to come in, it can start the real process to accept them. Granted, this means a file-descriptor handover mechanism is likely to be required. That is not a major inhibitor to the solution. Various services of this kind can develop independently to match specific activation requirements.

Ultimately, I think DNS is the right way to find services you want to communicate with both on the same machine and on the local network. Each application should be configured with one or more names they should register, a DNS server to register with, and keys to permit secure updates. If the process is already registered, it should attempt to open the same port again. If it isn't registered or is unable to open the port, it should open a new one and register that. Clients should usually look up name resolution data before each attempt to connect to the service. They should be aware that their information may occasionally be stale and be prepared to retry periodically until they succeed. Clients and services should also be ready to operate over secure protocols with two-way authentication when sensitive data or operations are being exchanged.

Benjamin

Mon, 2006-May-29

Moving Towards REST - a case study

I mentioned in my last entry that I developed my own object-oriented IPC system some years ago, and have been paying my penance since. The system had some important properties for my industry, and a great deal of code was developed around it. It isn't something you can just switch off, and isn't something you can easily replace. So how is that going? What am I doing with it?

I am lucky in my system to be working with an HMI quite decoupled from the server-side processes. The HMI is defined in terms of a set of paths that refer to back-end data, and that data is delivered and updated to the HMI as it changes. To service this HMI I developed two main interfaces. There is a query/subscribe interface and a command interface. These both work based on the path structure, so in a way I was already half-way to REST when I started to understand the importance of the approach. Now, I can't just introduce HTTP as a means of getting data around. HTTP is something the organisation has not yet had a lot of experience with, and concerns over how it will perform are a major factor. The main concern, though, is integration with our communications management system. This system informs clients of who they should be communicating with, and when. It tells them which of their redundant networks to use, and it tells them how long to keep trying.

A factor we consider very carefully in the communications architecture of our systems is of how they will behave under stressful situations. We need clients to stop communicating with dead equipment in a short period of time, however we also expect that a horrendously loaded system will continue to perform basic functions. If you have been following the theoretical discussions I have had on this blog over the last few years you'll understand that these requirements are in conflict. If A sends a message to B, and B has not responded within time t, is B dead or just loaded? Should A fail over to B's backup, or should it keep waiting?

We solve this problem by periodically testing the state of B via a management port. If the management port fails to respond, failover is initiated. If the port continues to operate, A keeps waiting. We make sure that throughout the network no more pings are sent than are absolutely required, and we ensure that the management port always responds quickly irrespective of loading. Overall this leads to a simple picture, at least until you want to try and extend your service guarantees to some other system.

So, for starters they don't understand your protocols. If they did understand them (say you offered a HTTP interface) you would have to also add support for accessing your management interfaces. Their HTTP libraries probably won't support that. So you pretty much have to live with request timeouts. Loaded systems could lead to the timeouts expiring and to failovers increasing system load. Oh well.

So the first step is definitely not to jump to HTTP. Step number one is to create a model of HTTP within the type system we have drawn up. We define an interface with a call "request". It accepts "method", "headers", and "body" parameters with identical semantics to those of HTTP. Thus, we can achieve the first decoupling benefit of actual HTTP. We can decouple protocol and document type, and begin to define new document types without changing protocol.
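
The shape of that interface is about this small. This is a sketch in Python rather than our C++, and the class and handler names are illustrative only:

from abc import ABC, abstractmethod

class MockHttpHandler(ABC):
    """Anything that can receive a mock (or real) HTTP request."""

    @abstractmethod
    def request(self, method, headers, body):
        """Return (status, headers, body) with HTTP semantics."""

class SkipCommandHandler(MockHttpHandler):
    """An illustrative command target that inspects the content type itself."""

    def __init__(self):
        self.skipped = False

    def request(self, method, headers, body):
        # body is assumed to be bytes; headers a plain dict.
        if method != "PUT":
            return 405, {"Allow": "PUT"}, b""
        if headers.get("Content-Type") != "text/plain":
            return 415, {}, b""                      # document type not understood
        self.skipped = (body.strip() == b"true")
        return 200, {}, b""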

I changed requests over our command interface to now go over our mock HTTP. This means it will be straightforward in the future to plug actual HTTP into our applications either directly or as a front-end process for other systems to access. I added an extended interface to objects that receive commands so that they can have full access to the underlying mock or actual HTTP request if they so choose. They will be able to handle multiple content types by checking the content-type header. Since the change, our objects are not tied to our protocol. Their main concern is of document type and content, as well as the request method that indicates what a client wants done with the document. We can change protocol as needed to support our ongoing requirements.

Step two is to decouple name resolution from protocol. We had already done that effectively in our system. Messages are not routed through a central process. Connections are made from point to point. Any routing is done at the IP level only. Easy. So we connect our name system to DNS and other standard name resolution mechanisms. We start providing information through our management system not only of services under our management, but also of services under DNS management only. The intention is that over time the two systems are brought closer and closer together. One day we will have only one domain name system, and we have a little while between then and now to think about how that unified system will relate to our current communications management models.

Alongside these changes we begin bringing in URL capabilities, and mapping our paths onto the URL space. We look up the authority through our management system, and pass the path on to whomever we are directed to connect to. Great! We can even put DNS names in, which is especially useful when we want to direct a client to speak to localhost. Localhost does not require a management system, which is what makes IPC simpler than a distributed comms system. There is no hardware to fail that doesn't affect us both. We can direct our clients to look at a service called "foo.bar", or use the same configuration file to direct our client to "localhost:1234". The extended support comes for free on the client side.

As the cumulative benefits of working within a RESTful model start to pile up, we are moving the functionality of other interfaces onto the command interface. As more functionality is exposed, more processes can get at that functionality easily without having to write extra code to do so. That is lesscode at its finest. Instead of building complex type-strict libraries for dealing with network communications, we just agree on something simple everywhere. We don't need to define a lot of types and interfaces. We just need the one. Based on an architectural decision, we have been able to get more cross-domain functionality for less work.

So, what is next? I am not a believer in making architectural changes for the sake of making them. I do not think that polishing the bed knobs is a valuable way for a software developer to spend his or her time. We must deliver functionality, and where the price of doing things right is the same or cheaper than the price of doing things easily we take the opportunity to make things better. We take the opportunity to make the next piece of work cheaper and easier too. Over time I hope to move more and more functionality to the command interface. I hope to add that HTTP front-end, and perhaps integrate it into the core in the near to medium term future. I especially hope to provide simple mechanisms for our system to communicate with other systems using an architecture based on document transfer. Subscription will be in there somewhere, too.

The challenge going forward will be riding that balance between maintaining our service obligations, and making things simple to work with and standard. The obligations offered in my industry are quite different to those offered on the web, so hard decisions will need to be made. Overall, the change from proprietary to open and from object-oriented to RESTful will make those challenges worth overcoming.

Benjamin

Sun, 2006-May-28

Communication on the Local Scale (DBUS)

There is a divide in today's computing world between the small scale and the large scale. The technologies of the internet and the desktop are different. Perhaps the problem domains themselves are different, but I don't think so. I think that the desktop has failed to learn the lessons of the web. SOAP is an example of that desktop mindset trying to overcome and overtake the web. One example of the desktop mindset overcoming and overtaking the open source desktop is the emerging technology choice of DBUS.

DBUS is an IPC technology. Its function is to allow processes on the same machine to communicate. Its approach is to expose object-oriented interfaces through a centralised daemon process that performs message routing. The approach is modelled after some preexisting IPC mechanisms, and like the mechanisms it is modelled after, it gets things wrong on several fronts:

  1. DBUS does not separate its document format from its protocol
  2. DBUS pushes the object-oriented model into the interprocess-compatibility space
  3. DBUS does not have a mapping onto the url space
  4. DBUS does not separate name resolution from routing

From a RESTful perspective, DBUS is a potential disaster. I know it was (initially at least) targeted at a pretty small problem domain shared by kde and gnome applications, but the reason I feel strongly about this is that I have gone down this road myself before. I'm concerned that dbus will come to be considered a kind of standard interprocess communications system, and that it will lock open source into an inappropriate technology choice for the next five or ten years. I'll get to my experiences further down the page. In the meantime, let's take those criticisms on one by one. To someone from the object-oriented world the approach appears to be pretty near optimal, so why does a REST practitioner see things so differently?

Decoupling of Document Format from Protocol

Protocols come and go. Documents come and go. When you tie the two together, you harm the lifecycle of both. DBUS defines a document format around method calls to remote objects. There have been protocols in the past that handled this, and there will be in the future. There are probably reasons that DBUS chose to produce its own protocol for function parameter transmission. Maybe they were even good ones. The important thing for the long-term durability of DBUS is that there should be some consideration for future formats and how they should be communicated.

Objects for interprocess communication

The objects in an object-oriented program work for the same reason that the tables within an SQL database work: They make up a consistent whole. They do so at a single point in time, and with a single requirements baseline. When requirements change, the object system changes in step. New classes are created. Old classes are retooled or retired. The meaning of a particular type within the object system is unambiguous. It neither has to be forwards nor backwards compatible. It must simply be compatible with the rest of the objects in the program.

Cracks start to emerge when objects are used to build a program from parts. Previous technologies such as windows DLLs and COM demonstrate that it is hard to use the object and type abstraction for compatibility. A new version of a COM object can add new capabilities, but must still support the operations that the previous version supported. It indicates this compatibility by actually defining two types. The old type remains for backwards-compatibility, and a new one inherits from the old. A sequence of functionality advances results in a sequence of types, each inheriting from the previous one.

This in and of itself is not a bad thing. Different object systems are likely to intersect with only one of the interface versions at a time. The problem is perhaps deeper within object-orientation itself. Objects are fundamentally an abstraction away from data structures. Instead of dealing with complex data structures everywhere in your program, you define all of the things you would like to do with the data structure and call it a type. The type and data structure can vary independently, thus decoupling different parts of the application from each other. The trouble is that when we talk about object-oriented type, we must conceive of every possible use of our data. We must anticipate all of the things people may want to do with it.

Within a single program we can anticipate it. Between programs with the same requirements baseline, we can anticipate it. Between different programs from different organisations and conflicting or divergent interests, the ability to anticipate all possible uses becomes a god-like requirement. Instead, we must provide data that is retoolable. Object-orientation is built around the wrong design principle for this environment. The proven model of today is that of the web server and the web browser. Data should be transmitted in a structure that is canonical and easy to work with. If it is parsable, then it can be parsed into the data structures of the remote side of your socket connection and reused for any appropriate purpose.

Mapping to URL

DBus addresses are made up of many moving parts. A client has to individually supply a bus name (which it seems can usually be omitted), a path to an object, a valid type name that the object implements, and a method to call. These end up coded as individual strings passed individually to dbus methods by client code. The actual decomposition of these items is really a matter for the server. The client should be able to refer unambiguously to a single string to get to a single object. The web does this nicely. You still supply a separate method, but identifying a resource is simple. http://soundadvice.id.au/ refers to the root object at my web server. All resources have the same type, so you know that you can issue a GET to this object. The worst thing that can happen is that my resource tells you it doesn't know what you are talking about: That it doesn't support the GET method.

Let's take an example DBUS address: (org.freedesktop.TextEditor, /org/freedesktop/TextEditor, org.freedesktop.TextEditor). We could represent that as a url something like <dbus://TextEditor.freedesktop.org/org/freedesktop/TextEditor;org.freedesktop.TextEditor>. It's a mouthful, but it is a single mouthful that can be read from configuration and be passed through verbatim to the dbus system. If you only dealt with the org.freedesktop.TextEditor interface, you might be able to shorten that to <dbus://TextEditor.freedesktop.org/org/freedesktop/TextEditor>.
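
A toy rendering of that mapping; the dbus: scheme, the reversed authority, and the ";interface" suffix are all speculative, but they show how the moving parts could collapse into one string:

def dbus_url(bus_name, object_path, interface=None):
    # Reverse the bus name into host-like order: org.freedesktop.TextEditor
    # becomes TextEditor.freedesktop.org.
    authority = ".".join(reversed(bus_name.split(".")))
    url = "dbus://%s%s" % (authority, object_path)
    if interface is not None:
        url = "%s;%s" % (url, interface)
    return url

print(dbus_url("org.freedesktop.TextEditor", "/org/freedesktop/TextEditor",
               "org.freedesktop.TextEditor"))
# dbus://TextEditor.freedesktop.org/org/freedesktop/TextEditor;org.freedesktop.TextEditor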

There are still a couple of issues with that URL structure. The first is the redundancy in the path segment. That is obviously a design decision to allow different libraries within an application to register paths independently. A more appropriate mechanism might have been to pass base paths into those libraries for use when registering, but that is really neither here nor there. The other problem is with the authority.

Earlier versions of the URI specification allowed for any naming authority to be used in the authority segment of the URL. These days we hold to DNS being something of the one true namespace. As such, we obviously don't want to go to the TextEditor at freedesktop.org. It is more likely that we want to visit something attached to localhost, and something we own. One way to write it that permits free identification of remote dbus destinations might be: <dbus://TextEditor.freedesktop.org.benc.localhost/org/freedesktop/TextEditor>. That URL identifies that it is a local process of some kind, and one within my own personal domain. What is still missing here is a name resolution mechanism to work with. We could route things through dbus, but an alternative would be to make direct connections. For that we would need to be able to resolve an IP address and port from the authority, and that leads into the routing vs name resolution issue.

Routing vs Name Resolution

The easiest way to control name resolution is to route messages. Clients send messages to you, and you deliver them as appropriate. This only works, of course, if you speak all of the protocols your clients want to speak. What if a client wanted to replace dbus with the commodity interface of HTTP? If we decoupled name resolution and routing, clients that know how to resolve the name can speak any appropriate protocol to that name. The dbus resolution system could be reused, even though we had stopped using the dbus protocol.

Consider an implementation of getaddrinfo(3) that resolved dbus names to a list of IP address and port number pairs. There would be no need to route messages through the dbus daemon. Broadcasts could be explicitly transmitted to a broadcast service, and would need no special identification. They could simply be a standard call which is repeated to a set of registered listeners.
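A toy resolver in that spirit might look like the following Python sketch. The dbus-style name and the registry entries are invented for illustration; anything the resolver doesn't recognise falls back to ordinary DNS.

  import socket

  # Hypothetical entries a local name service might hold.
  _LOCAL_REGISTRY = {
      "TextEditor.freedesktop.org.benc.localhost": [("127.0.0.1", 7001)],
  }

  def resolve(name, port=80):
      """Return (ip, port) pairs for a name, getaddrinfo(3)-style."""
      if name in _LOCAL_REGISTRY:
          return _LOCAL_REGISTRY[name]
      # Fall back to ordinary DNS resolution for everything else.
      infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
      return [info[4][:2] for info in infos]

  print(resolve("TextEditor.freedesktop.org.benc.localhost"))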

Separating name resolution from routing would permit the name resolution system to survive beyond the lifetime of any individual protocol or document type. Consider DNS. It has seen many protocols come and go. We have started to settle on a few we really like, but DNS has kept us going through the whole set.

Coupling and Decoupling

There are some things that should be coupled, and some that should not. In software we often find the balance only after a great deal of trial and error. The current state of the web indicates that name resolution, document transfer protocol, and document types should all be decoupled from each other. They should be loosely tied back to each other through a uniform resource locator. The web is a proven, successful model; however, experimental technologies like CORBA, DBUS, and SOAP have not yet settled on the same system. They couple name resolution to protocol to document type, and then throw on an inappropriate object-oriented type model to ensure compatibility is not maintainable in the long term and on the large scale. It's not stupid. I made the same mistakes when I was fresh out of university at the height of the object-oriented frenzies of the late 90's.

I developed a system of tightly-coupled name resolution, protocol, and document type for my employer. It was tailored to my industry, and so had and continues to have important properties that allow clients to achieve their service obligations across all kinds of process, host, and network failures. What I thought was important in such a system back then was the ability to define new interfaces (new O-O types) easily within that environment.

As the number of interfaces grew, I found myself implementing adaptor after adaptor back into a single interfacing system for the use of our HMI. It had a simpler philosophy. You provide a path. That path selects some data. The data gets put onto the HMI where the path was configured, and it keeps updating as new changes come in.

What I discovered over the years, and I suppose always knew, is that the simple system of paths and universal identifiers was what produced the most value across applications. The more that could be put onto the interfaces that system could access, the easier everything was to maintain and to "mash up". What I started out thinking of as a legacy component of our system written in outdated C code turned out to be a valuable architectural pointer to what had made previous revisions of the system work over several decades of significant back-end change.

It turns out that what you really want to do most of the time is this:

  1. Identify a piece of data
  2. GET or SUBSCRIBE to that data
  3. Put it onto the screen, or otherwise process it
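A minimal sketch of that loop in Python, with made-up resource URLs and a plain GET standing in for a proper subscription; a real HMI would keep receiving updates rather than polling, but the shape of the code does not change much.

  from urllib.request import urlopen

  # Configuration, not code: each screen element is just an identifier.
  CONFIGURED_PATHS = [
      "http://localhost:8000/plant/pump1/pressure",   # hypothetical resources
      "http://localhost:8000/plant/tank3/level",
  ]

  def refresh(display):
      for url in CONFIGURED_PATHS:          # 1. identify a piece of data
          with urlopen(url) as response:    # 2. GET that data
              document = response.read().decode("utf-8")
          display(url, document)            # 3. put it onto the screen

  refresh(lambda url, doc: print(f"{url}: {doc}"))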

The identity of different pieces of data is part of your configuration. You don't want to change code every time you identify a new piece or a new type of data. You don't want to define too many ways of getting at or subscribing to data. You may still want to provide a number of ways to operate on data and you can do that with various methods and document types to communicate different intents for transforming server-side state.

What the web has shown and continues to show is that the number one thing you want to do in any complex system is get at data and keep getting at changes to that data. You only occasionally want to change the state of some other distributed object, and when you do you are prepared to pay a slightly higher cost to achieve that transformation.

Benjamin

Tue, 2006-Apr-11

Namespaces and Community-driven Protocol Development

We have heard an anti-namespace buzz on the internet for years, especially regarding namespaces in XML. Namespaces make processing of documents more complicated. If you are working in a modular language you will find yourself inevitably trapped between the long names and the QNames and having to preserve both. If you use something like XSLT you will find yourself having to be extra careful to ensure you select elements from the right namespace, especially in any XPath expressions. It isn't possible in XPath to refer to an element that exists within the default namespace of the XSLT document. It must be given an explicit QName.

Another hiccup comes about when working with RDF. It would be easy to produce compact RDF documents if one could conveniently use XML element attributes to convey simple literals. One thing that makes this more difficult is that while XML document elements automatically inherit a default namespace, attributes get the null namespace. RDF uses namespaces extensively, so you will always find yourself filling out duplicate prefixes for attributes in what would otherwise be quite straightforward documents. This makes it difficult to both define a sensible XML format and to make it "RDF-compatible".
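The attribute quirk is easy to see outside of RDF. In the Python sketch below, with an invented namespace URI and document, the element picks up the default namespace but the unprefixed attribute ends up in no namespace at all, and even a simple lookup needs an explicit prefix-to-URI mapping.

  import xml.etree.ElementTree as ET

  doc = """<contact xmlns="http://example.org/addressbook" nickname="fuzzy">
             <name>Benjamin</name>
           </contact>"""

  root = ET.fromstring(doc)

  print(root.tag)     # {http://example.org/addressbook}contact
  print(root.attrib)  # {'nickname': 'fuzzy'} -- no namespace on the attribute

  # Even finding a child element demands an explicit prefix mapping.
  ns = {"ab": "http://example.org/addressbook"}
  print(root.find("ab:name", ns).text)  # Benjamin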

A new argument for me against the use of namespaces in some circumstances comes from Mark Nottingham's recent article on protocol extensibility. He argues that the use of namespaces in protocols has a social effect, and that the effect leads to incompatibility in the long term. He combines this discussion with what he paints as the inevitable futility of "must understand" semantics.

Protocol development is fundamentally a social rather than a technical problem. In a protocol situation all parties must agree on a basic message structure as well as the meaning of a large enough subset of the terms and features included to get useful things done. A server and client must broadly agree on what HTTP's GET method means, and intermediaries must also have a good idea. In HTML we need to agree that <p> is a paragraph marker rather than a punctuation mark. These decisions can be made top-down, but without the user community's support such decisions will be ignored. Decisions can be made from the bottom up, but at some stage coordinated agreement will be required. Namespaces provide a technical solution to a social problem by allowing multiple definitions of the same term to be differentiated and thus to interoperate. Mark writes:

What I found interesting about HTML extensibility was that namespaces weren't necessary; Netscape added blink, MSFT added marquee, and so forth.

I'd put forth that having namespaces in HTML from the start would have had the effect of legitimising and institutionalising the differences between different browsers, instead of (eventually) converging on the same solution, as we (mostly) see today, at least at the element/attribute level.

HTML does have a scarce resource, in that the space of possible element and attribute names is flat; that requires some level of coordination within the community, if only to avoid conflicts.

Dan Connolly is writing obliquely on the same subject. He is also concerned about the universe without namespaces, but his main concern is that protocol development decisions get adequate oversight before deployment. Dan writes:

We particularly encourage [uri-based namespaces] for XML vocabularies... But while making up a URI is pretty straightforward, it's more trouble than not bothering at all. And people usually don't do any more work than they have to.

There is a time and a place for just using short strings, but since short strings are scarce resources shared by the global community, fair and open processes should be used to manage them. Witness TCP/IP ports, HTML element names, Unicode characters, and domain names and trademarks -- different processes, with different escalation and enforcement mechanisms, but all accepted as fair by the global community, more or less, I think.

Both Dan and Mark end up covering the IETF convention of snubbing namespaces, but using an "x-" prefix to indicate that a particular protocol term is experimental rather than standard. It is Dan who comes down the hardest on this approach, citing the "application/x-www-form-urlencoded" mime type as a term that became entrenched in working code before it stopped being experimental. It can't be fixed without breaking backwards-compatibility, and there doesn't seem to be a good reason to go about fixing it.

Both Mark and Dan have good credentials and are backed up by good sources, so who is right? I think they both are, but at different stages in the protocol development cycle.

So let's say that the centralised committee-based protocol development model is a historical dinosaur. We no longer try to make top-down decisions and produce thousands of pages of unused technical documentation. So now how do new terms and new features get adopted into protocols and into document types? It seems that the right way is to follow this process:

Mark suggests that using namespaces within a protocol may unhelpfully encourage communities to avoid that third step. The constraints of a short-string world would force them to interoperate and to engage one another on one level or another, and would not produce a result of "microsoft-this" and "netscape-that" littered throughout the final HTML document. Using short strings produced a cleaner protocol definition in the end for both HTTP and HTML, and forced compromises onto everyone in the interests of interoperability. If opposing camps are given infinite namespaces to work with they may tend towards divergent competing protocols (e.g. RSS and Atom) rather than coming back to the fold and working for a wider common good (HTML).

Dan criticises Google's rel-nofollow in his article, saying:

Google is sufficiently influential that they form a critical mass for deploying these things all by themselves. While Google enjoys a good reputation these days, and the community isn't complaining much, I don't think what they're doing is fair. Other companies with similarly influential positions used to play this game with HTML element names, and I think the community is decided that it's not fair or even much fun.

I think that Google has taken a less community-minded approach than they might have done. Technorati is also criticised for rel-tag. Both relationship types started with a single company wanting to have a new feature, and there is foundation for criticism on both fronts. Both incidents appear to have developed in a dictatorial fashion rather than by engaging a community of existing expertise. Technorati's penance was to blossom into the microformats community, a consensus-based approach with reasonable process for ensuring work is not wasted.

HTML classes are a limited community resource, just as HTML tags are. This resource has traditionally been defined within a single web site without wider consideration. Context disambiguated the class names, as only the CSS and JavaScript files associated with a single site would use the definitions from that site. Microformats and the wider semantic HTML world have recently taken up this slack in the HTML specification and are busy defining meanings that can be used across sites. The HTML elements list is not expanding, because that is primarily about document structure. HTML classes are treated differently. They are given semantic importance. Communities like microformats will spend the next five years or so coming up with standard HTML class names, and will do the same with link types. They will be based on existing implementation and implied schema, and will attempt not to splinter themselves into namespaces. Other communities will develop, and may collide with the microformats world. At those times there will be a need for compromise.

We are headed into a world of increasingly rich semantics on the web, and the right way to do it seems to be without namespaces. Individuals, groups and organisations will continue to develop their own semantics where appropriate. Collisions and mergers will happen in natural ways. The role of standards bodies will be to oversee and shape emerging spheres of influence in as organic a way as possible, and to document the results of putting them through their paces.

Benjamin

Tue, 2006-Apr-04

Low and High REST

There has been a bit of chatter as of late about low and high REST variants. Lesscode blames Nelson Minar's Etech 2005 presentation for the distinction between REST styles. It pretty much amounts to the read-only web versus the read-write web, or possibly the web we know works versus the web as it was meant to work (and may still do so in the future).

The idea is that using GET consistently and correctly can be called "low" REST. It fits the REST model and works pretty well with the way information is produced and consumed on the web of today. Using other verbs correctly, especially the other formally-defined HTTP verbs, is "high" REST. The meme has been spreading like wildfire and lesscode has carried some interesting discussion on the concept.

Robert Sayre notes that the GET/POST/PUT/DELETE verbs aren't used in any real-world applications. He says that low REST might be standardising what is known to work, but high REST is still an untested model. Ian Bicking calls the emphasis on using verbs other than POST to modify server-side state a cargo cult.

It is useful to look back at Fielding's Dissertation, in which he doesn't talk about any HTTP method except for GET. He assumes the existence of other "standard" methods, but does not go into detail about them.

I think Ian is hitting on an uncomfortable truth, or at least a half-truth. Intermediaries don't much care whether you use POST, DELETE, or PUT to mutate server state. They treat the requests in similar ways. If you were to use WebDAV operations you would probably find the proxies again treating the operations the same way as if you had used POST. Architecturally speaking, it does not matter which method you use to perform the mutation. It only matters that the client, intermediaries, and the server are all of the understanding that mutation is occurring.

Even that constraint needs some defence. Resource state can overlap, so mutating a single resource state in a single operation can in fact alter several resources. Neither client nor intermediary is aware of this knock-on effect. The only reason that clients really need to know whether mutation is happening is so that machines can determine whether they can safely make a request without their user's permission. Can a link be followed for precaching purposes? Can a request be retried without changing its meaning?

Personally I am a fan of mapping the operations DELETE to cut, GET to copy, PUT to paste over, and POST to paste after. I know that others like to map the operations to the database CRUD model: POST to create, GET to retrieve, PUT to update, and DELETE to delete. It amounts to the same thing, except that the cut-and-paste view steers us more firmly away from record-based updates and into the world of freeform stuff-to-stuff and this-to-that data flows. Viewing the web as a document transfer system makes other architectures simpler, and makes them possible.
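In HTTP terms the two views use exactly the same four methods. A small Python sketch against a hypothetical resource (the requests are built here but not sent):

  from urllib.request import Request

  RESOURCE = "http://localhost:8000/notes/today"  # hypothetical resource

  copy = Request(RESOURCE, method="GET")                             # copy / retrieve
  paste_over = Request(RESOURCE, data=b"new text", method="PUT")     # paste over / update
  paste_after = Request(RESOURCE, data=b"new entry", method="POST")  # paste after / create
  cut = Request(RESOURCE, method="DELETE")                           # cut / delete

  for request in (copy, paste_over, paste_after, cut):
      print(request.get_method(), request.full_url)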

I have mentioned before that I don't think the methods should end at that. There are specialty domains such as subscription over HTTP that seem to demand a wider set of verbs. Mapping to an object-oriented world can also indicate more verbs should be used, at least until the underlying objects can be retooled for easier access through HTTP. Robert Sayre points at this too, but I think he is a little off the mark in his thinking. I think that limiting the methods in operation on the internet is a bad thing, however limiting the methods a particular service demands clients use is a good thing. Every corner will have its quirks. Every corner will start from a position of many unnecessary SOA-style methods before really settling into the way the web really handles things. It is important for the internet to tolerate the variety while encouraging a gradual approach to uniformity.

We should have some kind of awareness of what methods we are using because it helps us exercise the principle of least power. It helps us decouple client from server by reducing client requests to things like "store this document at this location" or "update that document you have with the one I have". By moving towards less powerful and less specific methods as well as less powerful and less specific document types, we reduce the specific expectations a client will have of its server. Sometimes it is necessary to be specific, and that should be supported. However, it is a useful exercise to see how general a request could be and still fulfil the same role.

My issue with using POST for everything is that what we really often mean is that we are tunnelling everything through POST. I see it as important that the operations we perform are visible at the HTTP protocol level so that they can be handled in a uniform way by firewalls and toolkits and intermediaries of all kinds. Information about what the request is has to be encoded into either the method or the URI itself, or we are just forcing our intermediaries to interrogate another level of abstraction as they operate.

You could take this discussion and use it to support making POST a general "mutate" method. If one mutation operation applies to a single URI then it makes sense to use a very general mutation method. In this case we are encoding information about what the operation is into the URI itself rather than selecting the mutation by the method of our request. Instead of tunnelling a variety of possible operations through POST, it is the URI that carries the information. Since that is managed by the server side of the request, that is really the best possible outcome. It is only when multiple methods apply to a single URI that we need to carefully consider methods other than POST and ensure that appropriate methods can be used even if they haven't been standardised. Future-proofing of the URI space may dictate the use of the most appropriate method available. Unfortunately, existing toolkits and standards push POST as the only method available.

In my view a client or intermediary that doesn't understand a method it is given to work with should always treat it as if it were POST. That is a safe assumption as to how much damage it could do and what to expect of its results. That assumption would allow experimentation with new methods through HTTP without toolkit confusion. I am not a supporter of POST tunnelling, and believe generally that it is the lack of support for unknown methods in specifications and in toolkits that makes tunnelling necessary and thus successful on the internet of today.
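That rule is simple enough to write down. A sketch of how a toolkit or intermediary could classify methods, treating anything it doesn't recognise as cautiously as POST:

  # Methods known to be safe or idempotent get special handling; anything
  # unrecognised falls through to POST-like treatment.
  SAFE_METHODS = {"GET", "HEAD"}
  IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE"}

  def may_prefetch(method):
      """Safe methods can be issued without the user's say-so, e.g. for precaching."""
      return method.upper() in SAFE_METHODS

  def may_retry(method):
      """Idempotent methods can be repeated without changing their meaning."""
      return method.upper() in IDEMPOTENT_METHODS

  for method in ("GET", "PUT", "SUBSCRIBE", "POST"):
      print(method, may_prefetch(method), may_retry(method))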

Benjamin

Mon, 2006-Apr-03

Linux Australia Update Episode #16

As many of you will know, I participated in a telephone interview some weeks back for James Purser's Linux Australia podcast. With his permission, here is a transcript of that interview. Any errors or inaccuracies are my own. If you would like to refer to the original interview please see James' podcasts page. Questions asked by James are prefixed with "JP". Other text is myself speaking.

The major change that has occurred in the project since this podcast is that Efficient Software has dropped the concept of a central clearing house for open source software development funding. We now encourage individual projects to follow an efficient software model by accepting funds directly from users. See the Efficient Software project page for more details about what Efficient Software is, and where it is going.

JP: Ok, getting right into our interview with Benjamin here he is explaining what Efficient Software is:

Well, Efficient Software is a project idea that I have had for a few years but it has developed a lot since the AJ Market came online.

JP: Just before we go on, I'm sorry. For those who are listening who don't know what the AJ Market is, can you tell us what AJ has been doing there?

Anthony Towns is a Debian developer. He has opened up a market on his own website where you can submit money to him and it will drive (theoretically) the development he is doing in that week. So it is a motivating factor for him to get some money in his hands, and is also a way for the community to drive him into particular directions as to how he spends his time. He publicised that on his blog and on planet linux australia, so that spurred things along a bit. So this project is similar, a similar mindset. I would like to iron out some of the reasons why users would contribute and look forward a bit as to where we will be in ten years time or twenty years time when closed source software is possibly a less dominant force in the market.

So at the moment it is very little. It is a mailing list. It is an irc channel. We are in a phase of the project where we are just discussing things. We are talking about what this world will look like twenty years out from now.

JP: Ok, well let us in. What is this kind of world view you have got that is behind Efficient Software?

If we imagine a world where open source software has "won the battle", that it's freedom everywhere and everybody can get at the source code and free software is developed. You have to ask the question: Who is it developed by? We have pretty good answers to that at the moment. We have people who are employed by open source friendly companies. We have people who have a day job and they spend their weekends and free time doing software development for free software projects. They have the motivations and the itches to scratch to contribute. But there is a conflict, a fundamental conflict when you have people who are working part time, especially on software development. They have time and money in conflict. They need to earn money, they need to have a day job in order to have the free time to spend on open source. The idea is broadly that we want to be able to fund people as much as we can and we want to fund them as directly as we can potentially from the community itself. When you take out the closed software world a big segment of the actual day job marketplace disappears for software developers.

JP: Yeah. That would be say 80-90% of the employers of software developers, wouldn't it?

Yeah. So we can look forward and see this potential conflict approaching where open source adoption slows down because nobody is willing to give up their day job. They are afraid of contributing because they may want to keep their jobs. What I'm really looking to is how to solve that conflict between the time you want to be able to spend on open source and the money by aligning those two things. Being able to get a direct funding of the developers.

JP: Cool. So what would you be setting up with Efficient Software? What is the current sort of model you are looking at?

Well, I have a strawman and this is preliminary and this is mostly my thinking. I am eager to take on board comments and consider even radical alternatives to this. What I'm currently laying out is a kind of eBay-style website that essentially becomes a marketplace, a central clearing house for enhancements to open source software. The idea is that customers, or users, however you want to think of them... investors, because investment and contributing are really the same thing in open source. If the customers find particular enhancements they want, they will be modelled as bugs in a bugzilla database. They will have a definite start and a conclusion, and a close-out and verification process associated with them. The idea is that the community (that could include individuals and businesses or whatever interests there are that support that particular project) can contribute money to a pool that will be paid to the project once the bug is verified or once that release cycle is complete. So there is a clear motivation there to contribute, hopefully, to the project. You are going to get your bug fixed at a low cost. You can put in a small amount of money and hopefully other people will put in small amounts of money also, and it will build into a big enough sum to make it appetising to go and fix the bug. Then maybe the developer can still pay their bills that week. There is a motivation as well from the developer's side to go for the biggest paying bugs and to try and meet the market expectations of the software.

JP: Have you considered the other systems that we have currently got for paying open source developers, which is say Redhat's and any of the corporate Linuxes where you pay for support, or Sourceforge's donation system where you can go and actually donate to any of the projects?

I think that is a very interesting sort of case study. If you look at a business that is selling support, one of the interesting things I have found in open source development is that often the support you can get for a fee is not as good as what you can get for free. It doesn't have the same kind of wide customer base that a genuine open source project has. In an open source world people contribute not only a bit of code, but they will also contribute a bit of time by helping other people out. The reason they do that is that they get a lot of support in response to that. They can put a small amount of investment in and get a great yield off of that. Commercial support at the moment is good for making businesses feel very comfortable about their linux investments. You can buy a Redhat CD and install it on your machine and you have your support for that particular machine, and if you want ten machines you buy ten support agreements. It is very much the closed source software development model in that costs in developing the CD are returned after the CD is produced. They also have the support mechanisms in there, which is useful and will probably still be an important part of the business, the economy of open source going forwards.

Sourceforge is another interesting one where they have opened it up for donations, and that has happened fairly recently. Over the last twelve months, I think. Any community member can contribute to a particular project that they like. My fundamental concern with that is that there is no real economic incentive to do that. There are two reasons I can think of economically to contribute to an open source project through that sort of model. One is that you think it is on the ropes and you want to keep it ticking along so that the investments you have made already will continue to bear fruit as more people put contributions into that particular product. Also there is a sort of "feel good" factor. You might like the product and want to reward the developers. In that sort of situation it is very difficult to determine exactly how much you should actually put towards the project. It goes back to recouping costs after the development has taken place, and ideally we would like to be able to pay the developer as they develop the code rather than come along several weeks or months later and say "I like what you have done, here is a thousand bucks". I am interested in trying to find an economic basis for that relationship to exist between the customer and the producers of the software.

JP: As you mentioned before you have blogged a fair bit about Efficient Software. Including a discussion you had at HUMBUG at the last meeting. What has the response been like?

It has been very interesting. So far it has been fairly minimal, but at HUMBUG we had a really good discussion about basically the preliminary scoping of the exercise of the whole project. We got to talk through the issues of how you get from some other business model, some greengrocer or a certain internet search engine, and how you get that money feeding to an open source software developer to pay for their day job. We just started to map out the whole industry and work out where the money is coming from and how we can get as direct and as efficient a flow of money to the developer as we can, one that will reward the developer for meeting real customer expectations. We discussed a lot of other issues as well and I blogged about them and you can read that on http://planet.linux.org.au/ or my blog at http://soundadvice.id.au/. We have just gone through some preliminary scoping and we are still in very much a discussion phase about efficient software. What I put forward is a strawman, and it is not really intended to be the final model. I think there are some pros and cons which we should really work through and compare to other businesses in a much more detailed way.

JP: So if this were a software project you would really say you were in the Alpha stage of development?

Yeah, absolutely. It is all new. It is all fresh. I don't know if it will fly. I think there are reasons to believe it will succeed. I think there are economic reasons to think that open source software will always be more efficient and cost-effective than closed source software. Particularly due to the forking capability of open source there is a very low barrier to entry. If I want to provide a whole new infrastructure for running the project I can just take your source code and I can run with it. If I am more efficient, if I am better at it than you, then I will be the one left standing at the end. Most of the time what happens is that projects collapse back because there isn't enough of a customer base to really draw from. That includes time and skills of developers, of people supporting other people who are using the product. Ultimately the Efficient Software goal is just to extend that so that money is another thing that your community can provide in an open source sort of way. As they have an itch they can provide either their time, their skills, or some portion of money. I think as we move to less technical arenas in open source... open source really started in operating systems and tools and things. It has expanded a long way from that. We are getting to things like Gnome which is really meant for the average desktop user. The average desktop user doesn't necessarily have the time or the skills to really put forward to code development. I think there is an untapped supply of funding for developers who are willing to take on that relationship with that sort of community: a community which is less technical and is more interested in their own way of life and their own day jobs.

JP: Do you see Efficient Software being able to benefit the whole range of projects, from your single man developer to your Apache, Gnome, or Linux kernel?

That's really my picture of where this will go. I think there is a barrier as to where this can penetrate, but the barrier is really about whether the software is mass-market and whether it has mass appeal. Those are the same sorts of barriers as open source hits anyway. I think this will most benefit non-technical or less-technical community bases and will probably have a lesser impact on things like the Linux kernel, where no desktop user will necessarily have a specific bug they want addressed. There may be some flow through. Say Gnome were taking on this model, and they were acquiring a reasonable funding base from their community, which is a more non-technical community; they may have a reason to reinvest in the Linux kernel itself, whether that be reinvesting of developer time and resources, or whether that indeed went upstream as money. As we reach out to less technical fields you will see a more money-oriented open source development, and as we move to the more technical areas then it will be people who have the time and skills and are pushing those time and skills upstream.

Benjamin

Thu, 2006-Mar-30

Desktop Identifiers

The Resource Description Framework (RDF) is based around the use of Uniform Resource Identifier (URI) references. An RDF statement identifies a subject, an object, and a predicate. The subject is a URI that says what the statement is about. The object is a URI or a literal value. The predicate is a URI that identifies the relationship between the subject and object. A collection of statements forms a graph, and this graph is sufficient to describe a logical model of anything and everything. Anything that understands the meaning of the whole set of predicates can understand the whole graph. Anything that understands a subset of the predicates will understand a corresponding subset of the graph. Different parts of the graph can be controlled by different agencies, so long as each identifier used in the graph is unique. The uniqueness of identifiers is the cornerstone of making the system work.
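A few plain Python tuples are enough to show the shape of this, and why partial understanding still works. The URIs below are invented for illustration; an application that knows only some predicates simply ignores the statements it does not understand.

  # (subject, predicate, object) triples; all URIs are made up.
  graph = [
      ("http://example.org/txn/31", "http://example.org/vocab#amount", "199.95"),
      ("http://example.org/txn/31", "http://example.org/vocab#account", "http://example.org/acct/12"),
      ("http://example.org/txn/31", "http://other.example/vocab#memo", "groceries"),
  ]

  # This application understands only two predicates, so it understands
  # only the corresponding subset of the graph.
  understood = {
      "http://example.org/vocab#amount",
      "http://example.org/vocab#account",
  }

  for subject, predicate, obj in graph:
      if predicate in understood:
          print(subject, predicate, obj)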

The deep dark secret of URIs is that they are hard to come up with. The problem of a single URI having multiple meanings has been reasonably well canvassed, but the initial URI selection is still a difficult problem. What is the correct URI to use for the ISO 4217 currency symbol AUD (AUstralian Dollar)? Should iso4217 be used as a scheme to make iso4217:AUD the URI? Is the scheme just "iso", and the URI iso:4217:AUD? Do we trust OASIS and use urn:oasis:ubl:codeList:ISO4217:Currency%20Code:3:5:ISO::AUD? How about MDDL, or www.xe.com?

Scaling down a touch, which URI do I use to identify a record in my email agent's address book? What about for a file in my filesystem? Is file:/home/fuzzy/accounts.db good enough? How about http://localhost:1234/? Just as in the iso4217 case, I have an identifier. I just don't have an agreed context to work with. Sean McGrath writes:

Utterances are always a rich steamy broth of the extensional and the contextual. The context bit is what makes us human. We take short-cuts in utterances all the time. That is the context. Obviously, this drives computers mad because computers don't do context.

The problem is not unique to RDF. Whenever we have two databases managed by different applications that want to refer to each other, we have a problem. Just how much context do we provide? If I want my accounting application to relate somehow to my email client's address book, what is the best way to do it? If I want my stock market monitor application to match up with my accounting application's records, what key should I use? If the two pieces of data were in the same relational database the problem would be easy to solve, but the schemas of these two databases are controlled by different agencies. In the general case their data models should be able to evolve independently of each other, but there are points at which their data models interact. Those points should be controlled with identifiers that carry enough context to determine whether they indeed refer to the same entity.

I am finding myself itching to solve the desktop accounting problem again. I want to define the cornerstone of the overall data model now. I want to define what a transaction looks like. Transactions have entries, and transaction entries link to accounts. I want the model of what an account looks like to evolve separately to that of a transaction, because it is a much fuzzier concept. It has a lot to do with strange ledgers that refer to specific problem domains. These problem domains don't impact on the core transaction representation, nor do they impact the major financial reporting and querying activities. I would like to be able to provide hard, dependable definitions of the hard, dependable parts of my data model without setting soft definitions in concrete.

I feel like the best way to achieve something like that is to have a database of transactions alongside one or more databases of accounts. A common key could bind the two data models together. Transactions themselves could have extra information attached to them in a separate database. Core query and reporting capabilities need only depend on information in the core transactions database. Clever reports and domain-specific ledgers could make use of additional information to mark up transactions and accounts. The ideal key to bind these databases together would be a uniform identifier. That would allow me to unambiguously move these databases around and combine them with other databases in different contexts. Within a single database I could use simple integer keys (or RDF blank nodes). In a universal database I need to use uniform identifiers. Is there a middle line for databases that are spread only across a desktop or corporate context, or is there an easy universal scheme I could use?

We are pretty much working in the world of machine-generated identifiers now. That may mean we can take Microsoft's old favourite technique on board and make use of a machine-generated globally unique identifier. Human-readability is not all that important, so long as the identifier is easy to generate and otherwise work with in the database. Full GUIDs could be used whenever an identifier is needed, in the form urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6 as per RFC 4122. We could alternatively try to use it as only the context in the identifier, eg http://localhost/uuid/f81d4fae-7dec-11d0-a765-00a0c91e6bf6/321 for record 321. We can't attach 321 to the urn:uuid URI because the RFC does not permit it, but this localhost business is still a grand hack.

We could dodge the whole question of context for a time by using a relative URI or URI reference. If we treat the database as a document with its own URI, we could use the identifier "#transaction31" to stand for a unique identifier within the document. This doesn't solve the problem, really, because chances are the database is located at either file:/home/benjamin/my.db (giving a full URL of file:/home/benjamin/my.db#transaction31) or at http://localhost:1234/ (giving a full URL of http://localhost:1234#transaction31). Importantly, anything that refers to the identifier using either one of these paths depends on the same port on localhost being opened every time your application starts. It depends on the database being found at the same path every time. In fact, we could make use of a relative URI again. If I have a database at file:/home/benjamin/my.db and another at file:/home/benjamin/myother.db, the two could refer to each other with the relative paths "my.db" and "myother.db". They could refer to each other's identifiers as "my.db#transaction31" and "myother.db#account12". So long as both files moved together, their context could for the most part be ignored.
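Both halves of that idea are only a couple of lines of Python. The base URI below is hypothetical, and only matters if the databases stop travelling together:

  import uuid
  from urllib.parse import urljoin

  # A machine-generated identifier in urn:uuid form, as RFC 4122 allows.
  print("urn:uuid:" + str(uuid.uuid4()))

  # Relative references between databases that travel together.
  base = "file:///home/benjamin/my.db"
  print(urljoin(base, "#transaction31"))        # an identifier in the same database
  print(urljoin(base, "myother.db#account12"))  # an identifier in its sibling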

Perhaps these non-universal universal identifiers are good enough. Perhaps we will never use these databases outside of the context of their original paths on their original machines. Perhaps we will learn to control the movement of documents and data around a desktop as carefully as we must on the open internet. Perhaps a DNS-style abstraction layer is the solution. I think choosing an identifier is still a hard problem, especially in a world at the cusp of the online and offline worlds.

Benjamin

Sat, 2006-Mar-18

Emergent Efficient Software

This bugzilla comment came across my desk this morning:

If there are any developers out there interested in implementing this functionality (to be contributed back to Mozilla, if it will be accepted) under contract, please contact me off-bug, as my company is interested in sponsoring this work.

The comment applies to a mozilla enhancement request for SRV record support. SRV records are part of the DNS system, and would allow one level of server load balancing to be performed by client machines on the internet of today. Unfortunately, HTTP is a protocol that has not wholeheartedly embraced the approach as yet. What I think is interesting, however, is that there are customers who find bugs and want to throw money at them.

The extent to which this is true will be a major factor in the success of Efficient Software. This particular individual would like to develop a 1:1 relationship with a developer who can do the work and submit it back to the project codebase. I wonder how open they would be to sharing the burden and rewards.

This is the kind of bug that seems likely to attract the most interest for the Efficient Software initiative. It has gone a long time without a resolution. There is a subset of the community with strong views about whether or not it should be fixed. There seems to be some general consensus that it should be fixed, but for whatever reason it is currently not a priority for the project team.

It is unclear whether putting money into the equation would make this bug a priority for the core team, or whether they would prefer to stick to more release-critical objectives. There may be a class of more occasional developer that could take on the unit of work and may have an improved incentive if money were supplied. That is how I see the initiative making its initial impact: by working at the periphery of projects. I don't see the project being a good selector of which bugs should receive funds because, after all, if the cashed-up core developers thought it was release-critical they would already be working on it. No, it is the user base who should supply the funds and determine which bugs they should be directed to.

There are important issues of trust in any financial transaction. I think that an efficient approach can address these issues. The individual who commented on the SRV record bug is willing to contract someone to do the work, but whom? How do they know whether the contractor can be trusted or not? The investor needs confidence that their funds will not be exhausted if they are supplied towards work that is not completed by the original contractor. Efficient Software does this by not paying the responsible party (the developer or the project) until the bug is actually resolved. Likewise, the contractor must know the promised money is secure. Efficient Software achieves this by requiring that investment is supplied up-front into what is effectively an escrow account while the work is done.

The biggest risk for an investor is that they will put their money towards a bug that is never resolved, despite the incentive they provide. A project may fork if funds are left sitting in the account. The investor's priorities may change. They may want that money put to a more productive use. I don't know of any way to mitigate that risk except to supply more and more incentive, or to first find a likely candidate to perform the implementation before actually putting funds into escrow. Perhaps the solution is to allow investors to withdraw funds assigned to a bug up until the bug is commenced. Once work is started, the money cannot be withdrawn. If the developer fails to deliver a resolution they may return the bug to an uncommenced state and investors can again withdraw funds to put to a more productive use.

The fact that the efficient software approach is an emergent phenomenon gives me increased confidence that it can be developed into a workable process. In time, it may even become an important process in the open source software development world. Do you have comments or suggestions regarding an efficient approach to software? Blog, or join us on our mailing list.

Benjamin

Sun, 2006-Mar-12

Bounty Targeting

Bounties have traditionally been seen in open source as a way of bringing new blood into a project, or increasing the pool of developer resources available to open source. Offering money for the production of a particular feature is intended to inspire people not involved with the project to come in, do a piece of work, then go back to their day-to-day lives. The existing developers may be too overworked to implement the feature themselves due to preexisting commitments. The item of work may even be designed to cross project boundaries and inspire cooperation at a level that did not exist before the bounty's effect was felt.

There are several problems with this view of a bounty system, but perhaps the most important is one that Mary Gardiner identifies:

I mean, these things just seem like a potential minefield to me. And I don't mean legally, in the sense of people suing each other over bountified things that did or did not happen or bounties that did or did not get paid. I just mean in the sense of an enormous amount of sweat and blood spilled over the details of when the task is complete.

The point she makes is that it isn't possible to simply develop new feature x as a stand-alone piece of software and dump it into someone else's codebase. There is a great deal of bridge building that needs to happen on both the technical and social levels before a transfer of code is possible between a mercenary developer and a fortified project encampment.

These are the same kinds of issues a traditional closed software house has when they hire a contractor. Who is this contractor? What are their skills? Why are they being paid more than me? Will they incorporate into our corporate culture? Will their code incorporate into our codebase? Will they follow our development procedures and coding standards? There are plenty of ways to get each other off-side.

I consider it important to look for in-house talent. I don't think bounty systems should be geared towards the outside contractor, but instead to the core development team. I don't think bounty funds should be provided by the core development team to outsiders. Instead, I see bounties as a way for users of free software to contribute effectively to the core development team.

The Efficient Software view of bounty collection and dispersal is that bounties are paid to developers who are already integrated on a social and technical level with the core team. They may be full time or part time. They may work with other projects as well. This does not make them a mercenary. These are the people who don't come to the project just to do a single job. They watch the mailing lists. They spend appropriate time in irc channels and involved in other forms of instant communication for the sake of resolving technical issues. It is the core developer who should be rewarded for meeting the needs of the project's user base. It is the core developer who has the best chance of a successful development.

Finding the conclusion of the development should be straightforward and uncontroversial. It is as per project policy. The policy may be that an initial code drop is sufficient to collect a bounty. The policy may require a certain level of unit testing or review. It may require a certain level of user satisfaction. Because the developer is engaged in the policy process, the process is not a surprise or a minefield. Newer developers may be attracted to the project by successful funding of more established developers, and will have to break into the culture and policy... but that is to be expected when an outsider wants to become part of any core development group. The newcomer learns the policies over time, and the policies are as reasonable as the project needs them to be, both to attract new blood and to fund the project as a whole. The interesting thing about open source is that if they get this balance wrong, it is likely they will be outcompeted by another group working on a fork of their software. The incentive is strong to get it right.

Money is a dangerous thing to throw into any organisation, and most open source projects get by without any dependable supply. There are real risks to changing your development model to one that involves an explicit money supply. I see rewards, however, and I see an industry that is ready to turn down this path. I think open source is the best-poised approach to take this path to its natural conclusion of efficient software production.

Benjamin

Sun, 2006-Mar-05

Free Software is not about Freedom of Choice

I was at HUMBUG last week, and was involved in a wide-ranging discussion. The topic of a particular closed-source software product came up, and a participant indicated that he maintained a Windows desktop just to run the software. It was so good and so integral to his work practices that he had a whole machine dedicated to it. He went on to criticise sectors of the open source community who tended to be irritated that closed source software was still in use. These are the sectors who have somewhat of a "with us" or "against us" view, and would prefer that closed source not be a part of anyone's lives. He asked (I think I'm getting the words right here), "After all, isn't free software about freedom of choice?"

I don't think it is.

Software alternatives are about freedom of choice. Whether the alternative is open source or closed source, the freedom of choice is not really affected. If I wrote a closed source alternative to Word, I would be providing freedom of choice to consumers. If I wrote an open source alternative to Word, I would be providing the same kind of freedom of choice. The difference is in the freedom of the customer once a transaction has been made. Open source software is primarily about post-choice customer freedom rather than freedom of choice, so it makes sense on at least one level for free software advocates to actively seek out unshackled alternatives to any closed source software they use from day to day.

In the software world we would traditionally see the freedoms of a consumer and the freedoms of a producer of software as being in conflict; however, the foundation of open source development is to view the separation of consumer and producer as artificial. Freedoms given to the consumer are also given back to the producer, because the producer is also a consumer of this software. The barrier between consumer and producer exists naturally when only one entity is doing the producing. In that case the producer has automatic freedoms, and granting more to themselves has no meaning. However, consider the case of multiple producers. The freedoms granted to consumers are also granted to every producer when the production is shared between multiple entities. Open source produces a level playing field where entities that may compete in other areas can each limit the cost of this particular shared interest domain by working together.

When viewed from the angle of productivity improvement in a domain of shared interests, closed source alternatives can seem ugly and limiting. You will always know you are limited in closed source no matter how featureful a particular product is. You often can't make it better, and it would cost you a great deal to produce a competitive alternative as an individual. If competitive alternatives exist you may be able to transition to one of the available products, however you will still be in the same boat. You can't add a feature, and it is only the threat that you may change to another competitor that drives the supplier to use your license fee to produce software that suits you better. The competitors won't be sharing their code base with each other, so the overall productivity of the solution is less than the theoretical ideal. If the competitors joined forces they may be able to produce a more abundant feature set for a lower cost, however while they compete the customer pays for the competition. Which is worse? An unresponsive monopoly, or a costly war of features? Closed software represents a cost that the customer cannot easily reduce in an area that is different from their core competencies. It behaves like taxation from a supplier that does not need to outlay any more to continue reaping the benefits of its investment, or from a set of suppliers that duplicate efforts at the cost of the combined customer base. Open source may provide a third alternative: a cooperative of customers each working to build features they need themselves, and forking when their interests diverge.

People who are interested in open source are often also interested in open standards. Unlike open source, open standards do promote freedom of choice. Unlike open standards, open source does promote post-choice freedoms. Both have a tendency to promote community building and shared understandings, and both are important to the health of the software industry moving forwards. The worst combination for overall productivity is and will continue to be a closed source product that uses closed data formats and interaction models.

Benjamin

Sun, 2006-Feb-12

Efficient Software at Humbug

We had a good preliminary scoping discussion today at HUMBUG. I caught the end of Pia Waugh's talk, and the linux.conf.au 2006 debrief led by Clinton Roy. After dinner a few of us clustered around a chalk board at the front of the lecture theatre. A good canvassing of some of the direction issues was had, despite most of it being carried out before Anthony Towns returned from a separate dinner trip.

I was on the chalkboard, and wrote up the Efficient Software Initiative problem statement as follows (excuse my basic inkscape reproduction):

A person's time and their money are often in conflict when working with free software

Libre software is often written gratis, however those who write the software still need to live. They need to feed their families. In short, they need a day job. A lucky few are employed to write open source software for specific companies that use it. Many others are weekend soldiers.

We are at a stage in the development of the software industry as a whole where it is usually necessary to be a full time software developer in order to be a particularly useful software developer. It takes a number of years of study and practical application to write good software, and we want our free software to be good. For those not employed to work on open source, their day job will likely include closed source software. Now, let's assume we "win". There isn't any appreciable amount of closed software development going on anymore. Do these guys lose their day jobs, or can they be employed to work on open source software in a new and different way? It is the function of the Efficient Software Initiative to build a base of discussion and experimentation to answer that question. We want to give as many people as possible the chance to work full time on free software, which means giving them an income.

The initial diagram became a little more complicated as we tried to model the software industry that supports open source developers today. We started by drawing links to that all-important money bubble. The goal was to describe where the money is coming from, including where a business that employs open source software developers ultimately gets its money from. A simplified version of the diagram follows:

Paid open source development is mostly contingent on a genuinely separate business

If free software takes over we lose the ability to charge licence fees to support the developer's day job. I see the support-based business model for software development as weak, also. It may have conflicting goals to those of the production of good quality software with a balanced feature set. Real strength in an open source business model comes from being useful to other industries. That does not necessarily mean outside the world of computers. Google and Yahoo have established business models in the internet search industry, and serving their interests is a way of rooting the money supply available to open source without depending on closed software in the value chain. Other industries and established business models abound, including everything from avionics to restaurants. To be productive, an open source industry has to appeal fairly directly to this wide audience. These industries can and should be engaged incrementally and won over with sound business logic.

So if we trace our connections back to the developer, we see two main paths still open for raising funds. We could try to get developers employed directly by businesses that need software and want their particular interests represented in the development of a project going forwards. Alternatively, we could try to raise money over a wider base from the whole user community. This is a community of individuals and of companies who have an interest and an investment in your project's ongoing success. Your project represents a point at which their interests all overlap. The main things that matter economically in maintaining the community and successfully drawing contributions from it are to match your project goals with the community's interests and to ensure that individual contributions are returned in spades as a collectively built product.

This is an interesting and important facet to the bean counting motivation for open source in business. Businesses like to deal with the existing software industry. They pay a small fee, and in return they get to make use of the produce of hordes of programmers over several years (if not several decades). Open source makes it possible to do the same thing for business. Even though they contribute only a small amount individually, they each receive the benefit of the total common contribution. They often don't need to work on a bug that affects them to see it fixed. The effort is often expended by someone else who felt the pain more strongly or immediately. Licences such as the GPL also provide some level of certainty to contributors that what they have collaboratively built won't be taken by any single contributor and turned back into a product that does not benefit them.

The difference between open and closed source approaches to this multiplicative return is that open source is more efficient. It stands to reason that when it is possible to fork a product, there is a lower cost of entry to the business than if you had to rewrite from scratch. For an example, just consider the person-years of effort that have been expended on GNU Classpath, despite Sun having a perfectly good implementation in their back pocket the whole time. If the barrier to entry is lowered, competition or the potential for it increases. This should lead to a more productive and lower cost industry. The current industry forms monopolies wherever there is a closed file format or undocumented API. An industry remade as open source would see none of that, and the customers should benefit.

Benjamin

Thu, 2006-Feb-09

Patent and Copyright in an Efficient Software Industry

This is the first in a series of articles I may or may not get around to writing over the next week or so to clarify some ideas leading into the Efficient Software Initiative. Some of the ground in this article is also covered in Math You Can't Use, whose sixth chapter I read after drafting this. I think this article is a little more focused on the core economic issues and has value standing on its own. Note that Math You Can't Use makes use of the terms "goods-oriented" and "labour-oriented". I use the terms "manufacturing industry" and "service industry" for analogous concepts. Disclaimer: I am a software developer, not an economist.

It is incorrect to say that software is just maths, and is therefore not patentable. Physics and engineering are just maths, so our total understanding of our physical environment is just maths. If that were sufficient justification to prohibit patent, all engineering design would be exempt from the system. We must instead look for an economic justification for prohibiting patents on software, and I think such a justification can be made.

The patent and copyright systems attempt to solve the same base problem. If I invest a certain monetary value of time and skills into developing a product, it is important that I can recoup my costs and ideally make a profit. Because my initial investment excluded input from other parties, the benefits I reap should exclude them too. I deserve a limited time monopoly on what I produce.

The problem with this reasoning is that it essentially applies to a manufacturing industry. A manufacturing industry is one where an initial investment is made before and throughout a development cycle, before a later cycle allows me to recoup my costs and feed my family. Mass-market software has traditionally been a manufacturing industry, however this is changing. The software industry is remaking itself around services.

Services follow a different business model. Instead of having to recoup costs at a later stage of product development, services acquire funding "in-phase". Software developers will remember that the old ideal for software development was the waterfall model. Requirements definition was followed by design, and then by implementation. After implementation comes sale and the recouping of costs. We have been moving away from that model for many years. It patently does not work for most software development. Instead, we began with spiral models and have ended up with agile or extreme methods. These approaches to software development dictate a cyclic model. You plan a little, you implement a little, you test a little, and you deploy a little. An important economic aspect of these models is the potential for in-phase recoup of costs.

The agile approach goes as far as saying that a customer representative is on your development team. They drive this week's development and have enough insight into the development as a whole to help make go/no-go decisions for the project. If funding should stop, that is ok. The customer will have selected their most important features for early implementation and will have discontinued the project only as diminishing returns make continuing uneconomic. Because the funding was in-phase, all costs have been covered for the developers. There is nothing that needs to be recouped after the implementation phase. The roles of customer and investor have merged.

So if software is a service rather than a manufacturing industry, if software development does not have to recoup its costs out of phase, then it stands to reason that patent and traditional copyright do not benefit the industry as a whole. The barrier to entry that both systems create for new entrants (particularly the patent system) is a deliberately anti-competitive feature and an anti-productive one. They will have an overall negative economic effect when there is no corresponding productivity increase created by the out of phase recoup of costs. Given the software industry's growing importance to the world's economy, it stands to reason that both systems need to die for software. Software patents should be the first to go.

Patents and copyright do not make sense in a services industry. If I patent the ideas behind a dog-walking business, customers do not benefit. Instead, I will be reducing the quality and scope of services available to them and increasing their costs. I should be putting my energies into running my service efficiently, and outsourcing that which is not my core competency. If new ideas will make me more efficient I'll spend the research and development dollars to produce them, but the funding will come from the business I run and not from the sale of these ideas to other dog walkers. The same analogy applies to the software industry. If I patent the ideas behind internet search or behind the playing of video files, I am reducing the quality and scope of services available to computer users. I should focus on providing great search and video services to my users and use those dollars to fund any necessary research. If my service is providing software to make a particular business model more efficient, I should focus on meeting customer demand for efficiency improvements. I should not be trying to sell them a product. When the industry is built around in-phase cost structures the patent system only acts to prevent my competitors from matching my performance, in ways that lock them out of the whole industry and provide an overall poorer marketplace for the services themselves. Economically speaking, it is the services that count, not the paltry research industry sector.

Patents are intended for industries where it takes a number of years to develop a product and the costs must be recouped later. This works in the drug industry. It may even work in research sectors of the computer industry. If I labour away at creating a new video codec system that greatly improves quality for the same computational and bandwidth requirements as conventional technology, I need a way to recoup my spent youth. If such an incentive is not given, the improvements may not be made. However, I would argue that the bulk of the industry will be services-based. Some segments will be hurt by the transition, but for the most part a greater level of innovation will be recorded rather than a lesser one. Segments based around services will work to fund other segments in order to improve their own services offering. This will occur naturally as required, and will also be in the form of in-phase funding.

The change is already beginning, but at the same time existing players in the software market are rallying to try and keep competition out. The efficiency of the software industry will suffer if they succeed, and thus the efficiency of the world economy will also suffer.

Benjamin

Sun, 2006-Jan-29

Launch of the Efficient Software Project Plan, and Call for Participation

Free Software is Efficient Software, but Efficient Software won't come into being overnight. The Efficient Software Initiative has hit another (small) milestone, in releasing its first project plan:

  1. Conceive of idea
  2. Promote idea via blogs and other viral media (we are here)
  3. Build a base of discussion between individuals to agree on the right models for putting money into open source, including solid discussions on whether it is necessary or even useful
  4. Build a base of three or more people across disciplines to begin developing concrete policy, business plans, and site mockups. Continue promoting and building awareness until at least one slashdot effect is noted.
  5. Use wider promotion to get feedback on initial discussions and established policies. Learn which parts are defensible and which are not. Adjust as required, and begin building a core implementation group. This group will have access to sufficient funds between them to make the business a reality, and will likely have strong overlap with earlier core team.
  6. Launch business based on developed plans

This plan is likely to require at least twelve months of concerted attention, and will likely stretch over several years if successful. The ultimate goal is to build not only a business, but a business model that can be replicated into an industry. The goal is to fund open source development and directly demonstrate the cost differential to closed source software. If you have any comments or feedback on the plan, please let us know via the mailing list.

With it comes a call for participation on the project front page:

You can help us get off the ground!

Do you have a background in law?
Help us understand the Legal Implications of the Business
Do you have a background in business or accounting?
Help us understand the Profit Implications of the Business
Do you have a background in website design?
Help us develop possible website designs (this will help everyone stay grounded!)
Do you have a background in running large websites?
Help us understand how to set everything up
Do you have roots in open source communities?
Talk about Efficient Software! Get feedback! Bring people on board with us, and help us come on board with them

So, why should you get involved? Who am I, and why should you trust me to get a project like this off the ground? Well, I am not asking anyone to trust me at this stage. We are in the discussion stages only. The next stage is to develop a core group of people to push things forward. Perhaps that group will include you. If you really want to know about me, well... here I am:

I have lived in Brisbane, Queensland for most of my life now. I graduated from the University of Queensland with a computer science honours degree back in 1998. Since the time of my course I have been involved in HUMBUG. My open source credentials are not strong. I have done a few bits here and there at the edges of various communities. I was involved with the sqlite community for a time and have a technical editor credit on the book of the same name. I have recently been involved with the microformats community, particularly with recent hAtom developments. I'm a RESTafarian. I have used debian linux as my sole desktop install for at least five years, and saw the a.out to ELF transition. I'm a gnome user and advocate. I'm something of a GPL bigot, or at least it is still my default licence. I work in control systems, and have spent the last few years trying to "figure out" the web. I'm a protocols and bits and bytes kind of guy. I use the terms "open source" and "free software" interchangeably. I'm afraid of licence incompatibility in the open source world leading to a freedom that is no larger than what we see in closed source software. I'm a father-to-be, with less than seven weeks to go until maybe I won't have much time to do anything anymore. I'm a full-time software developer who does a lot of work in C++. I work more hours at my job than I am contractually obliged to, so often don't have time for other things. I have a failed open source accounting package behind me, where I discovered for the first time that user interfaces are hard and boring and thankless things. That project came out of my involvement with gnucash, which I still think would be easier to replace than fix. I think python needs static checking because the most common bug in computer software is not the off-by-one error, but the simple typo. I haven't used ruby on rails because I think that any framework that requires I learn a whole new language needs a few years for the hype to cool down before it gets interesting. I render my blog statically using blosxom, partially because I didn't have the tools at hand for dynamic pages using my old hosting arrangements, but partially because I'm more comfortable authoring from behind my firewall and not allowing submissions from anywhere on the web. I used to be a late sleeper, but have taken the advice of Steve Pavlina and have now gotten up at 6am for the last six weeks straight. With all the extra time I still haven't mowed the lawn. I'm not perfect, and I certainly don't have all the skills required to create a new open source business model on my own. That is why I want this to be a community. I know how effective a community can be at reaching a common goal, and I think that the Efficient Software Initiative represents a goal that many can share.

So sign up for the mailing list. Get into some angry conversations about how open source doesn't need an income stream, or how selling maintenance is the perfect way to make money from open source software. I want to have the discussion about whether the kind of business Efficient Software is trying to build will create a new kind of free software market, or whether it just puts a middleman in the way of an already free market. Let's have the discussions and see where we end up. Efficient Software is something I can get excited about, and I hope it is something you'll be able to get a bit excited about, too.

Benjamin

Sat, 2006-Jan-28

Internet-scale Client Failover

Failover is the process of electing a new piece of hardware to take over the role of a failed piece of hardware (or sometimes software), and the process of bringing everyone on board with the new management structure. Detecting failure and electing a new master are not hard problems. Telling everyone about it is hard. You can attack the problem at various levels. You can have the new master take over the IP address of the old and broadcast a gratuitous ARP reply so that traffic for that IP is directed to the new hardware. You can even have the new master take over both the IP and MAC address of the old. If new and old are not on the same subnet, you can try to solve the problem through DNS. The trouble with all of these approaches is that while they solve the problem for new clients that may come along, they don't solve the problem for clients with existing cached DNS entries or existing TCP/IP connections.

Imagine you are a client app, and you have sent an HTTP request to the server. The server fails over, and a new piece of hardware is now serving that IP address. You can still ping it. You can still see it in DNS. The problem is, it doesn't know about your TCP/IP connection to it, or the connection's "waiting for HTTP response" state. Until a new TCP/IP packet associated with the connection hits the new server it won't know you are there. Only when that happens and it returns a packet to that effect will the client learn that its connection state is not reflected on the server side. Such a packet won't usually be generated until new request data is sent by the client, and often that just won't ever happen.

Under high load conditions clients should wait patiently to avoid putting extra strain on the server. If a client knows that a response will eventually be forthcoming it should be willing to wait for as long as it takes to generate the response. With the possibility of failover, the problem is that a client cannot know whether the server state reflects its own, and so cannot know whether a response really will be forthcoming. How often it must sample the remote state is determined by the desired failover time. In industrial applications the time may be as low as two or four seconds, and sampling must take place several times faster than that to allow for lost packets. If sampling is not possible, the desired failover time represents the maximum time a server has to respond to its clients, plus network latency. Another means must be used to return the results of processing if any single request takes longer. Clients must use the desired failover time as their request timeout.
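To make the "client timeout equals desired failover time" rule concrete, here is a minimal sketch in Python using only the standard library. The four-second budget, the URL, and the retry policy are illustrative assumptions of mine rather than values from any real deployment.

    import socket
    import time
    import urllib.error
    import urllib.request

    DESIRED_FAILOVER_TIME = 4.0                          # seconds; illustrative value only
    SERVICE_URL = "http://service.example.com/report"    # hypothetical resource

    def get_within_failover_budget(url=SERVICE_URL, budget=DESIRED_FAILOVER_TIME, attempts=3):
        """Issue a GET, never waiting longer than the failover budget for a response."""
        for attempt in range(attempts):
            try:
                with urllib.request.urlopen(url, timeout=budget) as response:
                    return response.read()
            except (urllib.error.URLError, socket.timeout):
                # Either the old master is gone or the response took too long.
                # Back off briefly; a retry opens a fresh connection, which after a
                # failover will reach whichever hardware now holds the IP address.
                time.sleep(budget / 4)
        raise RuntimeError("no response within the failover budget")

The point of retrying after a timeout is that a fresh TCP connection reaches whatever currently serves the IP address, so the client recovers within its own budget instead of waiting forever on a connection the new master has never heard of.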

If you take the short request route, HTTP permits you to return 202 Accepted to indicate that a request has been accepted for processing but without indicating its success or failure. If this were used as a matter of course, conventions could be set up to return the HTTP response via a request back to a callback URL. Alternatively, the response could be modelled as a resource on the server which is periodically polled by the client until it exhibits a success or failure status. Neither of these approaches is directly supported by today's browser software; however, the latter could be performed using a little meta-refresh magic.
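Here is a rough sketch of the second convention, the one a browser could approximate with meta-refresh, as a non-browser client might implement it. The use of the Location header to name a status resource and the JSON "state" field are assumptions about how such a server could be arranged, not an established protocol.

    import json
    import time
    import urllib.request

    def submit_and_poll(submit_url, body, poll_interval=2.0, timeout=4.0):
        # Submit the long-running request; a cooperating server answers 202 Accepted
        # and names a status resource in the Location header.
        request = urllib.request.Request(submit_url, data=body, method="POST")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            assert response.status == 202
            status_url = response.headers["Location"]

        # Poll the status resource with short requests until it reports completion.
        # Every poll is a fresh exchange, so a failover between polls costs nothing.
        while True:
            with urllib.request.urlopen(status_url, timeout=timeout) as response:
                status = json.loads(response.read())
            if status.get("state") in ("success", "failure"):
                return status
            time.sleep(poll_interval)

Each individual request in this exchange completes quickly, so the client's timeout can stay equal to the desired failover time without putting any limit on how long the server takes to do the real work.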

You may not have sufficient information at the application level to support sampling at the TCP/IP level. You would need to know the current sequence numbers of the stack in order to generate a packet that would be rejected by the server in an appropriate way. In practice what you need is a closer vantage point. Someone who is close in terms of network topology to both the old and the new master can easily tell when a failover occurs and publish that information for clients to monitor. On the face of it this is just moving the problem around; however, a specialised service can more easily ensure that it doesn't ever spend a long time responding to requests. This allows us to employ the techniques which rely on quick responses.
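One way that monitoring arrangement might look, assuming the well-placed observer publishes a small resource whose content changes whenever a failover occurs. The monitor URL, the plain-text incarnation number, and the response_ready callable are all invented for illustration; the real mechanism would be whatever the observer can publish quickly.

    import time
    import urllib.request

    MONITOR_URL = "http://monitor.example.com/incarnation"   # hypothetical resource

    def current_incarnation(timeout=1.0):
        # The monitor promises to answer quickly, so a short timeout is safe here.
        with urllib.request.urlopen(MONITOR_URL, timeout=timeout) as response:
            return response.read().strip()

    def wait_for_response(response_ready, issued_incarnation, check_interval=1.0):
        # response_ready stands in for whatever tells the client its real response
        # has arrived. While waiting, sample the monitor instead of the busy server.
        while not response_ready():
            if current_incarnation() != issued_incarnation:
                # A failover happened; the new server knows nothing of our request.
                raise ConnectionResetError("server failed over; re-issue the request")
            time.sleep(check_interval)

The long-running request itself is left out of the sketch; the point is only that the cheap sampling traffic goes to the monitor rather than to the server that is busy producing the response.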

Like the state of http subscriptions, the state of http requests must be sampled if a client is to wait indefinitely for a response. How long it should wait depends on the client's service guarantees, and has little to do with what the server considers an appropriate timeframe. Nevertheless, the client's demands put hard limits on the profile of behaviour acceptable on the server side. In subscription the server can simply renew whenever a renew is requested of it, and time a subscription out after a long period. It seems that the handling of a simple request/response couples clients and servers together more closely than even a subscription does, because of the hard limits client timeout puts onto the server side.

Benjamin

Mon, 2006-Jan-16

On XML Language Design

Just when I'm starting to think seriously about how to fit a particular data schema into xml, Tim Bray is writing on the same subject. His advice is chiefly, "don't invent a new language... use xhtml, docbook, odf, ubl, or atom". His advice continues with "put more effort into extensibility than everything else".

His column was picked up all over the web, including by Danny Ayers. He dives into discussion about how to build an RDF model, rather than an XML language:

When working with RDF, my current feeling (could be wrong ;-) is that in most cases it’s probably best to initially make up afresh a new representation that matches the domain model as closely as possible(/appropriate). Only then start looking to replacing the new terms with established ones with matching semantics. But don’t see reusing things as more important than getting an (appropriately) accurate model. (Different approaches are likely to be better for different cases, but as a loose guide I think this works.)

I've been following more of the Tim/microformats approach, which is to start with an established model and extend minimally. I think Tim's stated advantages to this approach are compelling, with the increased likelihood that software you didn't write will understand your input. When your machine interfaces to my machine, I want both to require minimal change in order for one to understand the other. I'm not sure the same advantages are available to an rdf schema that simply borrows terms from established vocabularies. Borrowing predicate terms and semantics is useful, but the most useful overlaps between schemas will be terms for specific subject types and instances.

From Tim,

There are two completely different (and fairly incompatible) ways of thinking about language invention. The first, which I’ll call syntax-centric, focuses on the language itself: what the tags and attributes are, and which can contain which, and what order they have to be in, and (even more important) on the human-readable prose that describes what they mean and what software ought to do with them. The second approach, which I’ll call model-centric, focuses on agreeing on a formal model of the data objects which captures as much of their semantics as possible; then the details of the language should fall out.

I think I fall on Tim's syntax-centric side of the fence. I understand the utility of defining a model as part of language design, however I think this will rarely be the model that software speaking the new language will use internally. I think that any software that actually wants to do anything with documents in your language will transform the data into its own internal representation. Sometimes this will be so that it can support more than one language. Liferea understands rss, atom, and a number of other formats. Sometimes it will be related to the way a program maps your data onto its graphical elements. It may be more useful to refer to a list or map than a graph.

I think a trap one could easily fall into with rdf is to think that the model is important and the syntax is not. This changes a syntax->xml-dom-model->internal-model translation in an app that implements the language to a syntax->xml-dom-model->rdf-graph-model->internal-model translation. With the variety of possible rdf encodings (even just considering the variation allowed for xml) it isn't really possible to parse an xml document based on its rdf schema. It must first be handled by rdf-specific libraries, then transformed. I think that transforming from the lists and maps and hierarchy representation of an XML dom is typically easier than transforming from the graphs and triples representation of an RDF model in code.
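As a concrete illustration of the lists-and-maps point, here is the kind of translation I have in mind, using Python's standard ElementTree parser. The playlist vocabulary is invented purely for the example; the shape of the code would look much the same for any small XML language.

    import xml.etree.ElementTree as ET

    document = """
    <playlist name="quiet evening">
      <track title="Song One" artist="Someone" length="214"/>
      <track title="Song Two" artist="Someone Else" length="187"/>
    </playlist>
    """

    root = ET.fromstring(document)

    # The internal model is just a dict holding a list of dicts: the shapes the
    # application actually wants, produced directly from the element hierarchy.
    playlist = {
        "name": root.get("name"),
        "tracks": [
            {
                "title": track.get("title"),
                "artist": track.get("artist"),
                "length": int(track.get("length")),
            }
            for track in root.findall("track")
        ],
    }

    print(playlist["name"], len(playlist["tracks"]))

The internal model falls straight out of the element hierarchy; the equivalent walk over a graph of triples needs an extra layer of queries or library calls before the application sees the lists and maps it actually wants.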

From Danny:

This [starting with your own model, then seeing which terms you can exchange for more general ones already defined] is generally the opposite of what Tim suggests for XML languages, but there is a significant difference. Any two (or however many) RDF vocabularies/models/syntaxes can be used together and there will be a common interpretation semantics. Versioning is pretty well built in through schema annotations (esp. with OWL).

There isn’t a standard common interpretation semantics for XML beyond the implied containership structure. The syntax may be mixable (using XML namespaces and/or MustIgnore) but not interpretable in the general case.

Extensibility has to be built into the host language in XML. It should be possible to add extension elements with a defined meaning for anyone who understands both the host language and the extension. I don't think aggregation is an important concept yet for XML, although if Google Base proves useful I may start to revise that view. I think that aggregation is presently still something you do from the perspective of a particular host language or application domain, such as atom or "syndication". From that perspective there is currently little value in common interpretation semantics for XML, as it will only be parsed by software that understands the specific XML semantics.

I have not yet seen a use I consider compelling for mustUnderstand to support extensibility, however I am completely convinced by the need for mustIgnore semantics. I am also convinced that one should start with established technologies and extend them minimally wherever there is a good overlap. While this might not always be possible, I think it will be in a reasonable proportion of cases.

Benjamin

Sun, 2006-Jan-15

The Efficient Software Mailing List

Subject: [es] Free Software is Efficient Software

The Efficient Software initiative is growing slowly, but surely. We now have a wiki <http://efficientsoftware.pbwiki.com/>, an irc channel (#efficientsoftware on irc.freenode.net), and this mailing list. The post address is efficientsoftware at rbach.priv.at, and archives can be found at <http://rbach.priv.at/Lists/Archive/efficientsoftware/>.

We are looking forward to making positive connections with software projects, as well as lawyers, businesspeople, accountants, web site maintainers, and many more. To make this thing a reality we need to form a diverse community with a broad skill set.

The thing that will bring us all together is a desire for more efficient software production and maintenance. We can undercut the current players in the industry, and make a profit doing it. We can turn the weekend free software soldiers into a lean regular army with full time pay. We can match customer needs and pain to a speedy resolution. These are the goals of the Efficient Software Initiative.

Welcome to the community!

To join the mailing list, see the EfficientSoftware Info Page. A big thank you goes out to Robert Bachmann for offering to host the list.

Benjamin

Thu, 2006-Jan-12

Machine Microformats

I find microformats appealing. They solve the problem of putting data on the web simply, without having to create extra files at extra urls and provide extra links to go and find the files. The data is in the same page as the human-readable content you provide. Like HTML itself, microformats allow you to put your own custom data into the gaps between the standard data. They effectively plug a gap in the semantic spectrum between universally-applicable and individually-applicable terms. I have been working on various data formats during my first week back from annual leave, and the question has occurred to me: "How do I create machine-only data that plugs the gap in a similar way?".

It doesn't make sense to use microformats directly in a machine-only environment. They are designed for humans first and machines second. However, it does make some sense to try and learn the lessons of html and microformats. When XML became the way people did machine to machine comms a strange thing happened. Instead of learning from html and other successful sgml applications, we jumped straight into strongly-typed thinking. HTML allows new elements to be added to its schema implicitly, with "must-ignore" semantics for anything a parser does not understand. This allows for forwards-compatibility of data structures. New elements and attributes can be added to represent new things without breaking the models that existing parsers use. Instead of following this example in XML we defined schemas that do not assume must-ignore semantics. We defined namespaces, and schema versions. When we introduce version 3.0 of our schema, we expect existing parsers to discard the data and raise an error. This is the way we're used to doing things in the world of remote procedure calls and base classes. In fact, it is the wrong way.

My approach so far has been to think of an xml document as a simple tree. A parser should follow the tree down as far as it knows how to interpret the data, and should ignore data it does not understand. Following the microformat lead, I'm attempting to reuse terminology from existing standards before inventing my own. The data I've been presenting is time-oriented, so most terms and structure have been borrowed from iCalendar. The general theory is that it should be possible to represent universal (cross-industry, cross-application), general (cross-application), and local (application-specific) data in a freely-mixed way. Where there is a general term that could be used instead of a local term, you use it. Where there is a universal term that could be used instead of a general one, you do. The further toward the universal end you push things, the more parsers will understand all of your data.
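A minimal sketch of that parsing rule, with element names loosely borrowed from iCalendar and one invented local extension. The document, the x-local-note element, and the exact spellings are illustrative assumptions; the point is that a parser written before the extension existed keeps producing the same result when the extension appears.

    import xml.etree.ElementTree as ET

    document = """
    <calendar>
      <vevent>
        <dtstart>20060112T090000</dtstart>
        <summary>HUMBUG meeting</summary>
        <x-local-note priority="low">bring the chalk</x-local-note>
      </vevent>
    </calendar>
    """

    KNOWN_EVENT_FIELDS = {"dtstart", "summary"}   # what this parser's vocabulary covers

    def parse_calendar(text):
        events = []
        for vevent in ET.fromstring(text).findall("vevent"):
            event = {}
            for child in vevent:
                if child.tag in KNOWN_EVENT_FIELDS:
                    event[child.tag] = child.text
                # anything else (such as x-local-note) is silently ignored:
                # must-ignore semantics keep old parsers working as the format grows
            events.append(event)
        return events

    print(parse_calendar(document))

A stricter, version-checking parser would refuse the whole document the moment it met the unknown element; the must-ignore walk degrades gracefully instead, which is what keeps old software useful as the format grows.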

At present, I am also following the microformat lead of not jumping into the world of namespaces. I am still not convinced at this stage that they are beneficial. One possible end-point for this development would be to use no namespace for universal terms, and progressively more precise namespaces for general and local terms. Microformats themselves only deal in universal terms so they should be able to continue to get away without using namespaces.

By allowing universal and local terms to mingle freely it is possible to make use of universal terms wherever they apply. I suppose this has been the vision of rdf all along. In recent years the semantic web seems to have somehow transformed into an attempt to invent a new prolog, but I think a view of the semantic web as a meeting place for universal and local terms is of more immediate use. I think it would be useful to forget about rdf schemas for the most part and just refer to traditional standards documentation such as rfcs when dealing with ontology. I think it would be useful to forget about trying to aggregate rdf data for now, and think about a single format for the data rather than about multiple rdf representations. Perhaps thinking less about the data model rdf provides and thinking more about a meeting of semantic terms would make rdf work for the people it has so far disenfranchised.

Benjamin

Thu, 2006-Jan-05

Efficient Software FAQ

Efficient Software has launched its FAQ, currently still on the main page. From the wiki:

Why start this initiative?

Too much money is being funnelled into a wasteful closed source software industry. Initially it is investors' money, but then customers pay and pay. Profits to major software companies are uncompetitively high compared to other industries. We want to funnel money away from the wasteful industry and towards a more productive system of software development. Free software can be developed, forked, changed, and distributed without waiting on unresponsive vendors. Free software is open to levels of competition that cannot be matched by the closed source world. Free software contributors don't have to be on any payroll in order to fix the problems they care about. Free software does not maintain the artificial divide between customers and investors. The people who contribute to the development of a free software project are its customers, and all customers benefit when bugs are fixed or enhancements are carried out.

What do projects have to gain?

Our goal is to increase the money supply to projects. Money is not a necessary or sufficient factor in developing free software, but it cannot hurt. Projects often accept donations from users, but it is unclear how much users should give or what their motivations are. Efficient Software aims to tie a contribution to services rendered. Whether the services are rendered immediately or a year from now is inconsequential. Efficient Software maintains a money supply that can be tapped by projects when they fix the nominated bugs.

Won't this drive the wrong bugs to be fixed?

Projects will nominate which state a bug has to be in for Efficient Software to accept payment. Bugs whose fix would contradict project goals should never be put into eligible states and will never receive contributions. One way of thinking about the money involved is as bugzilla votes. The difference is that modern world currencies tend to have low inflation rates and limited supply. There is evidence across a number of fields that when people commit money to a goal they tend to make decisions more carefully, even if the amount is small. If your project's money supply has a wide base, the dollar value next to each bug should be a reasonable estimate of the value to users in getting it fixed. This information system alone could make it worth your while to become involved.

What should projects do with the money?

Efficient Software was conceived around the idea that projects would pay developers for the fixes they contribute through a merit-based mechanism. We have some thoughts about how this could work in practice, but we will need to develop them over time. In the end, projects are required to make their own "opt in" decision with Efficient Software and their own decision about how to distribute the money. This policy will be made available to contributors in case it may affect their investment decisions.

What if a project marks bugs verified just to get a payout?

Projects are free to mark bugs verified under their own policy guidelines. We do not get involved, except to publish those guidelines to investors alongside other policies. However, beware that any investor who has contributed any amount towards a bug will have their say on whether they believe the resolution was on the whole positive, neutral, or negative. Cumulative scores will be available to potential investors in case this may affect their investment decision.

Benjamin

Tue, 2006-Jan-03

The Efficient Software Initiative

The Efficient Software wiki has been launched. From the wiki:

This wiki is designed to concentrate efforts to create a new software industry. It is an industry that does not rely on delivering parallel services in order to fund software development or on the payment of license fees, but instead yields return as software is developed. It leverages the ability of a project's own bug database to bring customer and developer together. Efficient Software is intended to become a business that helps put money into the model. Its business model will be developed in the open and will be free for anyone to adopt. The product is more important than any one implementation of the business model. Cooperation is good, but the threat of competition is healthy too.

The fundamental goal of the Efficient Software initiative is to increase the money available to free software projects and free software developers. Contributors currently make donations to software projects, but it is unclear how much they should give and what their motivation is beyond "these are nice guys, I like them". Efficient Software is designed to create a market in which customers and projects can participate to meet customer needs and project needs.

Have at it!

Also, I would love to hear from anyone who is prepared to host a mailing list for the initiative.

Update: The wiki now has edit capability, and there is an irc channel. Join #efficientsoftware on irc.freenode.net.

Benjamin