Sound advice - blog

Tales from the homeworld

Sat, 2005-Dec-31

Paid Free Software

hAtom is currently stalled, awaiting a meeting of minds on the subject of css class names. I have been using the progress of hAtom and the related hAtom2Atom.xsl (Luke's repo, Robert's repo) to soak up excess energy over the first two weeks of my three-week holiday. With that on hold I have energy to burn on two further projects. One is my old attempt at building a home accounting package. The other is trying to reason out a business model for paid free software.

Free Software is Efficient Software

Free software is a movement where the users of a piece of software are also its contributors. Contributors are secure in knowing that the benefits their contribution provides for others will flow back to them in the form of contributions from others. With a reciprocal license such as the GPL a contributor knows their code won't be built upon by their competitors in ways that give those competitors an advantage over them.

If everyone contributes to the same baseline the community of contributors benefits. The interests of the majority of contributors are served the majority of the time. When the interests of the group diverge, forking appears. These forks are also produced efficiently, by the body of contributors who remain on each side of the forked divide. No one has to start from scratch. Everyone still has access to the original code base. Patches can still be transferred between the forks in areas where common need is still felt. If the reasons for divergence disappear, the contributor pool can even fold back together and form a common base once again.

Free software is not just a hippy red socialist cult. It is a free market of excess developer hours. Developers contribute, and gain benefits for themselves and for others. Projects are run efficiently along lines contributors define either explicitly or implicitly. The fact that so much free software has been developed and used based primarily on the contribution of individuals' spare time is a testament to how much can be achieved with so little input.

Contributors of Money

Contributions in the free software world are almost exclusively made in terms of working code, or theoretical groundings for working code. There are areas such as translation for internationalisation and web site design and maintenance that come into it as well... but this is primarily a market for excess skills that are not exhausted in the contributor's day job. I see free software's efficiency as a sign that it can and should displace closed software development models in a wide range of fields. One way in which this might be accelerated is if a greater excess of skills could be produced. Perhaps your favourite web server would get the feature you want it to have faster if you paid to give a software developer an extra few hours off their day job.

There are reasons to be wary of accepting money for free software, but I think we can work around them. Full-time paid staff on a free software project can drive gratis developers away, so we should pay our developers in proportion to the needs of money contributors that they meet. Boiled down to the essence of things, a money contributor should pledge to and pay the developer who fixes the bugs that contributor wants fixed. That's capitalism at work!

Running a Macro-money system (the business model)

In practice, the issue of resolving which individual should be paid what amount is more complex than matching a contributor to a developer. Perhaps the reviewer or the applier of the patch should get a slice. Perhaps the project itself should, or the verifier of the bug. I delved into some of the issues involved in my earlier article, "Open Source Capitalism". I think there is a business to be made here, but it isn't going to be done by making decisions at this level. This is a project decision that should be made by projects. The business model at our global level is simple:

  1. Set up an eBay-style website
  2. Allow free software projects to sign on to be part of it (sign some up before your launch!)
  3. Allow followers of the registered projects to identify bugs, and attach money to them
  4. Monitor the bug tracking systems of each project, and pay the project out when the bug is verified

Two sources of income are available: percentages and earned interest. It seems reasonable to take a small cut of any money transferred through the site to cover expenses. Requiring money to be paid up front by contributors has several advantages, including that the project doesn't achieve its goal only to find no reward waiting for it. The benefit to us, however, is the income stream that can be derived from interest earned on contributions that haven't yet been paid out. Depending on the average hold time of a contributed dollar this could account for a reasonable portion of costs, and if excessive might have to be paid back in part to projects. If the business model is workable at all, keeping costs low from the outset will stave off competition and build project and contributor loyalty.
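
To make the flow of money concrete, here is a minimal sketch of how pledges might be held against a bug and paid out, net of the site's cut, once the project's tracker reports the bug verified. The class name, the 5% fee and the payout rule are my own illustrative assumptions rather than part of any existing system.

# A minimal sketch of the pledge-and-payout flow described above.
# The class name, the 5% fee and the payout rule are illustrative
# assumptions, not a specification.

class BugPledges:
    def __init__(self, bug_id, fee_rate=0.05):
        self.bug_id = bug_id
        self.fee_rate = fee_rate   # the site's percentage cut
        self.pledges = []          # money held up front, earning interest

    def pledge(self, contributor, amount):
        self.pledges.append((contributor, amount))

    def payout_on_verification(self):
        """Called when the project's tracker marks the bug verified."""
        total = sum(amount for _, amount in self.pledges)
        fee = total * self.fee_rate
        return total - fee         # amount transferred to the project

bug = BugPledges("mozilla-12345")
bug.pledge("alice", 20.00)
bug.pledge("bob", 5.00)
print(bug.payout_on_verification())   # 23.75 after the 5% cut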

Projects define policy as to who (if anyone) money acquired in this way is distributed to. Projects also define policy on which bugs pledges can be placed upon (typically, only accepted bugs would be allowed), and when a bug is considered "verified". Policies must be lodged with the website and available for contributor perusal before any contributions are made. Contributors who feel their bugs have been resolved well can leave positive feedback. Contributors who feel their bugs have been resolved poorly or not resolved can leave negative feedback. These ratings are designed to ensure projects provide clear policy and stick to it; however, all contributions are non-refundable barring special circumstances.

I don't know the legal status of all of this, particularly the tax implications or export control implications of dealing with various countries around the globe. Expert advice would certainly be required and money invested up front. Community outrage at the suggestion would also be a bad thing. Discussions and negotiations should occur early, and the project probably can't proceed without GNOME, KDE, and Mozilla all signed up before you get into it. Another angle of attack would be to sign up SourceForge; however, they may see the system as a competitor of sorts to their PayPal donations revenue stream. If you were to sign them up I think payments to SourceForge projects would have to go via SourceForge's PayPal.

Conclusion

Consider this an open source idea. I would love to be involved in making something like this a reality, but I don't have the resources to do it myself. In fact, I don't necessarily bring any useful expertise or contacts to the table. Nevertheless, if you are in a position to make something like this happen I would like to hear about it. I might want to buy a share, if nothing else. Dear Lazyweb: Does anyone have a wiki space that could be used to develop this concept?

Benjamin

Wed, 2005-Dec-21

RESTful Blogging

Tim Bray and Ken MacLeod disagree (Tim's view, Ken's view) on how to define an HTTP interface for posting, updating, and deleting blog entries. The two approaches are to use GET and POST to do everything on one resource, or to use GET, PUT and DELETE on individual entry resources. They are both horribly wrong. The correct approach is this:

GET on feed resource: GETs the feed
PUT on feed resource: Replaces the entire feed with a new feed representation
DELETE on feed resource: Deletes the feed
POST on feed resource: Creates a new entry, and returns its URI in the Location header
GET on entry resource: GETs the entry
PUT on entry resource: Replaces the entry with a new entry representation
DELETE on entry resource: Deletes the entry
POST on entry resource: Creates a new comment, and returns its URI in the Location header
GET on comment resource: GETs the comment
PUT on comment resource: Replaces the comment with a new comment representation
DELETE on comment resource: Deletes the comment
POST on comment resource: Creates a new sub-comment, and returns its URI in the Location header

Do you see the pattern? Everything you operate on directly gets a URI. That thing can be replaced by PUTting its new value. It can be DELETEd using the obvious method. You can GET it. If it is a container for children it supports the POST of a new child. You don't confuse PUT and POST, and everyone is happy.
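
If that hierarchy still seems abstract, here is a minimal sketch of it as a single store keyed by URI path: GET, PUT and DELETE act on the named resource, and POST on any container mints a child path that would travel back to the client in the Location header. The store and the path scheme are illustrative assumptions only.

# A sketch of the uniform interface described above: a single store keyed
# by URI path, where GET/PUT/DELETE act on the named resource and POST on
# a container creates a child and returns its path (the Location header).

import itertools

store = {"/feed": "<feed/>"}          # path -> current representation
_ids = itertools.count(1)

def GET(path):
    return store[path]

def PUT(path, representation):
    store[path] = representation

def DELETE(path):
    # deleting a container also removes everything beneath it
    for p in [p for p in store if p == path or p.startswith(path + "/")]:
        del store[p]

def POST(path, representation):
    child = "%s/%d" % (path, next(_ids))
    store[child] = representation
    return child                      # would travel back as Location

entry = POST("/feed", "<entry>hello</entry>")       # new entry under the feed
comment = POST(entry, "<comment>hi!</comment>")     # new comment under the entry
PUT(entry, "<entry>hello, edited</entry>")
DELETE(comment)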

I don't know what Tim thinks is clear and obvious, and I don't know what Ken thinks is REST, but isn't this both?

In fairness to both parties their original blog entries both date back to 2003. The reason this has come across my desk is a "REST"-tagged del.icio.us link to this intertwingly.net poll.

Benjamin

Wed, 2005-Dec-21

Integrate, don't go Orange!

Tim Bray has adopted the orange feed link on his website, but isn't happy about it:

Despite that, it is a bad idea; a temporary measure at best. Based on recent experience with my Mom and another Mac newbie, this whole feed-reading thing ain’t gonna become mainstream until it’s really really integrated.

Emphasis Tim's. Tim suggests having a standard button somewhere on the browser for a one-click subscribe. I take a different tack. I think we should be going really really really integrated. After all, what is a feed reader, except a smart web browser?

When presented with a web page that is associated with atom data a real web browser could keep track of which entries you have read and mark them as such. Given a list of feeds to subscribe to, it could periodically cache them like a feed reader does so that you can read them without waiting. Perhaps you could even view portions of the sites using a standard font and layout engine like the feed readers do. There really isn't any reason why the two applications have to be separate. Feed readers today are used fairly differently to web browsers, but it needn't necessarily be that way forever. I can see more and more of the web being consumed in this feed-centric way of viewing, and keeping up with changes.

One of the technical challenges today is that an atom feed and a web site are in different formats. It is difficult to relate an atom entry to a section of a web page. Someday soon, however, hAtom of microformats.org fame will reach maturity. This is an evolution of the Atom specification that is embeddable within a web page. A browser will soon be able to identify blog entries within a page, and a feed reader will soon be able to take the same page your web browser sees as its input.

So let's not talk about integrated web browsing and feed reading, then in the same breath about the two being provided by separate applications. The time of separation will one day come to a close. The web is changing in a way that is becoming more semantic, and I for one see the web browser staying near the forefront of that change. What was not possible a few years ago when browser innovation had gone stale is now becoming a reality again: An interactive web. A read/write web. A web 2.0? I think we will experiment with new semantics in feed readers and other new ways of seeing the web, but that these will eventually be folded back into the one true web. The data formats will be folded back, and eventually so will the applications themselves.

Benjamin

Sun, 2005-Dec-11

The Semantic Spectrum

In software we work with a spectrum of semantics, from the general to the application specific. General semantics gives you a big audience and user base but can be lacking in detail. Application-specific semantics explain everything you need to know in a dialect only a few software components can speak. This is the divide between Internet technologies and Enterprise software.

Encapsulating the bad

In the beginning we worked with machine code. We abstracted up through assembly languages and into structured programming techniques. Structured programming is the use of loops and function calls to decompose a linear problem into a sequence of repeating steps. Structured programming ruled the roost until we found that when function 'A' and function 'B' both operate on the same data structures, allowing the two functions to be modified by different programmers tended to break our semantic data models.

Object-orientation added a new level of abstraction, but preserved the structured model for operations that occur within an object. It took the new approach of encapsulating and building on the structured programming layer below it rather than trying to create an entirely new abstraction. Object orientation allowed us to decompose software in new ways (that's the technique, rather than any particular language claiming to be O-O). We could describe the semantics of an object separately to its implementation, and could even share semantics between differing implementations. The world was peachy. That is, until we found that CORBA doesn't work on the Internet.

CORBA on the Internet

CORBA was an excellent attempt to extend the Object-Oriented model to the network. It was a binary format, and some claim that is the reason it failed to gain traction. Others blame its bogging down in standardisation committees. Two technologies exploded in use on the Internet instead. The use of XML documents to describe ad hoc semantics was a powerful groundswell; however, the real kicker was always the web server and web browser.

What was the problem? Why wasn't the Object-Oriented model working? Why weren't people browsing the web with a CORBA browser instead?

I think it is a question of semantics. Object-Orientation ties down semantics in interface classes and other tightly-defined constructs. This leads to problems both with evolvability and with applicability.

Evolvability

Tightly-defined interface classes support efficient programming models well, but this seems to have come at the cost of evolvability. Both HTTP and HTML have must-ignore semantics attached whenever software fails to understand headers, tags, or values. This means that new semantics can be introduced without breaking backwards compatibility, so long as you aren't relying on those semantics being understood by absolutely everyone. In terms of Object-Orientation this is like allowing an interface class to have new methods added and called without breaking binary compatibility. The use of XML gives developers a tool to help take this experience on board and apply it to their own software, but there is a bigger picture. XML has not been particularly successful on the Internet yet, either. To see success we must look at that web browser.

Applicability

A web browser turns our interface class inside out. Instead of communicating application semantics it is based on semantics with a much wider applicability: presentation. The web became successful because an evolvable uniform interface has been available to transport presentation semantics that are good enough to conduct trade and transfer information between a machine and a human.

Looking at the early web it might be reasonable to conclude that general semantics need to be in the form of presentation. This could be modelled as a baseclass with a single method: "renderToUser(Document d)". However, this early concept has started to evolve in curious ways. The semantic xhtml movement has started to hit its mark. The "strict" versions of html 4.01 and xhtml 1.0 shun any kind of presentation markup. Instead, they focus on the structure of a html document and leave presentation details to css. This has benefits for a range of users. Speech synthesizer software is likely to be less confused when it sees a semantic html document, improving accessibility. Devices with limited graphical capability may alter their rendering techniques. Search engines may also find the site easier to navigate and process.

We can see in the web that presentation semantics are widely applicable, and this contributes to the success of the web. To see widely applicable non-presentation semantics we have to move above the level of simple semantic xhtml into the world of microformats, or outside of the html world completely. We already see widely applicable semantics emerging out of formats like rss and atom. They move beyond pure presentation and into usefully specific and generally-applicable semantics. This allows for innovative uses such as podcasting.

Worlds apart

The semantics of html or atom and the semantics of your nearest Object-Oriented interface class are light years apart from each other, but I think if we can all learn each other's lessons we'll end up somewhere in the middle together. On one hand we have children who grew up in an Object-Oriented mindset. These folk start from a point of rigidly-defined application-specific semantics and try to make their interface classes widely applicable enough to be useful for new things. On the other side we have children who grew up in the mindset of the web. They start from a point of widely applicable and general tools and try to make their data formats semantically rich enough to be useful for new things. Those on our left created SOAP. Those on our right created microformats. Somewhere in the middle we have the old school RDF semantic web folk. These guys created a model of semantics and haven't really managed to take things any further. I think this is because they solve neither the application-specific semantics problems nor the generally-applicable presentation problems. Without a foothold in either camp they can act as independent umpire, but have yet to really make their own mark.

Conclusion

It looks like the dream of a semantic web is a long way off. It isn't because building a mathematical model of knowledge is unsolvable. Good inroads have been made by semwebbers. It's just that it isn't useful in and of itself, at least not today. The things that are useful are the two extremes of web browsers and of tightly-coupled object-oriented programming models. Both are proven, but neither defines a semantic web. The trouble is that the dual goals of having general semantics and useful semantics are usually at odds with each other. The places where these goals meet are not in ivory tower OWL models, but in real application domains. Without a problem to solve there can be no useful semantics. Without a problem that many people face there can be no general semantics. Over the next ten years building the semantic web will be a process of finding widely-applicable problems and solving them. It will require legwork more than the development of philosophy, and people will need to be in the loop for most problem domains. True machine learning and useful machine to machine interaction are still the domain of Artificial Intelligence research and won't come into being until we have convincingly solved the human problems first.

Benjamin

Sat, 2005-Nov-19

HTTP in Control Systems

HTTP may not be the first protocol that comes to mind when you think SCADA, or when you think of other kinds of control systems. Even the Internet Protocol is not a traditional SCADA component. SCADA traditionally works off good old serial or radio communications with field devices, and uses specialised protocols that keep bandwidth usage to an absolute minimum. SCADA has two sides, though, and I don't just mean the "Supervisory Control" and the "Data Acquisition" sides. A SCADA system is an information concentration system for operational control of your plant. Having already gotten your information into a concentrated form and place, it makes sense to feed summaries of that data into other systems. In the old parlance of the corporation I happen to work for this was called "Sensor to Boardroom".

One of my drivers in trying to understand some of the characteristics of the web as a distributed architecture has been in trying to expose the data of a SCADA system to other ad hoc systems that may need to utilise SCADA data. SCADA has also come a long way over the years, and now stands more for integration of operational data from various sources than simple plant control. It makes sense to me to think about whether the ways SCADA might expose its data to other systems may also work within a SCADA system composed of different parts. We're in the land of ethernet here, and fast processors. Using a more heavy-weight protocol such as HTTP shouldn't be a concern from the performance perspective, but what else might we have to consider?

Let's draw out a very simple model of a SCADA system. In it we have two server machines running redundantly, plus one client machine seeking information from the servers. This model is effectively replicated over and over for different services and extra clients. I'll quickly summarise some possible issues and work through them one by one:

  1. Timely arrival of data
  2. Deciding who to ask
  3. Quick failover between server machines
  4. Dealing with redundant networks

Timely Data

When I use the word timely, I mean that our client would not get data that is any fresher by polling rapidly. The simplest implementation of this requirement would be... well... to poll rapidly. However, this loads the network and all CPUs unnecessarily and should be avoided in order to maintain adequate system performance. Timely arrival of data in the SCADA world is all about subscription, either ad hoc or preconfigured. I have worked fairly extensively on the appropriate models for this. A client requests a subscription from a server. The subscription is periodically renewed and may eventually be deleted. While the subscription is active it delivers state updates to a client URL over some appropriate protocol. Easy. The complications start to appear in the next few points.
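
As a rough illustration, a client-side lease loop might look like the sketch below. The SUBSCRIBE and RENEW methods, the Lease-Time and Call-Back headers, and the host names are assumptions for the sake of the example rather than anything standardised.

# A sketch of the lease-renewal loop described above. The SUBSCRIBE and
# RENEW methods, the headers and the example hosts are illustrative only.

import http.client, time

SERVER = "scada-master.example.com"
RESOURCE = "/points/pump7/state"
CALLBACK = "http://client.example.com/updates"   # where state updates are delivered

def subscribe():
    conn = http.client.HTTPConnection(SERVER, timeout=5)
    conn.request("SUBSCRIBE", RESOURCE, headers={"Call-Back": CALLBACK,
                                                 "Lease-Time": "60"})
    response = conn.getresponse()
    return response.getheader("Location")        # URI of the subscription resource

def renew(subscription_uri):
    conn = http.client.HTTPConnection(SERVER, timeout=5)
    conn.request("RENEW", subscription_uri, headers={"Lease-Time": "60"})
    return conn.getresponse().status == 200

subscription = subscribe()
while True:
    time.sleep(20)                                # renew well inside the lease
    try:
        if not renew(subscription):
            subscription = subscribe()            # server forgot us: recreate
    except OSError:
        subscription = subscribe()                # connection failed: recreate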

Who is the Master?

Deciding who to ask for subscriptions and other services is not as simple as you might think. You could use DNS (or a DNS-like service) in one of two ways. You could use static records, or you could change your records as the availability of servers changes. Dynamic updates would work through some DNS updater application running on one or more machines. It would detect the failure of one host, and nominate the other as the IP address to connect to for your service. Doing it dynamically has the problem that you're working from pretty much a single point of view. What you as the dynamic DNS modifier see may not be the same as what all clients see. In addition you have the basic problem of static DNS: Where do you host it? In SCADA everything has to be redundant and robust against failure. No downtime is acceptable. The static approach also pushes the failure detection problem to clients, which may be a problem they aren't capable of solving due to their inherent "dumb" generic functionality.

Rather than solving the problem at the application level you could rely on IP-level failover, however this works best when machines are situated on the same subnet. It becomes more complex to design when main and backup servers are situated in separate control centres for disaster recovery.

Whichever way you turn there are issues. My current direction is to use static DNS (or equivalent) that specifies all IP addresses that are or may be relevant for the name. Each server should forward requests on to the master if it is not currently master, meaning that it doesn't matter which one is chosen when both servers are up (apart from a slight additional lag should the wrong server be chosen). Clients should connect to all IP addresses simultaneously if they want to get their request through quickly when one or more servers are down. They should submit their request to the first connected IP, and be prepared to retry on failure to get their message through. TCP/IP has timeouts tuned for operating over the Internet, but these kinds of interactions between clients and servers in the same network are typically much faster. It may be important to ping hosts you have connections to in order to ensure they are still responsive.
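
The "connect to everything and take the first answer" part of that can be sketched as below. The error handling is minimal and the host name is an assumption; a production version would also want retries and more careful cleanup.

# A sketch of "connect to every address at once and use whichever answers
# first". Error handling is deliberately thin; the host name is illustrative.

import selectors, socket

def connect_first(host, port, timeout=2.0):
    sel = selectors.DefaultSelector()
    pending = []
    for family, socktype, proto, _, addr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        s = socket.socket(family, socktype, proto)
        s.setblocking(False)
        s.connect_ex(addr)                       # returns immediately
        sel.register(s, selectors.EVENT_WRITE, addr)
        pending.append(s)
    for key, _ in sel.select(timeout):           # first socket to become writable
        sock, addr = key.fileobj, key.data
        if sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
            for other in pending:
                if other is not sock:
                    other.close()
            return sock, addr
    raise OSError("no server address answered in time")

sock, addr = connect_first("scada-master.example.com", 80)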

It would be nice if TCP/IP timeouts could be tuned more finely. Most operating systems allow tuning of the entire system's connections. Few support tuning on a per-connection basis. If I know the connection I'm making is going to a host that is very close to me in terms of network topology it may be better to declare failures earlier using the standard TCP/IP mechanisms rather than supplementing with ICMP. Also, the ICMP method of supplementing TCP/IP in this way relies on not using IP-level failover techniques between the servers.
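
Where per-connection tuning is available it tends to be limited to keepalive probing. On Linux, for instance, a socket's keepalive behaviour can be tightened enough to notice a dead peer within seconds. The numbers below are illustrative only, and this is not the full retransmission-timeout control I would really like.

# A sketch of tuning failure detection per connection rather than
# system-wide. These socket options are Linux-specific and the numbers
# are illustrative assumptions.

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
# start probing after 5 idle seconds, probe every 2 seconds, give up after 3 misses
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 5)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 2)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)
s.connect(("scada-master.example.com", 80))      # hypothetical nearby server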

Client Failover

Quick failover follows on from discovering who to talk to. The same kinds of failure detection mechanisms are required. Fundamentally clients must be able to quickly detect any premature loss of their subscription resource and recreate it. This is made more complicated by the different server side implementations that may make subscription loss more or less likely, and thus change the corrective actions that clients need to take. If a subscription is lost when a single server host fails, it is important that clients check their subscriptions often and also monitor the state of the host that is maintaining their subscription resource. If the host goes down then the subscription must be reestablished as soon as this is discovered. As such the subscription must be periodically tested for existence, preferably through a RENEW request. Regular RENEW requests over an ICMP-supported TCP/IP connection as described above should be sufficient for even a slowly-responding server application to adequately inform clients that their subscriptions remain active and they should not reattempt creation.

Redundant Networks

SCADA systems typically utilise redundant networks as well as redundant servers. Not only can clients access the servers over two different physical media, the servers can do the same to clients. Like server failover, this could be dealt with at the IP level... however your IP stack would need to work in a very well-defined way with respect to the packets you send. I would suggest that each packet be sent over both networks, with duplicates discarded on the receiving end. This would very neatly deal with temporary outages in either network without any delays or network hiccups. Ultimately the whole system must be able to run over a single network, so trying to load balance while both are up may be hiding inherent problems in the network topology. Using them both should provide the best network architecture overall.

Unfortunately, I'm not aware of any network stacks that do what I would like. Hey, if you happen to know how to set it up feel free to drop me a line. In the meantime this is usually dealt with at the application level with two IP addresses per machine. I tell you what: this complicates matters more than you'd think. You end up needing a DNS name for the whole server pair with four IP addresses. You then need an additional DNS name for each of the servers, each with two IP addresses. When you subscribe to a resource you specify the whole server pair DNS name on connection, but the subscription resource may only exist on one service. It would be returned with only that service's DNS name, but that's still two IP addresses to deal with and ping. All the way through your code you have to deal with this multiple address problem. In the end it doesn't cause a huge theoretical problem to deal with this at the application level, but it does make development and testing a pain in the arse all around.
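
For what it's worth, here is roughly what I mean by sending on both networks and discarding duplicates at the receiver, done at the application level over UDP. The addresses and the sequence-number framing are illustrative assumptions, and a real implementation would need to pin routes (or use something like SO_BINDTODEVICE on Linux) and age old sequence numbers out of the duplicate filter.

# A sketch of "send on both networks, discard duplicates on receipt".
# Addresses and framing are made up for illustration.

import socket, struct

DESTS = [("10.0.1.20", 5000), ("10.0.2.20", 5000)]   # same host, both networks
LOCALS = ["10.0.1.10", "10.0.2.10"]                   # our address on each network

def make_senders():
    senders = []
    for local in LOCALS:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((local, 0))                # nominate a source address on each network
        senders.append(s)
    return senders

def send(senders, seq, payload):
    frame = struct.pack("!I", seq) + payload
    for s, dest in zip(senders, DESTS):
        s.sendto(frame, dest)             # duplicate deliberately

seen = set()                              # receiver side: drop the second copy
def receive(sock):
    while True:
        frame, _ = sock.recvfrom(65535)
        seq, payload = struct.unpack("!I", frame[:4])[0], frame[4:]
        if seq not in seen:               # a real receiver would age entries out
            seen.add(seq)
            yield payload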

Conclusion

Because this is all SIL2 software you end up having to write most of it yourself. I've been developing HTTP client and server software in spurts over the last six months or so, but concertedly over the last few weeks. The beauty is that once you have the bits that need to be SIL2 in place you can access them with off-the-shelf implementations of both interfaces. Mozilla and curl both get a big workout on my desktop. I expect Apache, maybe Tomcat or WebSphere, will start getting a workout soon. Rearchitecting around existing web standards should make it easier for me to produce non-SIL2 implementations of the same basic principles. Parts of the SCADA system that are not safety-related could be built out of commodity components while the ones that are can still work through carefully-crafted proprietary implementations. It's also possible that off-the-shelf implementations will eventually become so accepted in the industry that they can be used where safety is an issue. We may one day think of Apache like we do the operating systems we use. They provide a commodity service that we understand and have validated very well in our own industry and environment, helping us to write only the software that really adds value to our customers.

On that note, we do have a few jobs going at Westinghouse Rail Systems Australia's Brisbane office to support a few projects that are coming up. Hmm... I don't seem to be able to find them on Seek. Email me if you're interested and I'll pass them on to my manager. You'd be best to use my ben.carlyle at invensys.com address for this purpose.

Benjamin

Sun, 2005-Nov-13

The Makings of a Good HTTP API

I've had the opportunity over the last few weeks to develop my ideas about how to build APIs for interfacing over HTTP. Coming from the REST world view I don't see a WSDL- or IDL-derived header file or baseclass definition as a fundamentally useful level of abstraction. I want to get at the contents of my HTTP messages in the way the HTTP protocol demands, but I may also want to do some interfacing to other protocols.

The first component of a good internet-facing API is decent URI parsing. Most URI parsing APIs of today use the old rfc2396 model of a URI. This model was complex, allowing only a very basic level of URI parsing without knowledge of the URI scheme. For example, a http URI reference such as http://example.com:8080/some/path?query#fragment could be broken into

scheme: http
authority: example.com:8080
path: /some/path
query: query
fragment: fragment

while an unknown URI could only be deconstructed into "scheme" and "scheme-specific-part". A URI parser that understood HTTP and another that did not would produce different results!

January 2005's rfc3986 maps out a solution to URI parsing that doesn't depend on whether you understand the URI scheme or not. All URIs must now conform to the generic syntax of (scheme, authority, path, query, fragment), but all elements of the URI except the path are strictly optional. This is great for API authors who want to provide a consistent interface, however most APIs for URI handling were developed before 2005 and feel clunky in light of the newer definitions. A good API is necessarily a post January 2005 API.
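
As an illustration, the regular expression in Appendix B of rfc3986 performs exactly this scheme-independent decomposition. A minimal wrapper around it might look like the following; the sample URIs are just for demonstration.

# The generic parse from rfc3986 Appendix B, applied without any knowledge
# of the scheme. Every URI reference decomposes into the same five
# components regardless of whether the parser has heard of "http".

import re

URI_RE = re.compile(r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')

def split_uri(uri):
    m = URI_RE.match(uri)
    return {"scheme": m.group(2), "authority": m.group(4), "path": m.group(5),
            "query": m.group(7), "fragment": m.group(9)}

print(split_uri("http://example.com:8080/some/path?query#fragment"))
# {'scheme': 'http', 'authority': 'example.com:8080', 'path': '/some/path',
#  'query': 'query', 'fragment': 'fragment'}
print(split_uri("mailto:someone@example.com"))
# {'scheme': 'mailto', 'authority': None, 'path': 'someone@example.com',
#  'query': None, 'fragment': None}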

Once you have your URI handling API in place, the next thing to consider is how your client and server APIs work. Java makes a cardinal error on both sides of this equation by defining a set of HTTP verbs it knows how to use, and effectively prohibiting the transport of other verbs. In fact, the set of HTTP verbs has changed over time and may continue to change. Extensions like WEBDAV and those required to support subscription are important considerations in designing a general purpose interface of this kind. rfc2616 is clear that extension methods are part of the HTTP protocol, and that there is a natural expectation that methods defined outside the core standard will be seen in the wild. A client API should behave like a proxy that passes through requests it does not understand. It should invalidate any cache entries it may have associated with the named resource, but otherwise trust that the client code knows what it is doing.
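
Python's standard client gets this right: it will transport any method string you hand it. The SUBSCRIBE verb and the hosts below are hypothetical, but the pass-through behaviour is real.

# A sketch of a client API that doesn't second-guess the verb.
# SUBSCRIBE is a hypothetical extension method, not a standard one.

import http.client

conn = http.client.HTTPConnection("feeds.example.com")
conn.request("SUBSCRIBE", "/some/resource",
             headers={"Call-Back": "http://client.example.com/updates"})
response = conn.getresponse()
print(response.status, response.reason)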

On the server side the option to handle requests that your API never dreamed of is just as important. Java embeds the operations "GET", "HEAD", "OPTIONS", "POST", "PUT", "DELETE", and "TRACE" into its HttpServlet class, but this is a mistake. If anything this is a REST resource, rather than a simple HTTP resource. The problem is that your view of REST and mine may differ. REST only says that a standard set of methods be used. It doesn't say what those methods are. GET, HEAD, OPTIONS, POST, PUT, DELETE, and TRACE have emerged from many years of standardisation activity and from use in the wild... however other methods have been tried along the way and more will be tried in the future. HttpServlet should be what it says it is and let me get at any method tried on me. I should be able to define my own "RestServlet" class with my own concept of the set of standard verbs if I like. Using this Java interface I have to override the service method and do an ugly call up to the parent class to finish the job if one of my own methods isn't hit. Python (and various other languages, such as Smalltalk) actually allow the neatest solution to this problem: Just call the method and get an exception thrown if one doesn't exist. No need to override anything but the methods you understand.
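
Here is a rough Python sketch of that style of dispatch. The RestServlet name comes from the paragraph above; everything else, including the 501 response for verbs the resource doesn't implement, is my own assumption about how such a class might behave.

# Dispatch on the verb by name and only answer 501 when the resource
# genuinely has no handler for it.

class RestServlet:
    def service(self, method, request):
        handler = getattr(self, method, None)
        if handler is None:
            return 501, "Not Implemented"        # unknown verbs fail cleanly
        return handler(request)

class EntryResource(RestServlet):
    def GET(self, request):
        return 200, "<entry>hello</entry>"

    def RENEW(self, request):                    # an extension verb, no override games
        return 200, "lease renewed"

print(EntryResource().service("GET", None))         # (200, '<entry>hello</entry>')
print(EntryResource().service("FROBNICATE", None))  # (501, 'Not Implemented')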

Another thing I've found useful is to separate the set of end-to-end headers from those that are hop-by-hop. When developing a HTTP proxy it is important that some headers be stripped from any request before passing it on. I've found that putting those headers into a separate map from the end-to-end headers makes life simpler, and since these headers usually carry a level of detail that regular clients don't need to be involved with they can be handed into the request formatting and transmission process separately. That way API-added headers and client-added headers don't have to be combined.
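
A sketch of that split, using the hop-by-hop list from rfc2616 section 13.5.1 plus anything named in the Connection header:

# Split hop-by-hop headers from end-to-end headers, as a proxy must.

HOP_BY_HOP = {"connection", "keep-alive", "proxy-authenticate",
              "proxy-authorization", "te", "trailers",
              "transfer-encoding", "upgrade"}

def split_headers(headers):
    """headers: dict of name -> value. Returns (end_to_end, hop_by_hop)."""
    connection_tokens = {t.strip().lower()
                         for t in headers.get("Connection", "").split(",") if t.strip()}
    hop, end = {}, {}
    for name, value in headers.items():
        if name.lower() in HOP_BY_HOP or name.lower() in connection_tokens:
            hop[name] = value
        else:
            end[name] = value
    return end, hop

end, hop = split_headers({"Host": "example.com", "Connection": "close, X-Tracking",
                          "Keep-Alive": "300", "X-Tracking": "abc", "Accept": "*/*"})
print(sorted(end))   # ['Accept', 'Host']
print(sorted(hop))   # ['Connection', 'Keep-Alive', 'X-Tracking']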

I guess this brings me to my final few criticisms of the J2SE and J2EE HTTP APIs. I think it's worthwhile having a shared concept of what a HTTP message looks like between client and server. Currently the servlet model requires HttpServletRequest and HttpServletResponse objects; however, the client API has a HttpURLConnection class that has no relationship to either object. Also, the HttpURLConnection class itself looks nothing like a servlet. If we had started from a RESTful perspective, I would suggest that the definition of a servlet (a resource) and the definitions of the messages that pass between resources would be the first items on the list. It would certainly make writing HTTP proxies in Java easier, and should be more consistent overall. In fact there is very little difference between HTTP request and response messages, so they could share a common baseclass. There is very little difference between HTTP and SMTP messages, once you boil away the hop-by-hop headers. There are even some good synergies with FTP, and any other protocol that uses a URI for location. Transferring data between these different protocols shouldn't be difficult with a basic model of resources in place internal to your program.

I think that ultimately the most successful APIs will attempt to model the web and the internet within your program rather than simply provide onramps for access to different protocols. The web does not have a tightly-controlled data model, even at the protocol level. It's important to keep things light and easy rather than tying them down in an overly strict and strongly-typed Object-Oriented way. The web isn't like that, and to some extent I believe that our programming styles should be shifting away also. There's always going to be a need to express something that two objects in an overall system will understand completely, but that the objects in between, which have to handle requests and responses, will only ever have a sketchy picture of.

Benjamin

Sun, 2005-Nov-06

Microformats

I have been reading about microformats in various blogs for a while, but only recently decided to go and see what they actually were. I'm a believer. Here is an example from the hCalendar microformat:

Web 2.0 Conference: October 5-7, at the Argent Hotel, San Francisco, CA

It's just a snippet of xhtml, but it has embedded machine-readable markup, as follows:

<span class="vevent">
 <a class="url" href="https://www.web2con.com/">
  <span class="summary">Web 2.0 Conference</span>: 
  <abbr class="dtstart" title="2005-10-05">October 5</abbr>-
  <abbr class="dtend" title="2005-10-08">7</abbr>,
 at the <span class="location">Argent Hotel, San Francisco, CA</span>
 </a>
</span>

The same information could have been encoded in a separate calendar file or into hidden metadata in the xhtml, however the microformat approach allows the data to be written once in a visually verifiable way rather than repeating it in several different places. Using this method the human and the machine are looking at the same input and processing it in different ways.
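
To show what the machine's half of that bargain might look like, here is a small sketch that pulls the hCalendar fields back out of the snippet above using Python's standard HTML parser. It assumes the markup has been saved to a local file and that there is only one event; it is nowhere near a complete microformat parser.

# A simplification, not a full hCalendar parser: one event, no nesting games.

from html.parser import HTMLParser

class HCalendarParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.event = {}
        self._collecting = None      # class name whose text we are gathering

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class", "")
        if cls == "url":
            self.event["url"] = attrs.get("href")
        elif cls in ("dtstart", "dtend"):
            self.event[cls] = attrs.get("title")   # machine form lives in title=
        elif cls in ("summary", "location"):
            self._collecting = cls
            self.event[cls] = ""

    def handle_data(self, data):
        if self._collecting:
            self.event[self._collecting] += data

    def handle_endtag(self, tag):
        self._collecting = None

snippet = open("hcalendar-snippet.html").read()    # the markup shown above, saved to a file
parser = HCalendarParser()
parser.feed(snippet)
print(parser.event)
# e.g. {'url': 'https://www.web2con.com/', 'summary': 'Web 2.0 Conference',
#       'dtstart': '2005-10-05', 'dtend': '2005-10-08',
#       'location': 'Argent Hotel, San Francisco, CA'}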

Here is my quick summary of how to use a microformat in your html document, summarised from the hCalendar design principles:

  1. Use standard xhtml markup, just as you would if you weren't applying a microformat
  2. Add <span> or <div> tags for data that isn't naturally held within appearance-affecting markup
  3. Use class attributes on the relevant xhtml nodes for your data
  4. Where the machine readable data really can't be displayed as is, use <abbr> tags and insert the machine-readable form in the title attribute

Ian Davis has been working on a microformat for rdf. This neatly allows the microformat approach to be applied to foaf data and other rdf models. To demonstrate how cool this is I've embedded some foaf and dublin core metadata into my blog main page. You can access this data directly with an appropriate parser, or take advantage of Ian's online extractor to read the metadata in a more traditional rdf-in-xml encoding.

Benjamin

Tue, 2005-Nov-01

Open Source Capitalism, or "How to run your project as a business"

I wrote recently about the AJ Market, which allows people or organisations with a few dollars to spare to influence the focus of AJ's contribution to open source software development. If I contradict myself this time around please forgive me. I was running a little low on the sleep tank during my first attempt at a brain dump on the subject. This time around I'll try to stick to a few fundamentals.

Who writes the source?

If an open source software developer is to make money writing open source, he or she must be paid up front rather than making up the up-front costs in license fees. There are different motivations for funding open source. The contributor may be able to make direct use of the software produced. They may feel they can make money out of complementary products such as services. They may be trying to curry favour with individuals in the software's target audience, leading to a return of good faith. The contributor may or may not be the same person as the developer. Traditionally the two have consistently been the same person. Developers wrote software to "scratch an itch" that affected them personally. This is a great model of software development where the software's users are the people most actively involved in driving the product as a whole. I see the possibility of opening up this market to also include people who can't scratch their itch directly but have the money to pay someone to do it.

The choice of license

Firstly, I think it is important to have a license that promotes trust in the community of users. My experience is that the GPL does this effectively in many cases by guaranteeing contributions are not unfairly leveraged by non-contributors. Eric Raymond chants that the GPL is no longer necessary because business sees open source as the most productive way forward and that businesses who fail to see this will fail to prosper. I disagree, on the basis that through all of nature there are always some cheats. What works on the grand economic scale doesn't always make immediate business sense. The search for short term gain can wipe out trust and cooperation too quickly to give up on the protections that the GPL provides. When the global community of nation states no longer needs treaties to prevent or limit the use of tariffs I'll start to look again at whether the GPL is required.

Voting with your wallets

My view of an economically vibrant project starts with Bugzilla. I intuitively like the concept of a bounty system in free software and think it ties in nicely with eXtreme Programming (XP) concepts. When you provide bounties for new work you should increase the supply of developers willing to do the work associated with the bug. When you allow the contribution of bounties to be performed by the broader user base you may align the supply of software development hours to the needs of the customer base. Bugzilla already has a concept of voting where registered users indicate bugs they really care about by incrementing a counter. If that counter were powered by dollars rather than clicks the figure may be both a more accurate statement of the desirability of a fix and a more motivating incentive for developers to contribute.

The tie in to me with XP is in breaking down the barrier between developer and customer. As I mentioned earlier, they are often already the same people in open source. An open flow of ideas of what work is most important and what the work's success criteria are is proving important in software development generally. In XP a customer representative is available at all times for developers to speak to. In open source, bugzilla is an open channel for discussion outside regular mailing lists. Adding money to the process may be a natural evolution of the bug database's role.

A question of scale

As I mentioned in my earlier article, the biggest problem in setting up a marketplace is ensuring that there is enough money floating around to provide meaningful signals. If a software developer can't make a living wage working on a reasonable number of projects this form of monetary open source is never going to really work. Open source also has a way of reducing duplication that might otherwise occur in proprietary software, so there is potentially far less work out there for software developers in a completely free software market. Whether it could really work or not is still a hypothetical question, and starting work now on systems that support the model may still be exciting enough to produce reasonable information. Even if this model could only pay for one day a week of full-time software development it would be a massive achievement.

Governance and Process

If your project is a business you can pretty much run it any way you choose, but the data you collect and transmit through a closed system will be of lower quality than that of a free market. To run your project as a market I think you need a significant degree of transparency in your process. I suggest having open books in your accounting system. When someone contributes to the resolution of a bug it should be recorded as money transferred from a bug-specific liability account to your cash at bank account. When the conditions for payout are met money should be transferred from a developer-specific payable account back to the bug. Finally payment should be made from the cash at bank account to the payable account to close the loop. If the project has costs or is intended to run at a profit some percentage of the original transfer should be transferred from an appropriate income account rather than being transferred wholly from the liability. This approach clearly tracks the amount invested by users in each bug and transparently indicates the amounts being paid to each developer. I suggest that while contributions may be anonymous, developer payments should be clearly auditable.
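
As a worked example under those rules, with made-up names, a $50 contribution and the project's percentage ignored for simplicity, the three movements look like this. Note that every account returns to zero once the loop closes.

# A sketch of the open ledger described above, recording each step as a
# simple transfer between named accounts. Names and amounts are made up.

from collections import defaultdict

balances = defaultdict(float)

def transfer(amount, from_account, to_account):
    balances[from_account] -= amount
    balances[to_account] += amount

# 1. A user contributes $50 towards bug 1234.
transfer(50.0, "liability:bug-1234", "cash-at-bank")
# 2. The payout conditions are met: the amount owed moves to the developer.
transfer(50.0, "payable:developer-jane", "liability:bug-1234")
# 3. The developer is actually paid, closing the loop.
transfer(50.0, "cash-at-bank", "payable:developer-jane")

print(dict(balances))   # every account nets back to zero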

Estimates and Bug Lifecycle

I'm going back on my previous suggestion to provide work estimates on each bug. I'm now suggesting that the amount of interest in supply be the main feedback mechanism for users who want to know how much a bug resolution is worth. More factors than money alone contribute to the amount of supply available for fixing a bug. There is the complexity of the code that needs to be worked with to consider. There is the fun factor, as well as the necessity to interface to other groups and spend additional time. I would also suggest that different phases in a bug's lifecycle may be worth contributing to explicitly. If a bug is in the NEW state then money may be made available to investigate and specify the success criteria of later stages. Contributions may be separate for the development phase and a separate review phase. Alternatively, fixed percentages may be supplied to allow different people to become involved during different stages.

Bug Assignment Stability

Stability of bug assignments is important as soon as money comes into the equation. There's no point in having all developers working individually towards the bug with the biggest payoff, only to find the bug is already fixed and paid out when they go to commit. Likewise, showing favouritism in assigning high value bugs to the same people every time could be deadly to project morale. I would take another leaf out of the eXtreme Programming book and suggest that leases be placed on bug assignments. The developer willing to put in the shortest lease time should win a lease bidding war. Once the lease is won the bug is left with them for that period. If they take longer then reassignment may occur.

Benjamin

Sun, 2005-Oct-30

Was Windows at fault in the customs computer crash?

Disclaimer: I do not speak for my employer, and their views may differ from my own.

Well, obviously no it wasn't. Pascal Klien worries about the inability to sue Microsoft over the failure of Botany Bay customs computers here in Australia. As a software engineer who works in industries where the kind of money he mentions is thrown around I can tell you that the operating system is usually not the hardest working component of the system. We still do a fair bit for our money. I can tell you that whether we were delivering on Windows, Linux, or Solaris the hardest working pieces of the operational system would be hardware and our software rather than any commodity operating system. The OS provides a few essential commodity services, but it doesn't have any great influence over system performance in even moderately complicated systems. It doesn't greatly affect scalability. Most scaling and performance capability comes down to how you structure a cluster of several machines rather than how any one piece behaves.

Not knowing all the finer details of this case I find the chance that the operating system played any direct role in these problems remote. I also find the suggestion that the inability to sue Microsoft for failures is the real problem a little absurd coming from a free software advocate. Should Linus allow himself to be sued when Linux is used in mission-critical applications? Surely not. It is the responsibility of the contractor who sold Customs their new computer system to make sure there are no problems, and you can be sure they are both suable and that they have sufficient insurance or collateral to pay out any claim. That's assuming that Customs gave them accurate and appropriate specs. If not, then Customs is to blame here and the contractor should be being loaded up with money sufficient to come in and save the day.

If a contractor chooses to reduce their costs by using commodity software such as operating systems in mission-critical environments they must also bear responsibility for the adequacy of that software in their own operational profile. Writing their own operating system is almost certainly counter-productive for the kind of system being examined here, and as far as commodity operating systems go Windows has a long and sufficiently good track record in many industries.

Benjamin

Sun, 2005-Oct-23

Selling developer hours (generalising the AJ market)

Anthony Towns updates us on the status of the AJ market. He's a free software developer who is heavily involved in the Debian project.

Putting money in the market

I believe Anthony is hoping to do a number of things with this market, and there are several benefits he might be able to achieve by it.

Putting money and an economy into the system should be a good means of achieving them. You want a new feature but can't do it yourself? Put your money where your mouth is and pay the man who can. A market that uses a low-inflation commodity to send signals between buyer and seller should provide accurate information about what most needs achieving. In the free software world the commodities of money and developer time are both low-inflation, so both probably have a role to play.

Possible problems with AJ's market

Such an economy does need to be of sufficient scale to keep noise low. At the dollar figures AJ has reeled in so far it is an even bet whether the information he is receiving about customer demands is representative of his userbase or not. He may also not be sending effective signals back to buyers about what is possible given the time available. His market information bulletin doesn't indicate how much he expects to be paid for any of the items in his todo list. He doesn't quote hours estimates and the bids are not for hours but for features of possibly irregular size. Without a mechanism whereby AJ is able to say "I'm not willing to do this work for less than..." the market is only a one-way information system. These problems may be shaken out with time or prove irrelevant, as obviously this market is very new indeed.

I wish, I wish

I've occasionally dreamed about a market like this myself. A nobody in the free software movement like myself would probably not get very far marketing in this way, at least initially. That has put me off investing any significant time in a model. I've tended to dislike the service-oriented Red Hat or MySQL business models. I think these approaches encourage paid developers to think in terms of the services they are going to provide rather than in terms of the quality of the software. This could actively work against software quality by encouraging turnkey "just works" features to be excluded as reducing the value of services that might otherwise be provided. It may be better to use business models that directly encourage end user solutions that are of the highest quality necessary to do the job.

Problems with the proprietary approach

Proprietary software development is based around the production of a feature set that can be sold over and over again. In the case of companies like Microsoft this can lead to huge profit margins, which equate to inefficiency in the economy as a whole. If features have already been developed then making businesses pay for them over and over again is counter-productive. The amount that the business can spend on its own needs is reduced by the paying of licence fees while the software company itself is not solving new problems for the business. An efficient model would see the software company make a fair profit on the costs to produce features for other businesses while retaining the incentive to produce new features and further improve the productivity of their customers. Software companies that aren't producing more and better efficiency improvements for their customers should wither and die.

Problems with the free approach

In any free model you either need to get the money to produce features up front or you need to make the money back on services. As I mentioned earlier I'm not a fan of the services avenue, although I admit this can be a handy revenue stream. AJ's market attempts to make the money up front because once the work is done and is freely distributed there isn't going to be any more money coming in for that feature or efficiency improvement. Proprietary companies can afford to take a loss during development and make the money up later in sales. Free developers must be paid in advance.

As a free developer you really have two choices:

  1. You team up with an entity with deep pockets that wants your software
  2. You provide a means for collective entities with deep pockets to form

We see the first choice at work in services companies. They are willing to pay developers so that they can earn dollars providing services and addons to the software developed. The problem with dealing with an entity of this type is that greed will eventually change your relationship. The company will want to see something from the development that its competitors don't have. This desire may conflict with a free software sensibility. It may only be "we want to have this for six months before everyone else". That's how copyright and patents work, of course. From experience with those two vehicles we know that companies will inevitably want longer and longer periods attached. It may come in the form of "we want this proprietary addon". The only counter to these forms of greed would be outrage in the customer community, who really don't want the lock-in.

The second approach is that of AJ's market. Try to attract ad hoc collectives who want a feature to supply small contributions to the goal. To get as many people together as a proprietary company does with its license fee model may be a difficult feat, but you know that you're working towards real customer needs when a large number are willing to supply even small amounts.

The role of eXtreme Programming

Earlier in this article I criticised AJ for not supplying estimates for work and thus not providing adequate feedback to buyers in his economy. I've actually always seen this kind of market through the XP model and viewed it alongside the bounty system sometimes seen in free software. The basic framework would consist of the following stages:

  1. Define a set of concrete, testable goals
  2. Allow bidding for a concrete non-inflationary commodity towards those goals
  3. Work for the highest bidder according to the amount of commodity bought
  4. Allow goals to be changed and rebid

This approach would accommodate two-way feedback between buyer and seller, including allowance for goal correction should the costs begin to outweigh the benefits of a particular unit of work. I propose the following implementations of each stage:

  1. Definition of testable goals
    Buyers should be able to submit goals to the seller. The seller should make appropriate adjustments to keep scope within a basic regular time period, such as splitting up the work into smaller units. Sellers should ignore or negotiate changes to goals that fall outside the basic project charter. Sellers may also propose and lobby for their own changes and may submit free hours towards goals they are passionate about.
  2. Bidding

    I think AJ has the basic bidding structure right. A bid is an actual payment of dollars rather than a promise of dollars. An alternative model would be to do an eBay-style bidding and invoicing separation where only the work with the highest return would be invoiced. Due to the nature of this work, where weeks may pass between the first pledge of money for a particular bug and the actioning of work associated with the bug, an invoicing system is probably too complex, with the additional complication of having to price in non-payment.

    I think that bidding should be in the form of dollars for hours. Both are non-inflatable commodities and a fair average price should be able to be determined over the longer term. It should be possible to compare the hourly rate over a long period of time for trends and it should be possible for buyers to see what other people think an hour of your time is worth. Buyers may also submit code they hope will reduce or eliminate the effort required to fulfil their chosen goals.

    In my view, the best way to achieve a dollars for hours bidding model is to associate an estimate with each bug. Buyers make payments indicating their preferred bug. The payment total is divided by the estimate to produce an hourly rate. The seller works on the bug with the highest available hourly rate. (A worked example appears after this list.)

    The problem with bidding dollars for hours is how to handle inaccurate estimation. You could spend the estimated hours, pocket the money, and update the estimate for the bug. That might strain relations with your userbase and probably requires refunds for bugs that take less than the allotted time to resolve. Another approach would be for the seller to wear the risk of inaccurate estimates by always completing the work regardless of budget overrun. This would encourage the seller to underpromise, which is usually a good thing for a seller to do.

  3. Working
    There should be clear targets for time worked and the tests for completion should be defined by buyers as part of the goal definition. These should be prioritised so that the most important tests are met early. Bidders should be able to be tracked and contacted during the purchased time for clarification to ensure that the most applicable work is produced.
  4. Rebidding
    Goal changes should be made as necessary in consultation with the winning bidders for your time. Once the won hours have been expended, buyers may choose to bid for more hours towards their goal or may abandon it as uneconomic.
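
As a rough illustration of the dollars-for-hours bidding described above, here is a minimal sketch of how a seller might pick the next bug to work on. The class and field names are my own invention rather than part of any existing system; the logic simply divides the money pledged against each bug by its estimate and takes the best rate.

    from dataclasses import dataclass, field

    @dataclass
    class Bug:
        """A goal with a seller-provided estimate and buyer payments."""
        title: str
        estimated_hours: float
        payments: list = field(default_factory=list)  # dollars pledged so far

        @property
        def hourly_rate(self) -> float:
            # Total money paid divided by the estimate gives the implied rate.
            return sum(self.payments) / self.estimated_hours

    def next_bug_to_work_on(bugs):
        """The seller works on whichever funded bug offers the best rate."""
        funded = [b for b in bugs if b.payments]
        return max(funded, key=lambda b: b.hourly_rate, default=None)

    if __name__ == "__main__":
        backlog = [
            Bug("Fix crash on startup", estimated_hours=4, payments=[50, 30]),
            Bug("Add CSV export", estimated_hours=10, payments=[120]),
        ]
        chosen = next_bug_to_work_on(backlog)
        print(chosen.title, chosen.hourly_rate)  # -> Fix crash on startup 20.0

Whatever refund or re-estimation policy the seller adopts for inaccurate estimates would sit on top of a selection rule like this one.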

Conclusion

Maybe someone can make use of these ideas. Maybe one day I'll be able to make use of them myself :) In the end software freedom should act as a contract to ensure successful bidders don't get a leg up on each other and that all market forces are openly played out. I hope that great free software may come out of a collective monetary process in the future. At present we have a system based on bidders paying with their own hours, and that's great... however it would be nice to see people with money be able to play the game as well, and most importantly it would be nice to see free software being created professionally without the need for paid services to fund the professionals.

Benjamin

Sun, 2005-Oct-16

Efficiency of Internet-scale Subscription Leases

Subscription requires synchronisation of subscription state between client and server. Fallible clients and servers may forget about subscriptions. To avoid these stale subscriptions consuming unnecessary resources we lease the subscription: if it is forgotten about and not renewed it will eventually be purged. Lease renewal also allows clients to know whether their own subscription is stale. For clients the need is more pressing, as they are not just concerned about resource consumption. They have service obligations that may not be met by a stale subscription.

When client and server correspond using a single subscription, leasing is a simple and effective means of ensuring resources aren't tied up and that clients can meet their obligations. When many subscriptions are in play between the same client and server the approach is less efficient. For n subscriptions between two points, at least n messages are sent per lease duration for long-lived subscriptions. If the client is renewing its subscriptions more frequently than the rate determined by the lease duration, the n requests occur over a correspondingly shorter period.

This is less efficient than could be achieved. Instead of making n requests per period the client could make only one. This should be enough to answer the two questions at issue: "Has my client forgotten about me?" and "Has my server forgotten about me?". This approach would change the subscription mechanism into a more general heartbeating model.

Heartbeating complicates the subscription model. A simple "I'm here" message isn't enough. It is important to tie the set of subscriptions to the identifier of exactly which client is "here". If the client fails to make contact in a designated time all subscriptions would be scrubbed. If the client successfully makes contact its subscriptions would continue to be honoured. This leads to one more thing that must be synchronised between client and server: The set of subscriptions associated with a particular client identifier.

If the client always carries the same identification (say, an IP address) it may find that it successfully renews its subscriptions, but also that the set of subscriptions it successfully renewed is much smaller than anticipated. If several subscriptions are established before a server briefly goes down and several more after it is restored to service, the client could renew "all of its subscriptions" without the client or the server ever being any the wiser that half of them were lost over the restart. There must be some concept of a session that is unique across both client and server failure events.

Here's how it might work:

  1. Client requests session creation
  2. Server creates session resource and returns URI
  3. Client makes any number of subscriptions, each quoting the session URI
  4. Client periodically renews the session

This approach could still be supported alongside normal lease-based subscription. If no session ID is quoted by a client the server could safely follow standard leasing semantics. If no session creation resource is known by the client it could assume that the server doesn't understand session-based leasing. Server code could run the normal lease expiry process for each subscription, except instead of simply testing "Did I get a renew request this period?", it would ask "Did I get a renewal request this period? If not, does my session still exist?"
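
To make the bookkeeping concrete, here is a minimal sketch of what the server side might hold. The in-memory structures and identifiers are my own choosing rather than anything from an existing specification; subscriptions created without a session fall back to plain leasing, while subscriptions tied to a session survive as long as the session keeps being renewed.

    import time
    import uuid

    class SubscriptionRegistry:
        """Sketch of server-side lease bookkeeping with optional sessions."""

        def __init__(self, lease_seconds=300):
            self.lease_seconds = lease_seconds
            self.sessions = {}        # session_id -> expiry time
            self.subscriptions = {}   # sub_id -> (expiry time, session_id or None)

        def create_session(self):
            session_id = str(uuid.uuid4())   # in HTTP terms this would be a session URI
            self.sessions[session_id] = time.time() + self.lease_seconds
            return session_id

        def renew_session(self, session_id):
            if session_id in self.sessions:
                self.sessions[session_id] = time.time() + self.lease_seconds

        def subscribe(self, session_id=None):
            sub_id = str(uuid.uuid4())
            self.subscriptions[sub_id] = (time.time() + self.lease_seconds, session_id)
            return sub_id

        def renew_subscription(self, sub_id):
            if sub_id in self.subscriptions:
                _, session_id = self.subscriptions[sub_id]
                self.subscriptions[sub_id] = (time.time() + self.lease_seconds, session_id)

        def purge_stale(self):
            """Did I get a renewal this period? If not, does my session still exist?"""
            now = time.time()
            self.sessions = {s: t for s, t in self.sessions.items() if t > now}
            def alive(expiry, session_id):
                return expiry > now or session_id in self.sessions
            self.subscriptions = {
                k: v for k, v in self.subscriptions.items() if alive(*v)
            }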

A simpler way to achieve session handling that I've used in the past has been to treat a TCP/IP connection as the session. Heartbeats are periodically sent down the connection by both client and server ends to establish liveness. If either end passes data across the connection then there is no need to send heartbeat messages in that period. Closure of the connection terminates all related subscriptions. Unfortunately this model does not work with HTTP, especially once proxies get into the mix. In an HTTP world you can't rely on TCP/IP connection stability.

Admittedly, there is another way to go. You could set up resource groups rather than client<->server sessions. OPC more or less does this in its subscription model. A group can either be predefined by the server for subscription, or can be set up by the client in a way equivalent to using a session. The client then just subscribes to the group, so a straightforward leasing model applies. Clients create group subscriptions and renew group subscriptions. This approach avoids the efficiency problem in a similar way to session-based leasing.

Benjamin

Sat, 2005-Oct-15

Internet-scale Subscription Lease Durations

Depending on your standpoint you may have different ideas about how long a subscription lease should be. From an independent standpoint we may say that infinite subscription leases are the best way forward. That produces the lowest overall network and processing overhead and thus the best result overall. There are, however, competing interests that influence this number downwards. It is likely the lease should be of finite duration, and that duration is likely to depend more on the reliability of the server and the demands of the client than on anything else.

As a server I want to free up resources as soon as I can after clients that are uncontactable go away. This is especially the case when the same clients may have reregistered and are effectively consuming my resources twice. The new live registration takes up legitimate resources, but the stale ghost registration takes additional illegitimate resources. I want to balance the cost of holding onto resources against the cost of subscription renewals to decide my desired lease period. I'll probably choose something in the order of the time it takes for a tcp/ip connection to expire, but may choose a smaller number if I expect this to be happening regularly. I don't have an imperative to clean up except for resource consumption. In fact, whenever I'm delivering messages to clients that are up but have forgotten about their subscriptions I should get feedback from them indicating they think I'm sending them spam. It's only subscriptions that are both stale and inactive that chew my resources unnecessarily, and it doesn't cost a lot to manage a subscription in that mode.

As a client, if I lease a subscription I expect the subscription to be honoured. That is to say that I expect to be given timely updates of the information I requested. By timely I mean that I couldn't get the information any sooner by polling: waiting for the notification should get me the data first. The risk to a client is that the subscription will not be honoured. I may get notifications too late. More importantly, my subscription might be lost entirely. REST says that the state of any client and server interaction should be held within the last message that passed between them. Subscription puts a spanner in these works and places an expectation of synchronised interaction state between a fallible client and server.

Depending on the server implementation it may be possible to see a server fail and come back up without any saved subscriptions. It might fail over to a backup instance that isn't aware of some or all of the subscriptions. This would introduce a risk to the client that its data isn't timely. The client might get its data more quickly by polling, or by checking or renewing the subscription at the rate it would otherwise poll. This period for sending renewal messages is defined by need rather than simple resource utilisation. The client must have the data in a timely manner or it may fail to meet its own service obligations. Seconds may count. It must check the subscription over a shorter period than the limit on how out of date its data is allowed to be under these circumstances. If it is responsible for getting data from the field to an operator console within five (5) seconds, it must check its subscription more frequently than that, or someone must do it on its behalf.
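
A trivial sketch of that client-side calculation, with the function name and the safety margin as my own assumptions:

    def renewal_interval(lease_seconds: float, max_staleness_seconds: float,
                         safety_factor: float = 0.5) -> float:
        """Check the subscription well inside both the lease and our own
        freshness obligation; a five second obligation to an operator
        console means checking rather more often than every five seconds."""
        return min(lease_seconds, max_staleness_seconds) * safety_factor

    # e.g. a 300 second lease but a 5 second obligation -> check every 2.5 seconds
    print(renewal_interval(300, 5))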

Non-failure subscription loss conditions may exist. It may be more convenient for a server to drop subscriptions and allow clients to resubscribe than to maintain them over certain internal reconfiguration activities. These cases are potentially easier to resolve than server death. They don't result in system failure, so the owner of subscriptions can notify clients as appropriate. It must in fact do so, and once clients have received timely notice of the end of their subscriptions they should be free to attempt resubscription. It is server death which is tricky. Whichever way you paint things there is always the chance your server and its cluster will terminate in such a way that your subscriptions are lost. Clients must be able to measure this risk and poll at a rate that provides adequate certainty that timely updates are still being sent.

Benjamin

Sat, 2005-Oct-08

Application Consolidation

Sean McGrath talks about consolidating the data from different applications for the purpose of business reporting. He says that the wrong way to do it is usually to redevelop the functions of both applications into one. He says the right way to do it is usually to grab reports from both systems and combine them. There are two issues that must be solved during this process. The first is one of simultaneous agreement. The second is a common language of discourse. I'll address the second point first.

Without a common terminology the information of the two systems can't be combined. If one system thinks of your business as the monitoring of network equipment by asset id while another thinks of your business as the monitoring of network equipment by IP address the results aren't going to mesh together well. You need a reasonably sophisticated tool to combine the data, especially when there isn't a 1:1 mapping between asset id and IP address.

Without simultaneous agreement you may be combining reports that are about different things. If a new customer signs on and has been entered only into one system the combined report may be a nonsense. Even if they are entered at the same time there is necessarily some difference between the times queries are issued to each system. The greater the latency between the report consolidation application and the original systems, the less likely it is that the data you get back will still be correct when it is received. The greater the difference in latencies between the systems you are consolidating, the greater the likelihood that those reports will be referring to different data. This problem is discussed in some detail in Rohit Khare's 2003 dissertation from the point of view of a single datum. I suspect the results regarding simultaneous agreement for different data will be even more complicated.

If the report is historical in nature or for some other reason isn't sensitive to instantaneous change and if the systems you are consolidating do speak a common language, I suggest that Sean is right. Writing a report consolidator application is probably going to be easier than redeveloping the applications themselves. If you lie in the middle somewhere you'll have some thinking to do.

Benjamin

Sat, 2005-Oct-01

Bootchart under Debian

My wife has been complaining for a while that my computer is a little slow to boot. It's an aging PC with an 800MHz celeron processor and 384 meg of RAM. Actually, the motherboard on this machine was changed at one point and cat /proc/cpuinfo claims the chip runs at 565MHz so I'm suspicious something was diddled at the time.

I pulled down the bootchart package to have a look at how long things were really taking. Lo and behold she was not exaggerating. My machine was taking 154 seconds to boot. Further investigation showed that hotplug seemed to account for much of the time. hotplug ran from just before the 24 second mark to just before 122 seconds. Following some advice found on the internet I decided to mess with hotplug a bit. The first step was just backgrounding hotplug during boot so that other operations could run in parallel by modifying /etc/init.d/hotplug. This yielded a good speedup with boot time down to 102 seconds, a 52 second improvement. The only ill effect I got from this is my net didn't get initialised. I suspect that the separate hotplug net script didn't like running before hotplug had finished its startup, however I haven't fully explored this problem yet. It's easy enough to do an "ifup eth1" once the boot is finished. It also probably wouldn't have worked except that I still have a reasonably complete /etc/modules file that loads my mouse drivers and the like.

The new bootchart still showed a problem. Hotplug started just before the 30 second mark and finished (possibly was truncated) at the 102 second mark. For that entire time the CPU was maxed out, just as it had been during the hotplug startup when it was not executing in parallel with other activities. I decided to solve the hotplug startup issue conclusively by adding a sleep before it executed in /etc/init.d/hotplug. /etc/init.d/hotplug now has the effect of returning immediately, sleeping for 60 seconds in parallel, and starting hotplug at the expiry of those 60 seconds. This yielded the best results so far. My machine now boots in 66 seconds, an 88 second or 57% reduction. I wonder if my wife will notice...

Clearly, hotplug's startup is a problem on low-end (cpu-bound) hardware. I don't recommend running it (or at least I don't recommend starting it up as part of your normal boot process) on a slow CPU. This may improve in the future with alternate implementations to the current bash shell scripting approach starting to emerge. A couple of sample bootcharts are available comparing the bash version to a rewrite in perl. I've even seen a compiled version written in C proposed.

A couple of notes on bootchart itself: It works well. I use grub to boot and manually modified the kernel command line each time I wanted bootchart to run. This was just a matter of pressing "e" before the boot process began, positioning the cursor over the kernel command-line, pressing "e" again, appending "init=/sbin/bootchartd" and using "b" to get the boot process started. My only frustration with the program that generates the pretty bootchart diagrams is that if you stuffed up step one and don't have any bootchart data from which to produce the diagram, it returns silently without explaining what went wrong or where it is looking for the information. Once the information was actually there it worked without any hassles. Well done to its contributors.

Update:
Ziga Mahkovec wrote to me via email on Monday October 3, 2005 regarding the silent return when no data was available:

This was actually a bug that I fixed in CVS now -- the renderer will report a "/var/log/bootchart.tgz not found" error.

Great job!

Update:
I've upgraded to the latest Debian unstable udev as of October 15, 2005. This package replaces the hotplug scripts with its own event generation system. My bootchart now weighs in at 95s. This is a significant improvement over the older hotplug time of 154s, and only adds about 50% over the theoretical minimum of not doing any hotplug work. The bootchart still shows my CPU being maxed out for this time with no I/O. This might be the ideal time to start preloading files into memory to improve the performance of the subsequent boot load process. I'm having trouble with correct initialisation of a few devices in this setup. My sound card device isn't being created despite the relevant modules being loaded correctly, so I had to tweak /etc/udev/links.conf. My usb-connected motorola sb4200 cable modem also gets its modules loaded but doesn't automatically ifup at the moment. I have just been executing ifup eth1 after boot, but I may go searching for the right place to hack this into the standard boot process.

With this performance work well underway and clear improvements showing, I'm confident that the linux boot process will continue to be refined and streamlined. Good work to the udev developers, and also to those in the gnome camp who seem to be getting very busy around the topic of gnome startup for the upcoming 2.14 release.

Benjamin

Sat, 2005-Oct-01

Use of HTTP verbs in ARREST architectural style

This should be my last post on Rohit Khare's Decentralizing REST thesis. I apologise to my readers for somewhat of a blogging glut around this paper, but there have been a number of topics I wanted to touch upon. This post concerns the use of HTTP verbs for subscription and notification activities.

In section 5.1 Rohit describes the A+REST architectural style. It uses a WATCH request and a NOTIFY response. By the time he reaches the R+REST and ARREST styles of sections 5.2 and 5.3 he is using SUBSCRIBE requests and POST responses. I feel that the jump to use POST (a standard HTTP request) is unfortunate.

I think Rohit sees POST as the obvious choice here. The server wants to return something to the client, therefore mutating the state of the client, therefore POST is appropriate. rfc2616 has this to say about POST:

The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions:

  • Annotation of existing resources;
  • Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles;
  • Providing a block of data, such as the result of submitting a form, to a data-handling process;
  • Extending a database through an append operation.

POST is often used outside of these kinds of context, especially as a means of tunnelling alternate protocols or architectural styles over HTTP. In this case though, I think that its use is particularly egregious. Consider this text from section 9.1.1 of rfc2616:

the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe". This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.

Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.

When the server issues a POST it should be acting on behalf of its user. Who its user is is a little unclear at this point. There is the client owned by some agency, the server owned by another, and finally the POST destination possibly owned by an additional agency. If the server is acting on behalf of its owner it should do so with extreme care and be absolutely sure it is willing to take responsibility for the actions taken in response to the POST. It should provide its own credentials and act in a responsible manner, operating in accordance with the policy its owner sets out for it.

I see the use of POST in this way as a great security risk. If the server generating POST requests is trusted by anybody then by using POST as a notification it is transferring that trust to its client. Unless client and server are owned by the same organisation or individual, an interagency conflict exists and an unjustified trust relationship is created. Instead, the server must provide the client's credentials only, or notify the destination server in a non-accountable way. It is important that the server not be seen to be requesting any of the side-effects the client may generate in response to the notification, but instead that those side-effects are part of the intrinsic behaviour of the destination when provided with trustworthy updates.

Ultimately I don't think it is possible or reasonable for a server to present its client's credentials as its own. There is too much risk that domain name or IP address records will be taken into account when processing trust relationships. I think therefore that a new method is required, just as is provided in the GENA protocol which introduces NOTIFY for the purpose.

It's not a dumb idea to use POST as a notification mechanism. I certainly thought that way when I first came to this area. Other examples also exist. Rohit himself talks about the difficulty of introducing new methods to the web and having to work around this problem in mod_pubsub using HTTP headers. In the end though, I think that the introduction of subscription to the web is something worthy of at least one new verb.

I'm still not sure whether an explicit SUBSCRIBE verb is required. Something tells me that a subscribable resource will always be subscribed to anyway and that GET is ultimately enough to setup a subscription if appropriate headers are supplied. I'm still hoping I'll be able to do something in this area to reach a standard approach.

The established use of GENA in UPnP may tip the cards. The fact that it exists and is widely deployed may outweigh its lack of formal specification and standardisation. Its defects mostly appear aesthetic rather than fundamental, and it still may be possible to extend it in a backwards-compatible way.

Benjamin

Sat, 2005-Oct-01

Infinite Buffering

Rohit Khare's 2003 paper on Decentralizing REST makes an important point in section 7.2.4 during his discussion of REST+E (REST with Estimates):

It is impossible to sustain an information transfer rate in excess of a channel's capacity. Unchecked, guaranteed message delivery can inexorably increase latency. To ensure this does not occur - ensuring that buffers, even if necessary, remain finite - information buffered for transmission may need to be summarized, updated, or even dropped while still queued.

It is a classic mistake in message-oriented middleware to combine guaranteed delivery of messages with guaranteed acceptance of messages. Just when the system is overloaded or stressed, everything starts to unravel. Latency increases, and can even reach a point where feedback loops are created in the middleware: Messages designed to keep the middleware operating normally are delayed so much that they cause more messages to be generated that can eventually consume all available bandwidth and resources.

Rohit cites summarisation as a solution, and it is. On the other hand it is important to look at where this summarisation should take place. Generic middleware can't summarise data effectively. It has few choices but to drop the data. I favour an end-to-end solution for summarisation: The messaging middleware must not accept a message unless it is ready to deliver it into finite buffers. The source of the message must wait to send, and when it becomes possible to send it should be able to alter its choice of message. It should be able to summarise in its own way for its own purposes. Summarisation should only take place at this one location. From this location it is possible to summarise, not based on a set of messages that have not been sent, but on any other currently-accessible data. For instance, a summariser can be written that always sends the current state of a resource rather than the state it had at the first change after the last successful delivery. This doesn't require any extra state information (except for changed/not-changed) if the summariser has access to a common representation of the origin resource. The summariser's program can be as simple as:

  1. Wait for change to occur
  2. Wait until a message can be sent, ignoring further changes
  3. Send a copy of the current resource representation
  4. Repeat
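
Here is a minimal sketch of that loop. The threading primitives and the callable names are my own choices for illustration; the only state carried between iterations is the changed/not-changed flag mentioned above.

    import threading

    class CurrentStateSummariser:
        """Sends the *current* resource state whenever the channel is free,
        collapsing any changes that happened while we were blocked."""

        def __init__(self, read_state, send):
            self.read_state = read_state      # callable returning the representation
            self.send = send                  # blocking send into a finite channel
            self.changed = threading.Event()  # the only extra state required

        def notify_change(self):
            """Called by the resource whenever its representation changes."""
            self.changed.set()

        def run(self):
            while True:
                self.changed.wait()            # 1. wait for a change to occur
                self.changed.clear()           # 2. further changes are collapsed
                self.send(self.read_state())   # 3. send a copy of the current state
                # 4. repeat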

More complicated summarisation is likely to be useful on complex resources such as lists. Avoiding sending the whole list over and over again can make significant inroads into reducing bandwidth and processing costs. This kind of summariser requires more sophisticated state, including the difference between the current and previously-transmitted list contents.

Relying on end-to-end summarisers on finite buffers allows systems to operate efficiently under extreme load conditions.

Benjamin

Sat, 2005-Oct-01

REST Trust Relationships

Rohit Khare's 2003 paper Decentralizing REST introduces the ARREST architectural style for routed event notifications when agency conflicts exist. The theory is that it can be used according to R+REST principles to establish communication channels between network nodes that don't trust each other but which are both trusted by a common client.

Rohit has this to say about that three-way trust relationship in chapter 5.3.2:

Note that while a subscription must be owned by the same agency that owns S, the event source, it can be created by anyone that S's owner trusts. Formally, creating a subscription does not even require the consent of D's owner [the owner of the resource being notified by S], because any resource must be prepared for the possibilty of unwanted notifications ("spam").

If Rohit was only talking about R+REST's single routed notifications I would agree. One notification of the result of some calculation should be dropped by D as spam. Certainly no unauthorised alterations to D should be permitted by D, and this is the basis of Rohit's claim that D need not authorise notifications. Importantly, however, this section is not referring to a one-off notification but to a notification stream. It is essential in this case to have the authority of D before sending more than a finite message sequence to D. Selecting a number of high-volume event sources and directing them to send notifications to an unsuspecting victim is a classic denial of service attack technique. It is therefore imperative to establish D's complicity in the subscription before pummeling the resource with an arbitrary number of notifications.

The classic email technique associated with mailing lists is to send a single email first, requesting authorisation to send further messages. If a positive confirmation is received in response to the email (either as a return email, or a web site submission) then further data can flow. Yahoo has the concept of a set of email addresses which a user has demonstrated are their own, and any new mailing list subscriptions can be requested by the authorised user to be sent to any of those addresses. New addresses require individual confirmation.

I believe that a similar technique is required for HTTP or XMPP notifications before a flood arrives. The receiving resource must acknowledge successful receipt of a single notification message before the subsequent flood is permitted. This avoids the notifying server becoming complicit in the nefarious activities of its authorised users. In the end it may come down to what those users are authorised to do and who they are authorised to do it with. Since many sites on the internet are effectively open to any user, authorised or not, how much trust your site places in its users may matter in the extreme.

Benjamin

Sat, 2005-Oct-01

Routed REST

I think that Rohit Khare's contribution with his 2003 paper on Decentralizing REST holds many valuable insights. In some areas, though, I think he has it wrong. One such area is his chapter on Routed REST (R+REST), chapter 5.2.

The purported reason for deriving this architectural style is to allow agencies that don't trust each other to collaborate in implementing useful functions. Trust is not uniform across the Internet. I may trust my bank and my Internet provider, but they may not trust each other. Proxies on the web may or may not be trusted, so authentication must be done in an end-to-end way between network nodes. Rohit wants to build a collaboration mechanism between servers that don't trust each other implicitly, but are instructed to trust each other in specific ways by their client, which trusts each service along the chain.

Rohit gives the example of a client, a printer, a notary watermark, and an accounting system. The client trusts its notary watermark to sign and authenticate the printed document, and trusts the printer. The printer trusts the accounting system and uses it to bill the client. Rohit tries to include the notary watermark and the accounting system not as communication endpoints in their own right, but as proxies that transform data that passes through them. To this end he places the notary watermark between the client and printer to transform the request, but places the accounting system between printer and client on an alternate return route. He seems to get very excited about this kind of composition and starts talking about MustUnderstand headers and about SOAP- and WS- routing as existing implementations. The summary of communications is as follows:

  1. Send print job from client to notary
  2. Forward notarised job from notary to printer
  3. Send response not back to the notary, but to the printer's accounting system
  4. The accounting system passes the response back to the client

I think that in this chapter he's off the rails. His example can be cleanly implemented in a REST style by maintaining state at relevant points in the pipeline. Instead of transforming the client's request in transit, the client should ask the notary to sign the document and have it returned to the client. The client should only submit a notarised document to the printer. The printer in turn should negotiate with the accounting system to ensure proper billing for the client before returning any kind of success code to the client. The summary of communications is as follows:

  1. Send print job from client to notary
  2. Return notarised job from notary to client
  3. Send notarised job from client to printer
  4. Send request from printer to accounting system
  5. Receive response from accounting system back to printer
  6. Send response from printer back to client

Each interaction is independent of the others and only moves across justified trust relationships.
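
For illustration only, the client-driven alternative might look something like the following sketch. The URIs and payload shapes are invented; the point is simply that every request is made by a party that holds a direct trust relationship with the party it is talking to.

    import requests

    def print_notarised(document: bytes) -> dict:
        """Client-orchestrated flow: each hop crosses only a trust relationship
        that the requesting party itself holds. URIs and payloads are illustrative."""
        # 1-2. Ask the notary to sign the document and return it to us.
        notarised = requests.post("https://notary.example/sign", data=document).content

        # 3. Submit the notarised document to the printer we trust.
        #    The printer settles billing with its own accounting system
        #    (steps 4-5) before answering us (step 6).
        response = requests.post("https://printer.example/jobs", data=notarised)
        response.raise_for_status()
        return response.json()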

Rohit cites improved performance out of a routing based system. He says that only four message transmissions need to occur, instead of six using alternate approaches. His alternate approach is different to mine, but both his alternate and mine have the same number of messages needing transmission: six. But let's consider that TCP/IP startup requires at least three traversals of the network and also begins transmission slowly. Establishing a TCP/IP connection is quite expensive. If we consider the number of connections involved in his solution (four) compared to the number in my solution and his non-ideal alternative (three), it starts to look more like the routing approach is the less efficient one. This effect should be multiplied over high latency links. When responses are able to make use of the same TCP/IP connection as the request was made on, the system as a whole should actually be more efficient and responsive. Even when it is not more efficient, I would argue that it is significantly simpler.

Rohit uses this style to build the ARREST style, however using this style as a basis weakens ARREST. He uses routing as a basis for subscription, however in practice whether subscription results come back over the same TCP/IP connection or are routed to a web server using a different TCP/IP connection is a matter of tradeoff of server resources and load.

Benjamin

Sat, 2005-Oct-01

The Estimated Web

Rohit Khare's dissertation describes the ARREST+E architectural style for building estimated systems on a web-like foundation. He derives this architecture based on styles developed to reduce the time it takes for a client and server to reach consensus on a value (that is, for both to hold the same value). The styles he derives from are based on leases of values which mirror the web's cache expiry model. An assumption of this work is that servers must not change their value until all caches have expired, which is to say that the most recent expiry date provided to any client has passed. He doesn't explicitly cover the way the web already works as an estimated system: by lying about the expiry field.

Most applications on the web do not treat the expiry date as a lease that must be honoured. Instead, they change whenever they want to change and provide clients with an expiry date much shorter than the interval at which they expect the page to actually change. Most DNS records remain static for at least months at a time, so an expiry model that permits caching for 24 hours saves bandwidth while still ensuring that clients have a high probability of having records that are correct. Simply speaking, if a DNS record changes once every thirty days then a one-day cache expiry gives clients something like a 29 in 30 chance of having data that is up to date. In the mean time caching saves each DNS server from having to answer the same queries every time someone wants to look up a web page or send some email.
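
As a back-of-envelope check on that figure (the assumption that the change is equally likely to fall anywhere in the interval is mine, not anything from the DNS specifications):

    def chance_cache_is_fresh(ttl_days: float, mean_change_interval_days: float) -> float:
        """If a record changes roughly once per interval, a cached copy is stale
        only when the change happened to land inside the cached window."""
        return 1.0 - (ttl_days / mean_change_interval_days)

    print(chance_cache_is_fresh(1, 30))  # ~0.967, the "29 in 30" figure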

The question on the web of today becomes "What will it cost me to have clients with the wrong data?". If you're managing a careful transition then both old and new DNS records should be valid for the duration of the cache expiry. In this case it costs nothing for clients to have the old data. Because it costs them nothing it also costs you nothing, excepting the cost of such a careful transition. When we're talking about web resources the problem becomes more complicated because of the wide variety of things that a web resource could represent. If it is time-critical data of commercial importance to your clients then even short delays in getting the freshest data can be a problem for clients. Clients wishing to have a commercial advantage over each other are likely to poll the resource rapidly to ensure they get the new result first. Whether cache expiry dates are used or not clients will consume your bandwidth and processing power in proportion to their interest.

The ARREST+E style provides an alternative model. Clients no longer need to poll because you're giving them the freshest data possible as it arrives. ARREST+E involves subscription by clients and notification back to the clients as data changes. It allows for summarisation of data to prevent irrelevant information (such as stale updates) from being transmitted, and also for prediction on the client side to try to reduce the estimated error in their copy of the data. If your clients simply must know the result and are willing to compete with each other by effectively launching denial of service attacks on your server, then the extra costs of the ARREST+E style may be worth it. Storing state for each subscription (including summariser state) may be cheaper than handling the excess load.

On the other hand, most of the Internet doesn't work this way. Clients are polite because they don't have a strong commercial interest in changes to your data. Clients that aren't polite can be denied access for a while until their manners improve. A server-side choice of how good an estimate your copy of the resource representation will be is enough to satisfy everyone.

Whether or not you use ARREST+E with its subscription model, some aspects of the architecture may still be of use. I periodically see complaints about how inefficient HTTP is at accessing RSS feeds. A summariser could reduce the bandwidth associated with transferring large feeds. It would be nice to be able to detect feed reader clients and direct them to a resource specifically tailored for them each time they arrive. Consider the possibility of returning today's feed data to a client, but directing them next time to a resource that represents only feed items after today. Each time they get a new set of items they would be directed on to a resource that represents items since that set.
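
A sketch of what such a server response might look like, with the item and link shapes invented purely for illustration:

    import datetime
    import json

    def feed_response(all_items, since=None):
        """Return the requested slice of a feed plus a pointer to a resource
        that will only ever contain items newer than this reply."""
        now = datetime.datetime.now(datetime.timezone.utc).isoformat()
        items = [i for i in all_items
                 if since is None or i["published"] > since]  # ISO timestamps compare lexically
        return json.dumps({
            "items": [i["title"] for i in items],
            # A feed reader that follows this link never re-downloads old items.
            "next": "/feed?since=" + now,
        })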

Benjamin

Sat, 2005-Oct-01

Consensus on the Internet Scale

I've had a little downtime this week and have used some of it to read a 2003 paper by Rohit Khare, Extending the REpresentational State Transfer (REST) Architectural Style for Decentralized Systems. As of 2002, Rohit was a founder and Chief Technology Officer of KnowNow, itself a commercialisation of technology originally developed for mod_pubsub. I noticed this paper through a Danny Ayers blog entry. I have an interest in publish subscribe systems and this paper lays the theoretical groundwork for the behaviour of such systems on high-latency networks such as the internet. Rohit's thesis weighs in at almost three hundred pages, including associated bibliography and supporting material. As such, I don't intend to cover all of my observations in this blog entry. I'll try to confine myself to the question of consensus on high-latency networks and on the "now" horizon.

One of the first things Rohit does in his thesis is establish a theoretical limit relating the latency between two nodes to the rate at which data can change at a source node while still being agreed at the sink node. He bases his model around the HTTP concept of expiry and calculates that if it takes at most d seconds for the data to get from point A to point B, then A must hold its value constant for at least d seconds and note this period as an expiry in the message it sends. Clearly if A wants to change its value more frequently the data will be expired before it reaches B. This result guides much of the subsequent work as he tries to develop additional REST-like styles to achieve as close as possible to the theoretical result in practice. He ends up with A being able to change its value about every 2d if appropriate architectural components are included.
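
One plausible way to read that bound, in my own notation rather than Rohit's:

    % A value sent at time t must carry an expiry of at least t + d to still be
    % valid on arrival, so A holds it constant over [t, t + d]; the replacement
    % value then needs up to a further d seconds to reach B, giving roughly
    T_{\text{change}} \gtrsim \underbrace{d}_{\text{hold until expiry}} + \underbrace{d}_{\text{propagate new value}} = 2d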

Two times maximum latency may not be terribly significant on simple short distance networks, however even on ethernet d is theoretically unbounded. It follows then that over a network it isn't possible to always have a fresh copy of an origin server's value. This is where things get interesting. Rohit defines the concept of an "estimated" system. He contrasts this from a centralised system by saying that an estimated system does not guarantee consensus between network nodes. Instead of saying "I know my copy of the variable is the same as the server holds, and thus my decisions will be equivalent to everyone else's", we say "I have a P% chance that my copy is the same as the server holds, and thus my decisions may differ from those of other network nodes". A "now" horizon exists between those components that have a low enough latency to maintain consensus with the origin server's data and those that are too far from the source of data to track changes at the rate they occur.

In SCADA systems we measure physical phenomenon. Even if we only sample that phenomenon at a certain rate, the actual source change can occur at any frequency. If we consider the device that monitors an electric current (our RTU) as the origin server we could attempt to synchronise with the RTU's data before it resamples. If latencies are low enough we could possibly indicate to the user of our system the actual value as we saw it within the "now" horizon and thus in agreement with the RTU's values. However, another way to see the RTU is as a proxy. Because we can't control the frequency of change of the current, the "now" horizon is arbitrarily small. All SCADA systems are estimated rather than consensus-based.

Rohit doesn't directly cover the possibility of an infinitely-small "now" horizon, however his ultimate architecture for estimated systems generally is ARREST+E (Asynchronous Routed REST+Estimation). It contains a client that retries requests to the field device when they fail in order to traverse unreliable networks. It includes credentials for the client to access the field device. It includes a trust manager that rejects bad credentials at the field device, and includes the capability to drop duplicate messages from clients that have retried when a retry wasn't required. The field device would send updates back to the client as often as the network permitted. To keep latency low it would avoid passing on stale data and may perform other summarisation activities to maximise the relevance of data returned to the client. On the other side of this returned data connection the client would include a predictor to provide the best possible estimate of actual state based on the stale data it has at its disposal. Finally, this return path also includes credential transmission and trust management.

This grand unifying model is practical when the number of clients for our field device is small. Unlike the base REST and HTTP models this architecture requires state to be maintained in our field device to keep track of subscriptions and to provide appropriate summaries. The predictor on the client side can be as simple as the null predictor (we assume that the latest value at our disposal is still current) or could include domain-specific knowledge (we know the maximum rate at which this piece of equipment can change state and we know the direction it is moving, so we perform a linear extrapolation based on past data input). As I mentioned in my previous entry, state may be too expensive to maintain on behalf of the arbitrarily large client base that exists on the Internet at large. There must be some means of acquiring the money to scale servers up appropriately. If that end of things is secure, however, I think it is feasible.
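
A sketch of the two predictors mentioned, with all names my own and the rate cap standing in for whatever domain knowledge is available:

    def predict(samples, now, max_rate=None):
        """Estimate current state from stale (time, value) samples.

        With one sample this is the null predictor (assume the latest value
        still holds); with two or more it extrapolates linearly, optionally
        capped by a known maximum rate of change for the equipment."""
        (t1, v1) = samples[-1]
        if len(samples) < 2:
            return v1                      # null predictor
        (t0, v0) = samples[-2]
        rate = (v1 - v0) / (t1 - t0)
        if max_rate is not None:
            rate = max(-max_rate, min(max_rate, rate))
        return v1 + rate * (now - t1)

    print(predict([(0.0, 10.0), (1.0, 12.0)], now=3.0))  # -> 16.0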

Importantly, I think it is necessary. In practice I think that most mutable resources on the web are representing physical phenomena of one kind or another. Whether it is "the state the user left this document in last time they wrote it" or "the current trading price of MSFT on wall street", people are physical phenomena who can act on software at any time and in any way. The existing system on the web of using expiry dates is actually a form of estimation that is driven by the server deciding how often a client should poll or recheck their data. This server-driven approach may not suit all clients, which is why web browsers need a "reload" button!

I'll talk about some specific issues I had with the thesis in a subsequent post.

Benjamin

Wed, 2005-Sep-21

State Tradeoffs

HTTP is stateless between requests, each of which involves stateful TCP/IP connections. Any subscription protocol requires state to be retained in the network for the lifetime of the subscription to keep track of who the subscribers are and what they want. Whenever I POST to a web server I can be pretty confident that I'm creating some kind of state, although how long-lived the state is or whether the state has associated costs in terms of memory utilisation or other resources is unknown to me. REST sides with HTTP and says we should be stateless between requests, but is that always possible and what are the tradeoffs?

Mark Baker points to Fielding's dissertation. Under the Related Work heading, alongside the discussion of preexisting architectural styles, Fielding writes:

The interaction method of sending representations of resources to consuming components has some parallels with event-based integration (EBI) styles. The key difference is that EBI styles are push-based. The component containing the state (equivalent to an origin server in REST) issues an event whenever the state changes, whether or not any component is actually interested in or listening for such an event. In the REST style, consuming components usually pull representations. Although this is less efficient when viewed as a single client wishing to monitor a single resource, the scale of the Web makes an unregulated push model infeasible.

He discusses Chiron-2 as an example EBI style.

This opinion is interesting to me because I can perhaps see for the first time the difference between "corporate" and "internet" scales of software. If Fielding is to be believed, that difference is captured in exactly how much state a server can afford to maintain on behalf of its clients. In computer science courses students are taught that polling on the network is an obvious evil. It increases response latency, increases network bandwidth, or both. When thinking of the cost of the network only in terms of its bandwidth and latency characteristics, the urge to use a publish subscribe scheme rather than polling is strong. Fielding gives us a different way to think about the problem.

I come from the SCADA world. Typically, a stable set of client and server applications are running on a secure or relatively-isolated network. Extreme load conditions usually come about when equipment being monitored changes state frequently and servers must process and pass on data under what is termed an avalanche condition. Varying quality constraints apply to different kinds of client in this scenario. Historical storage of change records seeks to store as many interim states as possible without lagging too far behind the changes happening right now. User-oriented services typically seek to have the very freshest data available at all times. Low latency and effective use of bandwidth are critical to making this work, as is the processing performance of the actual clients.

Extreme load conditions on the web usually come about from client activity, and in today's internet this is termed the slashdot effect. Instead of thousands of changes being processed and passed on to clients, thousands of clients are either "just having a look" or actively participating in your web site mechanics. It's not just Fielding who thinks that state is a barrier to high performance under these conditions. Paul Vixie wrote the following earlier this year on namedroppers@ops.ietf.org:

tcp requires state. state quotas are a vulnerability in the face of a DoS attack. tcp must remain a fallback for queries, and be reserved in the common case for updates and zone transfers. doesn't matter who runs the servers, if they're supposed to stay "up" then you can't depend on them to have room for another tcp protocol control block in the common case.

The sentiment is periodically echoed through the mailing list's lifetime. Paul is also the author of this earlier quote:

speaking for f-root, if we had to build a protocol control block for each of the 5000 to 10000 queries that came in every second, and exchange packets between our state machine and remote state machine, before finally sending the response and reclaiming the state-memory, we'd need a lot more hardware than we have today. it's not the bytes, or the packets-- those we can provision easily. it's the state-memory occupancy time that would dominate.

So in the extreme world of DNS, even the state associated with a TCP/IP connection is considered too much. In Fielding's view of the web, anything more than TCP is too much. Scaled back down to a corporate setting, state is no longer the problem: bandwidth and latency dominate. Clearly there is a sliding scale and there are tradeoffs to be made. Where do the tradeoffs lie?

When we're talking about real-time scada-like data, clients have a cost if they aren't kept up to date. This applies to any time-sensitive data ranging from stock market quotes to eBay auction prices. To reduce the cost of getting information late clients must poll the service, incurring bandwidth and node processing costs. Memory resources aren't used past those required to carry on TCP/IP comms between relevant network nodes, so a large quantity of requests can be handled with the main costs being processing and bandwidth. Those can be limited explicitly by the server, so it should be possible for the server to ride out the storm while clients see longer and longer response times during peak demand. The cost of bandwidth and processing can be traded off for extra memory resources by offering a publish/subscribe state synchronisation model to clients. This provides low-latency return of data to clients without polling and may reduce both processing and bandwidth costs. It still allows those costs to be managed by the server infrastructure, which may choose to drop interim state updates or perform other load shedding activities. It does cost extra memory, though. Unlike bandwidth and processing, memory isn't a renewable resource. If you expend it all your server can't recover by slowing the rate of requests or responses. Your server is likely to simply fall over.

State is sometimes necessary. If I book a flight online the flight booking state must be retained somewhere, or my booking won't be honoured. In cases like these I pay money to store the state so the accounting infrastructure exists to fund memory expansion. Even if the cost is amortized against actual successful bookings, the fact that the site is doing business mitigates the raw "interesting site" effect where thousands or users may decide to store small amounts of state on your server if they are allowed to. Importantly, this example also has an approximate limit to the amount of state stored. The number of planes in the sky at any one time controls the number of seats available. Even if a proportion of the bookings will have to be cancelled and remade and the state of those changes recorded that proportion should be roughly predictable on the large scale as a percentage of actual seats booked or available.

Other examples include WikipediA and del.icio.us. Both are open to the slashdot effect of having a large number of clients. As well as doing stateless queries, these services allow users to store encyclopedia articles or bookmarks on the sites. They must have storage capacity in proportion to the size of their user base, based on the average number of bytes a user contributes. Wikipedia has a small advantage over other sites in that once an article is written and authoritative it seems likely that a proportion of the user base is cut out of contributing to that topic because everything they know about it has already been written down. I suspect delicious doesn't get this effect. In fact, my experience is that bookmarks found via delicious are often inserted back into delicious by the discoverer. This suggests to me that the storage requirements for delicious are in rough proportion to the user base, but with a risk that they will eventually grow in rough proportion to the size of the web itself. Is such a service sustainable? Will its funding model allow for this possibly infinite growth in resource consumption requirements?

To answer that question, let's take a look at a bigger example. Google certainly seems to have been able to make a go of it so far. Their storage requirements follow the same curve but with a much larger constant. Instead of storing a bunch of URLs and the people who like them, google is effectively a full-text index of the internet. Other projects of that scale exist also, within and without the search engine manifold. The key seems to be ensuring your funding model grows along the same curve (or a steeper one if you can manage it). If you have storage needs in proportion to the size of an internet user base then you probably need to get funding proportional to that size. This is important in paying for bandwidth also. You need to get paid in proportion to how popular you are or you aren't going anywhere on the Internet. It seems that for the moment delicious is working from a venture capital pool, but at some point it will have to start returning money to its investors at a greater rate than the cost of scaling up the servers to handle load.

I don't know whether subscription is a viable concept on the web when the number of users is effectively unbounded. Servers can still reject subscriptions or refuse to renew them so a bounded number can be maintained... but you'll be giving a proportion of your user base error messages and that is never a good look. In the end it probably comes down to a matter of economics. Are you being paid in proportion to the number of subscriptions you provide? Is there an advertising stream alongside the subscription? Are users paying for the subscription itself, perhaps as a premium service? If you can get that side of things off the ground and are ready to scale your servers as your userbase grows and spikes (or in anticipation of the spike) then you can probably make a go of subscription. Otherwise you might be better off letting your server be polled and see your web site respond more slowly to users in moderate overload conditions rather than just cutting them out. In the end the big overload conditions will still cause errors to be returned, so it is really a matter of how much you can afford to spend and what it will cost you if clients don't see your offering as reliable.

Benjamin

Sun, 2005-Sep-18

The RESS[sic] architectural style

I've just read an email from Savas Parastatidis on the RESTfulness or not of AJAX. Savas perhaps has an agenda in his writing that REST is not sufficient to do what industry really wants to do with the internet. Together with Jim Webber he's been working on MEST (formerly ProcessMessage). MEST is an attempt to use REST ideas to give SOA and SOAP generally more life and longer legs.

A lot has been invested by various companies into SOAP infrastructure, so it is reasonable that those who have been backing that horse continue to try and make it work. History is full of differing technical solutions ultimately merging rather than supplanting each other entirely. It makes sense to think that what might come out of the current mix of web service styles is something that is neither REST nor SOAP, and MEST is an attempt to be that something.

Overall, I'm unconvinced by SOAP. My comment via return email was as follows:

In the end the constraints of REST are that everyone understands your messages/media-type/whatever. If that constraint proves to be accurate on the internet then it won't be possible to reach all users of the internet without a general-purpose and non-specific message set. This tension between explaining specifics sufficient for the client to be able to reason about message content and describing things generically enough that all clients can render or otherwise process the message is a difficult one to resolve. The more specific you get the better your target audience understands you, and the smaller your target audience can possibly be.

I think that trying to nail down schema issues and deal with very specific message types is an impossible mission on the internet where you don't control and distribute the client software your users have installed. Ultimately I think that on the internet scale you need to give your client a document of standard media type to work with. If that document explains how to fill out the very complex and specific form you need submitted back to you, then that's great. Just don't expect client software to understand how to do it unless you provide those necessary hints. Don't expect client software to be written to your API. Write AJAX, or XForms to make that happen.

Thinking about different architectural styles further, I made this comment:

Personally, when I think of software architecture the most important issues that come to mind are those of client and server. The client wants something done, and the server is capable of doing it. The client is responsible for acting or retrying or whatever is necessary until the server acknowledges completion of the activity or moves into a state consistent with completion of the activity. To me distributed software architecture is primarily about synchronisation of that state, which comes back somewhat to pub/sub. You request something be done. You get an ack indicating that your request has been accepted, and you monitor some state on the server side until it changes to what you want it to be. This view of architecture is the reason I keep thinking about and worrying about the lack of a pub/sub system that I'm confident works on the internet.

I hereby declare the existence of another architectural style. I don't have a good name for it yet. RESS - REpresentational State Synchronisation or just SYNC, maybe. It is designed to support complex requests but not encourage complex message interactions. It is designed to put power into the hands of clients to determine when and under what conditions their request can be said to be complete. The constraints of this style are as follows:

  1. State is synchronised from server resources to clients according to standard interaction patterns
  2. Requests and synchronised state use standard media types
  3. Requests made from client to server resources are constructable from data available to the client

The following axes are not constrained:

  1. Agents may be included to act on behalf of a client. The client may monitor the agent or the origin server to assess completion of the request and compare the result against success criteria.
  2. Caches can (and should) be part of the synchronisation pattern
  3. I'm not yet prepared to constrain clients to using standard methods. I think that the method used and the media type should be decided by the server, typically by providing an XForm to the client that indicates how to build the request the client wants.

It is currently pretty-much REST. I've weakened the constraint of using standard methods because I'm not convinced that the POST-abuse path is any better... but maybe I can be talked around on the practicalities of this point. The main difference is a focus on state synchronisation rather than just state transfer. I believe that clients need reliable subscription capabilities in order to assess whether their goals have been achieved without polling. This is an asynchronous view of interaction and I see the synchronisation model as a secure underpinning of all possible designs that require asynchronous functioning. Focusing on subscription of resource state rather than building a generic delayed response approach allows caches to get in there and give clients quick answers to what the server's state was just moments ago. This should provide the architectural benefits that REST purports to have without polling (assuming your synchronisation mechanism is not polling-based). When other operations are performed on the resource in question caches along the way should recheck the resource state before passing it on to clients as the current state.
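
To make the shape of an interaction in this style a little more concrete, here is a minimal client-side sketch. It is only an illustration: the URIs and the "complete"/"failed" status values are invented, and plain polling stands in for whatever synchronisation mechanism the style eventually settles on.

  # Sketch of a RESS-style client: submit a request, then synchronise the
  # state of the resulting resource until it reaches a terminal value.
  # URIs and status values are hypothetical; polling stands in for the
  # real synchronisation mechanism.
  import time
  import urllib.request

  def submit_request():
      # The server (via something like an XForm) has told the client which
      # method and media type to use against which resource.
      req = urllib.request.Request(
          "http://example.com/jobs",
          data=b"action=reindex",
          headers={"Content-Type": "application/x-www-form-urlencoded"},
          method="POST",
      )
      with urllib.request.urlopen(req) as resp:
          # The agent acting on the client's behalf is itself a resource.
          return resp.headers["Location"]

  def wait_for_completion(job_uri, interval=2.0):
      # Monitor the job resource until its state matches our criteria.
      while True:
          with urllib.request.urlopen(job_uri) as resp:
              state = resp.read().decode("utf-8").strip()
          if state in ("complete", "failed"):
              return state
          time.sleep(interval)

  job = submit_request()
  print(wait_for_completion(job))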

I think the refocus on what clients really want to do with state most of the time (subscribe to it, rather than transfer it) is useful, and allows us to rethink the existing caching mechanism of the internet rather than holding it up as the reason we do things the way we do in REST. I think there is a lot more joy to be had with a "real" synchronisation mechanism for those corners of real application functionality that don't change at reliable rates and those requests that aren't synchronous.

Benjamin

Sat, 2005-Sep-10

HTTP Subscription without SUBSCRIBE

I've been very seriously considering dropping the only new http verb I've introduced for HttpSubscription: SUBSCRIBE.

I'm coming around to the idea of saying this is simply a GET. I've heard opinions that it should be essentially a GET to a "subscribable" version of a resource. I haven't quite swung that far around. In particular, I think that separating the subscribable and the gettable versions of the resource is going to make things harder on clients. In fact, I think it doesn't make a lot of sense from the server side either. If this is a resource that is likely to change while the user is looking at it, it should offer either subscribe support or a refresh header indicating how often to poll the resource. To offer subscribe but only a single-shot GET doesn't really make sense. Therefore, instead of using a special SUBSCRIBE request I'm thinking along the lines of content negotiation on a GET request. SUBSCRIBE made sense when I thought we would be "creating" a subscription. In that world SUBSCRIBE was a good alternative to POST. Now that it's looking like an alternative to GET, I think it is less appropriate.

As I mentioned in my previous entry, bugzilla uses mozilla's experimental multipart/x-mixed-replace mime type to put up a "wait" page before displaying the results of a large query. Mozilla browsers see this document, render it, then render the result section of the multipart replace so the user sees the results page they are after. Bugzilla doesn't use this technique for browsers that don't support it. How does bugzilla know? It looks at the user-agent header!

I think this is clearly unacceptable as a general model. New clients can be written, and unless they identify themselves as mozilla or get the bugzilla authors to add them to the "in" list bugzilla won't make effective use of their capability. I see this as a content negotiation issue, so it seems reasonable to start looking at the Accept header. It is interesting to note that mozilla doesn't explicitly include multipart/x-mixed-replace in its accept header. Instead, it contains something like this:

Accept: text/xml, application/xml, application/xhtml+xml, text/html;q=0.9, 
        text/plain;q=0.8, image/png,*/*;q=0.5

Only the '*/*' permits the return of x-mixed-replace at all. There may even be some doubt about exactly how to apply the accept header to multipart mime data. If I accept a multipart type, does that mean I don't have any control over content below that level? The rfc is strangely quiet on the issue. I suspect it has just never come up.

In practice, I think a reasonable interpretation of accept for multipart content types is that if the accept contains that type and also other types then the accept continues to apply recursively down the multipart tree. If you say you accept both multipart/x-mixed-replace and text/html, then chances are the document will come back as plain text/html or text/html wrapped up in multipart/x-mixed-replace. An application/xml version of the same is probably not acceptable. Also in practice, I think it is necessary at the current time to treat mozilla browsers as if they included multipart/x-mixed-replace in their accept headers.
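
To sketch what that negotiation might look like on the server side, here is a rough illustration in Python. It errs on the conservative side by only offering the streamed form when multipart/x-mixed-replace is named explicitly, and it leaves out the mozilla user-agent special case; the parsing is deliberately simple-minded.

  # Rough sketch: offer multipart/x-mixed-replace only when the client's
  # Accept header names it explicitly with a non-zero q value.
  def accepts(accept_header, media_type):
      for item in accept_header.split(","):
          parts = item.strip().split(";")
          offered = parts[0].strip()
          q = 1.0
          for param in parts[1:]:
              param = param.strip()
              if param.startswith("q="):
                  try:
                      q = float(param[2:])
                  except ValueError:
                      q = 0.0
          if q > 0 and offered == media_type:
              return True
      return False

  def choose_response_type(accept_header):
      # If the multipart type is acceptable, the inner parts are then
      # negotiated recursively against the rest of the header.
      if accepts(accept_header, "multipart/x-mixed-replace"):
          return "multipart/x-mixed-replace"
      return "text/html"

  print(choose_response_type(
      "multipart/x-mixed-replace, text/html;q=0.9, */*;q=0.5"))  # streamed
  print(choose_response_type(
      "text/xml, application/xml, text/html;q=0.9, */*;q=0.5"))  # plain html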

Benjamin

Sat, 2005-Sep-10

More HTTP Subscription

I've been actively tinkering away at the RestWiki HttpSubscription pages since my last blog entry and have started to knock some of the rough edges off my early specifications. I've begun coding up prototype implementations for the basic subscription concepts also, and have written up a little experience from that activity. I still have to resolve the issue of subscribing to multiple resources efficiently, and am currently proposing the idea of an aggregate resource to do that for me.

I'm hoping that what I'm proposing will eventually be RESTful, that is to say that at least parts of what I'm writing up will hit standardisation one day or become common enough that de facto standardisation occurs. I've been hitting more writeups of what people have done in the past in these areas and have added them to the wiki also.

There are various names for the basic technique I'm using, which is to return an indefinite or long-lived HTTP response to a client. There's server push, dynamic documents, pushlets, or simply pubsub. The mozilla family of browsers actually implements some of what I'm doing, at least for simple subscription. If you return a HTTP response with content-type multipart/x-mixed-replace, then each mime part will replace the previous one as it is received. This is a very basic form of subscription, and could be used for any kind of subscription really. That's the technique used by bugzilla to display a "wait" page to mozilla clients before returning the results of a long query. The key problems are these:

At the moment it seems we need to savagely hit the cache-control headers in order to prompt proxies to stream rather than buffer our responses to requests. If caches did understand what was going on, though, they could offer the subscription on behalf of the origin server rather than acting as a glorified router. A proxy could offer a subscription to a cached data representation, and separately update that representation using subscription techniques. This would give us the kind of multicasting for subscriptions as we currently get for everyday web documents.

Scaling up remains a problem. Using this technique, one subscription equals one TCP/IP connection. When that drops the subscription is ended, and if you need more than one subscription you need more than one connection. If you need a thousand subscriptions you need a thousand connections. It isn't hard to see how this might break some architectures.

My proposal to create aggregate resources is still a thorny one for me. I'm sure it would help in these architectures but there are issues to consider about what wrapping many http responses into a single response means. If aggregates can effectively be created, though, you could get back your one connection per client model for communications.
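
Setting those problems aside, the basic mechanism itself is small. Here is a rough sketch of a server emitting a multipart/x-mixed-replace stream, assuming Python's wsgiref; the boundary string and the one-update-per-second text content are arbitrary, and a real deployment would also need the cache-control gymnastics described above to get proxies to stream rather than buffer.

  # Minimal sketch of the push technique: one long-lived response, each
  # part replacing the last in a supporting browser. Boundary and content
  # are arbitrary; a real stream would be indefinite.
  import time
  from wsgiref.simple_server import make_server

  BOUNDARY = "update-boundary"

  def app(environ, start_response):
      start_response("200 OK", [
          ("Content-Type",
           "multipart/x-mixed-replace; boundary=%s" % BOUNDARY),
          ("Cache-Control", "no-cache"),
      ])
      def parts():
          for i in range(10):
              body = ("update number %d\n" % i).encode("utf-8")
              yield (("--%s\r\n" % BOUNDARY).encode("utf-8")
                     + b"Content-Type: text/plain\r\n\r\n"
                     + body + b"\r\n")
              time.sleep(1)
          yield ("--%s--\r\n" % BOUNDARY).encode("utf-8")
      return parts()

  make_server("", 8000, app).serve_forever()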

I'm eager to get more feedback on this topic, especially if you are developing software for my specification or are using similar techniques yourself. I have a vague notion that in the longer term it will be possible for a client to be written to a HTTP protocol that explicitly permits subscription and prepares clients for the data stream they will get back. As I said earlier there are implementations already out there, but the data on the wire is in a myriad of forms and I see the lack of consistency as an opportunity to get it right.

Benjamin

Sun, 2005-Sep-04

HTTP Subscription v3

This is my third time around the mulberry bush with HTTP subscription, and I've been lucky to have had contact with a few people on the subject now. I've even seen working code, and an internet-scale demonstration. Unfortunately I'm unable to link to the demo at the request of its author, who doesn't want a drowned server :)

In accordance with what has been developing, I have transferred some of my current thinking to the RestWiki. This site appears to be a little neglected so I've started to nibble around the edges to try and improve things. If I don't meet any resistance soon I may start really getting into the content and pushing my own view of REST and its variants.

I've updated the FAQ and intend to do some more pruning and updating after giving anyone who is actually still working on the site time to reject my input. I've also commented on an existing HTTP Subscription Specification, and most exciting of all I've collated my current thinking on HTTP subscription into a canonical resource that I intend to keep updating.

This URL will be the base of operations for any further updates and should avoid the fragmentary approach that developing a specification via blogging tends to encourage. I have split the specification into a main introductory page, a page on the necessary and desirable features of a HTTP subscription protocol, and a page for the specification itself.

I have allowed room at the bottom of the requirements and specification pages for general comments.

Benjamin

Tue, 2005-Aug-30

Arbitrary Methods in HTTP

Thanks to Robert Collins for your input on my previous blog entry about using the method name in a HTTP request as effectively a function call name. Robert, I would have contacted you directly to have a chat before publicly answering your comments, but I'm afraid I wasn't able to rustle up an email address I was confident you were currently prepared to accept mail on. Robert is right that using arbitrary methods won't help you get to where you want to go on the Internet. Expecting clients to learn more than a vocabulary of "GET" is asking a lot already, so as soon as you move past the POST functions available in web forms you are pretty much writing a custom client to consume your web service. The approach is not RESTful, doesn't fit into large scale web architecture, and doesn't play nice with firewalls that don't expect these oddball methods.

My angle of attack is really from one of a controlled environment such as a corporate intranet or a control system in some kind of industrial infrastructure. The problems of large scale web architecture and firewalls are easier to control in this environment, and that's why CORBA has seen some level of success in the past and SOAP may fill a gap in the future. I'm not much of a fan of SOAP, and the opportunities that come from dealing with a function call as (method, resource, headers, body), or to my mind as (function call, object, meta, parameter data), are intriguing to me. Of particular interest is how to deal with backwards and forwards compatibility of services through a unified name and method space, and the ability to transmit parameter data and return "return" data in various representations depending on the needs and age of the client software.

I'm also interested in whether the REST approach (or variants of it) can be scaled down to less-than-internet scale, and indeed less-than-distributed scale. I'm curious as to what can happen when you push the traditional boundaries between these domains about a little. I think it's clear that the traditional object model doesn't work on the Internet scale, so to my mind if we are to have a unified model it will have to come back down from that scale and meet up with the rest of us somewhere in the middle. I think the corporate scale is probably where that meeting has to first take place.

My suggestion is therefore that at the corporate scale a mix of restful and non-restful services could coexist more freely if they could use HTTP directly as their RPC mechanism. Just a step to the left is actual REST, so it is possible to use it wherever it works. A step to the right is traditional Object-Orientation, and maybe that helps develop some forms of quick and dirty software. More importantly from my viewpoint it might force the two world views to acknowledge each other, in particular the strengths and weaknesses possessed by both. I like the idea that on both sides of the fence clients and servers would be fully engaged with HTTP headers and content types.

I'm somewhat reticent to use a two-method approach (GET and POST only). I don't like POST. As a non-cacheable "do something" method I think it too often turns into a tunnelling activity rather than a usage of the HTTP protocol. When POST contains SOAP the tunnelling effect is clear. Other protocols have both preceded and followed SOAP's lead by allowing a single URI to do different things when posted to based on a short string in the payload. I am moderately comfortable with POST as a DOIT method when the same URI always does the same thing. This is effectively doing the same thing as python does when it makes an object callable. It consistently represents a single behaviour. When it becomes a tunnelling activity, however, I'm less comfortable.

Robert, you mention the activity of firewalls in preventing unknown methods from passing through them. To be honest I'm not sure this is a bad thing. In fact, I think that hiding the function name in the payload is counter-productive, as the next thing you'll see is firewalls that understand SOAP and still don't allow unknown function names through. You might as well be up-front about these things and let IT policy be dictated by what functionality is required. I don't think that actively trying to bypass firewalling capabilities should be the primary force for how a protocol develops, although I understand that in some environments it can have pretty earth-shattering effects.

In the longer term my own projects should end up with very few remaining examples of non-standard methods. As I mentioned in the earlier post I would only expect to use this approach where I'm sending requests to a gateway onto an unfixably non-RESTful protocol. REST is the future as far as I am concerned, and I will be actively working towards that future. This is a stepping-off point, and I think a potentially valuable one. The old protocols that HTTP may replace couldn't be interrogated by firewalls, couldn't be diverted by proxies, and couldn't support generic caching.

Thanks to Peter Hardy for your kind words also. I'll be interested to hear your thoughts on publish/subscribe. Anyone who can come up with an effective means of starting with a URI you want to subscribe to and ending up with bytes dribbling down a TCP/IP connection will get my attention, especially if they can do it without opening a listening socket.

Benjamin

Sun, 2005-Aug-28

Service Discovery Using Zeroconf Techniques

I first became aware of the existence of Zeroconf some years back as a system for assigning IP addresses in the absence of any centralised coordination such as a DHCP server. It seemed designed for small networks, probably private ones. I already had my own setups in place to cover this kind of configuration, so my interest was minimal. I've just stumbled across Zeroconf again, this time in the form of Avahi, a userland interface to zeroconf features. Even then, I didn't pay much attention until I hit Lennart Poettering's announcement that GnomeMeeting has been ported to Avahi. The mandatory screenshot shook me out of my assumption that I was just looking at a facility to assign IP addresses. The most important feature of Avahi is service discovery, and all using our old friend the SRV record combined with Multicast DNS (mDNS).

Well, it's nice to know that other people out there are trying to solve the same problems as you are. It's even nicer to hear they're using standard tools. The DNS Service Discovery page contains a bunch of useful information and draft proposals to try and solve just the problems I have been thinking about in the service discovery sphere.

For the kind of infrastructure I work with the main requirement of service discovery is that I can map a name not just to an IP address but to a port number. That's solved by SRV records, so long as standard clients and libraries actually use them. The main architectural constraint I work with is a requirement for fast failover once a particular host or service is determined to have failed. A backup instance must quickly be identified and all clients must change over to the backup as soon as possible, without waiting for their own timeouts. This seems to be partially addressed by pure mDNS, as changes to the DNS records are propagated to clients immediately via a multicast UDP message. Unfortunately such messages are unreliable, so it is possible that a portion of the network ranging from zero clients to all clients will miss hearing about the update. Polling is required to fill in the gaps in this instance. Alternatively, the DNS-SD page points to an internet-draft that specifies a kind of DNS subscription over unicast. This approach parallels my own suggestions about how to add subscription to the HTTP protocol. In fact, it would be vitally important to be able to detect server failure efficiently whenever HTTP subscription was in operation and a backup available. If no such mechanism was in place any HTTP subscription could silently stop reporting updates with noone the wiser that their data was stale.

The parallels between DNS and HTTP are interesting. Both are based on fetching data from a remote host through a series of caches. Both have evolved over the years from simple request/response pairs towards long lived connections and pipelining of requests and responses. I hope that this new DNS-subscribe mechanism gets off the ground and can clear the way for a later HTTP subscription mechanism based on the same sorts of approach. Another hope of mine is that the Avahi project is successful enough to force software developers to make service discovery an issue or be trampled in the herd. The DNS work in this area is mostly pretty good and certainly important to people like me. The server side supports SRV, but the clients are still lagging behind. Additionally, interfaces like getaddrinfo that should be applicable to SRV but do not yet use SRV records should be updated where possible. A file should be added to /etc (perhaps /etc/srv) as part of the name resolution process to support non-DNS use of these facilities without changing code.

I feel that one of the reasons that SRV has not been widely adopted is the complexity not in the initial lookup, but in the subsequent behaviour required after a failed attempt to contact your selected server. Clients are required to try each entry in turn until all fail or they find one that works. In terms of pure software design, it can be hard to propagate an error that occurs in trying to connect to a service back to the code that performed the name lookup. You have to store state somewhere as to which entries you've tried so far. That's tricky, especially when your code dealing with DNS and service lookup is already horribly complicated. I don't really know whether this kind of dynamic update for DNS would make things better or worse. In one sense things could be more complicated because once DNS says that a service has failed you might want to stop making requests to it. On the other hand, if DNS can tell you with reasonable accuracy whether a particular instance of the service is alive or dead you might be able to do without the state that SRV otherwise requires. You could work in a cycle (sketched in code after the list below):

  1. Lookup name. Get a non-failed instance based on SRV weightings.
  2. Lookup service. If this fails either report an error, or go back to (1) a fixed number of times
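
Sketched roughly in Python, assuming the third-party dnspython library is available; the retry count is arbitrary, and proper RFC 2782 weighted random selection is simplified to picking the lowest-priority, highest-weight record on each pass:

  # Sketch of the lookup/connect cycle above. On connection failure the
  # whole lookup is repeated in the hope that DNS now reflects which
  # instances are alive.
  import socket
  import dns.resolver   # third-party: dnspython

  def connect_srv(service, proto, domain, attempts=3):
      name = "_%s._%s.%s" % (service, proto, domain)
      for _ in range(attempts):
          # (1) Look up the name and pick a candidate instance.
          answers = dns.resolver.resolve(name, "SRV")
          record = min(answers, key=lambda r: (r.priority, -r.weight))
          target = str(record.target).rstrip(".")
          try:
              # (2) Try the service. On failure, go back to (1).
              return socket.create_connection((target, record.port),
                                              timeout=5)
          except OSError:
              continue
      raise IOError("no reachable instance of %s" % name)

  # e.g. sock = connect_srv("http", "tcp", "example.com")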

There is obviously a race condition in this approach. If the service fails after the initial lookup, or the notification of failure doesn't propagate until after the lookup occurs, then looking up the service will fail. Alternatively, a service instance might be restored but still appear to be failed and thus noone will try to connect to it. Additionally, these constant failed and not-failed indications propagating across the internet could have detrimental effects on bandwidth. On the other hand, they may prevent more expensive HTTP requests that would otherwise take place. Also, this is not the approach you would take with subscription. Preferably, you would keep constant watch over whether your selected service instance was still within the living set and whenever it ceased to be so drop your subscriptions and select a new instance of the service. This may add complexity back into the chain, but I still think that relying only on the DNS end of things to determine who you talk to, rather than a combination of DNS and the success or failure of attempts to actually reach a service instance, would make things simpler overall.

I think this would be an interesting area for further research. I have good feelings about how this would work on a small network with multiple services per host and with multiple instances of your services spread around the network. There may also be benefits at the internet level. Chances are that unless there are internet-scale advantages we'll continue to see existing client libraries and programs failing to really engage with the idea.

Benjamin

Sun, 2005-Aug-28

HTTP Function Calls

I've been delving pretty deeply into HTTP lately, implementing such things in a commercial environment mostly from scratch. Interfacing to existing systems with a REST leaning is interesting, and when you're viewing things from the low levels without reference to SOAP and other things that have come since HTTP you get to see what HTTP is capable of out of the box. In particular, I think you get to see that even if you aren't a RESTafarian, SOAP is probably the wrong way to approach applying functions to HTTP-accessible objects.

Standard methods

HTTP is built around some standard methods, and a subset of these are the classic methods of REST. REST proponents usually try to confine themselves to GET, PUT, DELETE, and POST. rfc2616 specifies OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, and CONNECT. Different versions of the specification have different method lists. rfc1945 covering HTTP/1.0 only includes the GET, HEAD, and POST methods. The earlier HTTP/1.1 rfc2068 didn't include CONNECT but did include descriptions for additional PATCH, LINK, and UNLINK methods. WebDAV adds PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, and UNLOCK. REST is really based not around the four traditional verbs but the total defined set of methods that are commonly understood. The use of any method that has been standardised and agreed upon is RESTful. If we go back to the Fielding dissertation we read the single sentence that sums up REST:

REST enables intermediate processing by constraining messages to be self-descriptive: interaction is stateless between requests, standard methods and media types are used to indicate semantics and exchange information, and responses explicitly indicate cacheability.

So long as we use standard methods we are being RESTful, and standard methods are any that have been agreed upon. Methods that have been agreed upon can be processed by caches in standard ways, and making caches work is the most significant architectural constraint that goes into defining the REST approach.

In fact, extensions can be added arbitrarily to HTTP. Any method not understood by a proxy along the way causes the proxy to mark the affected resource as dirty in its cache but otherwise to pass the request on towards an origin server. I think the simplest way to map function calls onto HTTP resources is to use the function name (or a variant of it) as the HTTP method rather than including it in the HTTP body. This is made a little tricky by two factors. The first is that if anyone ever does come along and define a meaning for your method then caches might try to implement that meaning. If you're lucky they'll still only treat it as a cache clearing operation and a pass-through. On the other hand you might not be lucky. Also, new clients might come along expecting your method to behave according to the new standard semantics and cause further problems. Methods are effectively short strings with universal meaning. Dan Connolly has this to say:

There is a time and a place for just using short strings, but since short strings are scarce resources shared by the global community, fair and open processes should be used to manage them.

So to make the best of it we shouldn't be using methods that might clash with future meanings. I suggest that using a namespace in the HTTP request method would eliminate the possibility of future clashes and make the HTTP method something more than the wrapper for a function call. It can be the function identifier.

URIs as methods, giving up "short strings"

The second issue arises from the first: exactly how do we do this? Well, according to rfc2616 any extension method just has to be a token. It defines a token as a run of characters excluding those in the string "()<>@,;:\\\"/[]?={} \n\r". This rules out a normal URI as a method, since a URI would at least need slash (/) characters to separate authority and path. I suggest a simple dotted notation similar to that of java namespaces would be appropriate. Now, we are no longer being RESTful by heading down this path. We aren't using standard methods anymore. On the other hand, perhaps it is beneficial to be able to mix the two approaches every once in a while. Perhaps it would help us to stop overloading the POST method so badly to make things that don't quite fit the REST shape work.

Practical examples, and the spectrum of standardisation

Your new method might be called ReloadConfiguration.example.com. It might even be tied to a specific class, and be called ReserveForExecution.control.example.com. If your method became something that could be standardised it might eventually be given a short name and be included in a future RFC, at which time the RFC might say that for compatibility with older implementations your original dotted name should be treated as being identical to the short name.
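
As a small illustration, Python's standard http.client will happily send a namespaced method like the hypothetical ReloadConfiguration.example.com above; the host, path, and headers here are invented.

  # Sketch: the dotted method name is the function identifier, the
  # request URI is the object it applies to.
  import http.client

  conn = http.client.HTTPConnection("server.example.com", 80)
  conn.request(
      "ReloadConfiguration.example.com",   # the function identifier
      "/spooler",                          # the object it applies to
      body=None,
      headers={"Accept": "text/plain"},
  )
  response = conn.getresponse()
  print(response.status, response.reason)
  print(response.read().decode("utf-8"))
  conn.close()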

Mostly-RESTful? Mixed REST and non-REST.

I think that author James Snell largely gets it wrong in his IBM developerworks article on why both REST and SOAP have their places. I think his view of REST is a bit wonky, but to lift things up to a more abstract level I think he has some interesting points. He seems to think that some operations can't easily be translated into state maintained as resources. He thinks that sometimes you need to trigger activities or actions that don't work in a RESTful way. Mostly, he wants to be able to employ traditional GoF-style OO patterns in his distributed architecture. I don't find myself agreeing with any of his examples, but in trying to retrofit REST onto an existing architecture I do find myself not wanting to take the big hit all at once. What I want is a migration path. Even when I get to the end of that path I think there will still be some corners left unRESTful. There are places where one application is already a converter to another protocol. That protocol isn't RESTful, so it seems unlikely to me that putting a RESTful facade over the top of it will ever do any good. That protocol is also an industry standard, so there's no chance of tweaking it to be more like we might like in our RESTful world.

What I'm suggesting is that by supporting non-RESTful concepts alongside RESTful ones it becomes possible to shift things slowly over from one side to the other, and find the right balance where the fit isn't so good for REST.

Once you start heading down this path you see some interesting features of HTTP. After allowing you to select the function to call and the object to call it on, HTTP allows you to provide a body that works like the parameter list, and extra metadata which you might use along the way in the form of headers. Now that we have cleared the method identifier out of the body it can be decoupled from its content. We can effectively have polymorphism based on any of the metadata headers, such as content-type. We can handle multiple representations of the same kind of input. We can also introspect sophisticated content types such as XML to determine how to do further processing. In my case I found that I had to map onto a protocol that supported string positional parameters. For the moment the simplest fit for this is CSV content. Just a single CSV line supports passing of positional parameters to my function. In the future as the underlying protocol changes shape to suit my needs I expect to support named parameters as an XML node. Eventually I want to push the HTTP support back to the server side, so any resource can decide for itself how to handle content given to it.
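
A rough sketch of what that dispatch might look like on the server side, with the method already lifted out of the body and a single CSV line carrying the positional parameters; the handler name, method name, and values are illustrative only.

  # Sketch: choose a handler by (method, content-type), then read the
  # body as a positional parameter list carried in one CSV line.
  import csv
  import io

  def parse_positional_csv(body_text):
      # A single CSV line carries the positional parameters.
      return next(csv.reader(io.StringIO(body_text)))

  def reserve_for_execution(resource, params):
      return "reserved %s for %s" % (resource, ", ".join(params))

  HANDLERS = {
      ("ReserveForExecution.control.example.com", "text/csv"):
          lambda resource, body: reserve_for_execution(
              resource, parse_positional_csv(body)),
  }

  def dispatch(method, resource, content_type, body):
      try:
          handler = HANDLERS[(method, content_type)]
      except KeyError:
          return 501, "method/representation not supported"
      return 200, handler(resource, body)

  print(dispatch("ReserveForExecution.control.example.com",
                 "/executor/4", "text/csv", "job42,highpriority"))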

Conclusion

I hope that HTTP will support software revisions that become more and more RESTful over time, but for the moment I can concentrate on supporting necessary functionality without a massive flow-on impact.

Benjamin

Sat, 2005-Aug-20

Disposable Software

I just read these articles published by Ryan Tomayko's lesscode site on disposable software and the fallacy of reuse. They immediately strike a chord with me. Especially when you're talking about inexperienced software developers it is easy to go down the path of overengineering with a justification of "it'll make implementation of the next set of requirements easier" or "it'll make it more generic". The last few years of agile development methodologies have been working to belie these arguments. Indirection is not abstraction.

When you write more code to do the same thing,
when you put a wrapper in front of a perfectly serviceable abstraction or concrete class,
when you hide what is really going on in order to make your code "higher level",
when you write software you don't have requirements for yet,
when you prepare the way for the coming king,
when you code a generic design before you have two working concrete designs and three sets of concrete requirements,
you make it harder to understand,
you make it harder to maintain,
you make it more costly both now and into the future.

Software is like a user interface. The fewer concepts your users have to be aware of the more likely they'll be able to write software that uses yours. The fewer concepts you have in your code the more likely someone else will be able to review and maintain it.

Reusable software is not a valid goal in and of itself, because in the very act of creating a great reusable component you introduce another concept into your software, perhaps more than one. Your software is worse because of it. Unused reusability is actively harmful. Design what you need and go no further.

Your software is not your baby. It is not something you can hang onto. Be ready to throw it away and start again. The worth of your software is not in your software. The worth of your software is in you. The intellectual property of the company you work for is not what they gave you thousands of dollars to build. It is what they gave you thousands of dollars to become.

Benjamin

Sun, 2005-Aug-14

Types and Contracts

I've just come across this May article by Michael Ellerman. Michael responds to some inflammatory language I used in my own response to a Joel Spolsky article. I obtusely referred to "real work" and my doubt that python is suited to it. Both my article and Michael's are centred around the concepts of type and of Design By Contract. I have since written about type in the context of the REST architectural style.

Compile time or runtime?

My feelings are still mixed about the use of explicit type hierarchies. On one hand I have years of experience in various languages that show me how type can be used by the compiler to detect bugs even when no test code has been written, or when the test code doesn't achieve 100% coverage of the input space. That capability to detect error is important because complete coverage is theoretically impossible for most functions. The gap between what is tested and what is possible must be covered to give any confidence that the code itself is correct. Stop-gaps have traditionally come either in the form of manual code review or automatic type-supported compiler checks. On the other hand, it is clear that even a carefully-constructed type hierarchy isn't capable of detecting all contract breaches at compile time. Function preconditions must still be asserted, or functions must be redefined to operate even in cases of peculiar input. The assertions are run time checks in languages I'm familiar with, and it may not be possible to convert all of them to compile time checks.

Valid and invalid function domains

It should always be possible to define the valid input space of a function. Proving that the actual input is a member of the set of valid input is typically harder. Type hierarchies are a tool to tell the compiler that the appropriate reasoning has been done by the time execution reaches the function. Unfortunately, the complexity of a type system that could effectively define the domain of every function in your program is so great that implementing things this way may cripple your programming style. You would definitely be taking a step away from traditional procedural approaches and moving firmly towards a mathematical or functional view of your application. These world views have traditionally been more difficult for developers to grasp and work effectively in, except perhaps for the especially clever among us.

The other thorn in the side of type systems is actual input to the program. It typically comes from a user, or perhaps a client application. The process of converting the data from raw bytes or keystrokes into useful input to a complex typing system can be quite tricky. Your program has to examine the data to determine which type best reflects its structure, and this must take into account both what it is and how it will eventually be used. It is trivially provable that any function that accepts unconstrained input and tries to construct objects whose types match the preconditions of another function must be runtime checked, insofar as the unconstrained input is a superset of the legal constrained input. Because it is important to know how data will be used before using it, these runtime-checked functions are likely to appear throughout your actual program. The more I/O-connected your program is, the more you will have to deal with runtime checking.

Bucking the type hierarchy

I believe that REST and a number of other approaches which reject planned type hierarchies are founded in this area, where handling of data from outside the program is difficult or impossible to isolate to a small part of the program's code. When this is the case the use of formal typing may not add any value. Just apply the same paradigm everywhere: deal with a very basic fundamental type and allow the data to be rejected if required. Handle the rejection in a systematic way. To me, the use of exceptions is a hint that this approach was brewing in headline Object-Oriented languages long before the pythons and RESTs of this world found a mass appeal. Exceptions are thrown when input to a function is not within the function's domain, but when that condition is not necessarily an indication of error within the program. If the error occurred because of faulty input outside the program's control it makes sense to throw and catch exceptions so that the original input can be rejected at its source. The alternative would be to deal with the error wherever it occurred or to specifically structure your program to include error-handling modules or classes. Within the classical strongly-typed O-O system without concerns for external input exceptions are evil, but when used to handle faulty external input consistently they can be a force for good.
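
A small sketch of that pattern, with invented field names: the boundary converts unconstrained input into constrained values or raises, and the rejection is handled back where the input arrived.

  # Sketch: external input is checked at the boundary, an exception
  # carries the rejection back to the point where the input arrived, and
  # internal code can then assume its preconditions hold.
  class InvalidInput(Exception):
      pass

  def parse_order(raw_fields):
      # Boundary: unconstrained text in, constrained values out.
      try:
          quantity = int(raw_fields["quantity"])
      except (KeyError, ValueError):
          raise InvalidInput("quantity must be an integer")
      if quantity <= 0:
          raise InvalidInput("quantity must be positive")
      return {"item": raw_fields.get("item", ""), "quantity": quantity}

  def handle_request(raw_fields):
      # The source of the input is the right place to handle rejection.
      try:
          order = parse_order(raw_fields)
      except InvalidInput as err:
          return "400 Bad Request: %s" % err
      return "202 Accepted: %r" % (order,)

  print(handle_request({"item": "widget", "quantity": "3"}))
  print(handle_request({"item": "widget", "quantity": "lots"}))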

The SOAP Intervention

Like CORBA, SOAP attempts to extend the Object-Oriented programming model across programs. This allows servers to treat input data not as streams of bytes but as fully formed structures and object references with their place in the type hierarchy already assigned. Various attempts have come and gone to actually serialise and deserialise objects between processes. These approaches should make it possible to eliminate significant network-facing handling of very basic types. Now, if only we could make the user start entering serialised objects as well... ;)

These approaches are interesting. They fail to some extent because of the assumptions they tend to make about network latency being small. They fail somewhat because they don't adequately support concepts of caching and of redirecting requests. On the other hand, I think there is some evidence these days that they fail simply because they have a formal type hierarchy. If a programming language is like any other program, it benefits from being able to do what the user wants without the user having to understand too many concepts. These systems are built around the idea that the large organised way we have written software in the past is the correct one, and that all we need to do is expose this programming model on a larger scale. I think it was the success of Visual Basic that first saw this world view really shaken. It turns out that when you allow the average bloke off the street to write software, the software he writes is pretty simple. It may have less stringent performance and safety requirements than what we professionals believe we have. It may be smaller and less sophisticated than we tend to come up with. On the other hand, it works. It does a job that that guy certainly wouldn't have hired us to do for him. It scratches an itch and empowers him. I think there's some evidence that more software will be written in the future at that end of the scale than at the end we're used to working from. In fact, I suspect that one day we'll all be writing software using the same mental models that Joe Bloggs works with, and we'll find that we can still do the good job we think we're doing now using those approaches.

Conclusion

Michael proposes the use of assert isinstance(s, SafeString) to do type checking in python. I'm not sure this is really "good python", which normally focuses on what an object can do rather than what type you've been able to tag it with, but that's an aside. This isn't as useful as the type checking in C++ or Java because it is performed at runtime and only triggers when the input is not what was expected. He points out that C++ does no better on more complex contract provisions such as "i < 3", which must be made a precondition. My original point with respect to this issue is really about how far testing can take you on the path to program correctness. Using hungarian notation as Joel Spolsky originally suggested may help a code reviewer determine program correctness, but in the python world there is no automatic genie to help you pick up those last few percent of missed errors. In a large and complex program with incomplete test coverage my experience says that suitable compile-time type checking can find errors that code reviewers miss. On the other hand, if you're dealing with a lot of external input (and let's face it, we all should be these days) type may just not help you out at all. My feeling is that formal type hierarchies will eventually go the way of the cathedral as we just stop writing so much internally-facing code. My feeling is that we'll be much more interested in pattern matching on the input we do receive. Given python's use profile it is probably appropriate that input outside of function domains throws an exception rather than tripping a runtime assertion, but in the domains I'm used to working in it doesn't seem that python is fundamentally different enough to, or more advanced than, the languages we've seen in the past. It is not yet so far advanced as to warrant or make up for such a scaling back of tool support for compile-time checking. On the other hand I keep talking up REST concepts and the death of type to my own colleagues. I guess you'd say that right now I'm still in the middle, and want to see more return for giving up my comfortable chair.

Benjamin

Sun, 2005-Aug-07

Internet-scale Subscription

When characterising an interface within a distributed system I ask myself a series of questions. The first two I ask together.

  1. Which direction is the data flowing, and
  2. Who is responsible for making this interface work?

I also ask:

The first is easy. It starts where the data is, and it ends where the data isn't. It's a technical issue. The second is more complicated and is focused around where the configuration lives and who is responsible for maintaining that configuration, especially configuration for locating the interface. The client is responsible for knowing a lot more than the server, although it may discover some of what it needs to know as it accesses the interface itself. Web browsers get all the information they need for navigation from links on a page, but even they need a home page or a manually-entered URL to get them going. Machine to machine interactions don't have the luxury of a human operator telling them which way to go so typically have fewer steps between their starting configuration and the configuration they use in practice to operate.

When data flows from client to server life is easy. The client can push, and if its data doesn't get through it can just try again later. It can use bandwidth fairly efficiently even with the HTTP protocol. The use of pipelining allows the client to use available bandwidth efficiently so long as it is generating an idempotent sequence of requests. Mixes of GET and PUT aren't idempotent (although each request itself should be), so they can cause stalls on the pipeline and reduce performance to a level that depends on latency rather than bandwidth. Depending on the client-side processing this may be able to be avoided altogether or contained in a single client function to avoid overall performance bottlenecks. This is important because it is easier to increase bandwidth than to reduce latency. Unfortunately that has something to do with the speed of light.

The problem is that data often flows both ways. You could reverse the client-server relationship depending on which way the data flows, and sometimes this is appropriate. On the other hand, you're sure to eventually need to push data from server to client. Today's Internet protocols aren't good for this. To reverse the client-server relationship in HTTP you need a HTTP server. That isn't hard. The hard part is opening the port required to accept the associated connections.

At present we have a two-tier internet developing. We have the high end of established servers that can accept connections and participate in complex collaborations. We also have the low end of ad hoc clients trapped behind firewalls that will allow them to connect out but not allow others to connect back in. SOAP protocols are devised for the top-tier Internet, and rely on two-way connectivity to make things work. I think our target should be the second tier. These clients are your everyday web browsers, and when data has to flow from a server to one of these web browsers we don't have any good established options open. Clients are reduced to polling in order to allow data to flow from the server.

HTTP Subscription

The GENA protocol I mentioned previously is built for the top tier internet. On the bottom tier the following constraints apply:

That rules out everything so far proposed, I think. I have spent some time on this myself, though. I have a protocol that is regular HTTP and only connects out from the client. There are some other considerations I would like to see work:

The whole thing should be RESTful, so

Here is the closest I've been able to come up with so far:

The theory is that the subscription keeps track of whether new data is available at any time. When the NEXT request arrives it returns the data immediately if it is available. If new data isn't available it holds off replying until data is available. If a proxy is sitting in between client and server it would eventually time out, causing the client to issue a new NEXT request.

Clearly this approach has problems. I think the creation of the subscription is fine, but the actual subscription has several problems. The first is that it can't make use of available bandwidth. This problem is endemic to the proxy behaviour and can't be solved without a change to the HTTP protocol that allows multiple responses to a single request rather than this single request/response pair. The second is that no confirmation is given back to the server. A response may be sent down the TCP/IP connection that the NEXT request arrived on but never be transmitted to the client due to a connection being closed. This can be solved by adding a URI to both the NEXT request and the two responses available for the NEXT and SUBSCRIBE requests. As an alternative, NEXT requests may have to be directed to the URI (URL) of the next value rather than being sent to the subscription. Responses would have to specify a URI that should be used in the next NEXT request passed to the subscription. If the URI matches what is currently available the server should return the data immediately with a new URI. If the URI matches an older state the server should return that state, but also indicate how many updates were missed (if possible) back to the client. If the URI is still a future URI (the next URI) the response should be deferred.
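
For what it's worth, here is a rough sketch of the client side of this NEXT-based approach. It leans on assumptions: SUBSCRIBE is taken to create a subscription resource named in a Location header, NEXT is issued against that resource, and anything other than a 200 (including a proxy timeout) is simply treated as a cue to ask again. Host and resource names are invented.

  # Sketch of a NEXT-style client. SUBSCRIBE and NEXT are the proposed
  # verbs; the details elided above (URI handling for missed updates,
  # confirmation) are not reproduced here.
  import http.client

  def create_subscription(host, resource):
      conn = http.client.HTTPConnection(host)
      conn.request("SUBSCRIBE", resource)
      response = conn.getresponse()
      response.read()
      subscription = response.getheader("Location")
      conn.close()
      return subscription

  def watch(host, resource):
      subscription = create_subscription(host, resource)
      while True:
          conn = http.client.HTTPConnection(host, timeout=60)
          try:
              conn.request("NEXT", subscription)
              response = conn.getresponse()
              if response.status == 200:
                  yield response.read()   # the next state of the resource
              # any other status (proxy timeout, 5xx) falls through and a
              # fresh NEXT request is issued
          except (http.client.HTTPException, OSError):
              pass                        # connection dropped; try again
          finally:
              conn.close()

  # for state in watch("server.example.com", "/sensor/7"):
  #     print(state.decode("utf-8"))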

The proxies get in the way of a decent solution. Really, the only solution is to come up with a new protocol (perhaps a special extension of HTTP) or use a different existing protocol.

XMPP Subscription

At least one person I know likes to talk about Jabber whenever publish-subscribe comes up. Here is the standard defined for XMPP. The thing that immediately gets the hairs at the back of my neck going is that Jabber doesn't seem to immediately support REST concepts. It's a message passing system where the conceptual framework relies on you connecting to someone who knows more than you about how to find things on the network. That doesn't seem right to me. I prefer the concept of caches that mirror network topology to the idea of connecting to some server three continents away that might be arbitrarily connected to some other servers but most probably is not connected to everything, just a subset of the Internet. My thinking also leads me to think of publish-subscribe as an intrinsic part of a client-server relationship rather than this thing that you slap onto the top of either an established HTTP protocol or an established XMPP protocol.

Those things said, oh gosh it's complicated. I really think that you don't need much over existing HTTP to facilitate every possible collaboration technique, but as you enter the XMPP world you are immediately hit with complicated message types combined with HTTP return codes combined with more complicated return nodes. The resource namespaces don't seem well organised. I'm not sure I quite understand what the node "generic/pgm-mp3-player" refers to in the JEP's example. It's all peer to peer rather than client server and... well... I'm sorry that I can't say I'm a fan. Maybe once it's proven itself a little more I'll give XMPP another look.

Conclusion

I've already suggested some more radical approaches to adding subscription support to HTTP. I do believe it's a first class problem of an internet scale protocol and should be treated as one. I think that making appropriate use of available bandwidth is an important goal and constraint. Unfortunately, I believe that working with the existing Internet infrastructure is also important. At the moment proxies make this a hard problem to solve well. In the interim, feel free to try out my "NEXT" HTTP subscription protocol and see how you like it. It may at least open things up to the second tier of users.

Benjamin

Sat, 2005-Jul-23

Generic Event Notification Architecture

I was recently asked my opinion of the Generic Event Notification Architecture (GENA). It is a subscription protocol that uses HTTP as its transport. A client makes a subscribe request to a URI, and the server is responsible for returning notifications via separate HTTP requests back to the client. The protocol was submitted by Microsoft to the IETF as a draft in September 2000, and it is a little unclear as to whether it has seen any sort of committed adoption. It may be that it has since been superseded in the minds of Microsoft employees by SOAP-based protocols.

GENA uses a HTTP SUBSCRIBE verb to make requests of the server. The request is submitted to a specific URI which represents the subscribe access point. The subscription must be periodically confirmed with an additional subscription request. One of the HTTP headers in the original SUBSCRIBE response carries what is known as a Subscription ID, or SID. The same SID header must be included in the additional SUBSCRIBE requests. Each subscription can specify the kinds of event notifications this client is interested in receiving, associated with the original resource it subscribed to. SUBSCRIBE requests include the URI that the server should NOTIFY when the event appears.

I have qualms generally about subscription models that require the server to connect back to the client. This confuses matters significantly when firewalls are involved, but on a purely philosophical level it makes what is fundamentally a client-server relationship into one of two peers. I'll get back to that concern, but I think there are other aspects of the protocol that could do with some fine tuning as well.

The protocol is almost RESTful. It allows different things to be subscribed to by specifying different resources. It allows n-layered arbitration between the origin server and clients, just like HTTP's caching permits. It gets confused, though, and I think the SID is a prime example of this. The SID identifies a subscription, but instead of being a URI it is an opaque string that must be returned to the original SUBSCRIBE URI. If I were writing the protocol I would turn this around and clearly separate these two resources. You have a resource that acts as a factory for subscriptions and is the thing you want to subscribe to, and you have a subscription resource. I would suggest that the subscription resource be a complete URI that is returned in a Location header to match the effect of POST. It might even be reasonable to use the POST verb rather than a SUBSCRIBE verb for the purpose.

Once the subscription resource is created, it should be able to be queried to determine its outstanding lifetime. A 404 could be returned should the lifetime have been exceeded, and a PUT could be used to refresh the lifetime or even alter the set of events to be returned. From the protocol's perspective, though, it is probably simplest just to define the effect of a SUBSCRIBE operation on the subscription as refreshing the timeout and leave the rest to best practice or a later draft.
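
Sketched as a client, the reworked interaction might look something like this; the host, paths, and the GENA-style Timeout header value are illustrative only.

  # Sketch of the proposed rework: POST to the subscribable resource
  # creates a subscription resource named in the Location header, which
  # can then be queried or refreshed in its own right.
  import http.client

  HOST = "device.example.com"

  def create_subscription(factory_path):
      conn = http.client.HTTPConnection(HOST)
      conn.request("POST", factory_path,
                   headers={"Timeout": "Second-1800"})
      response = conn.getresponse()
      response.read()
      subscription = response.getheader("Location")  # e.g. /subscriptions/42
      conn.close()
      return subscription

  def remaining_lifetime(subscription_path):
      conn = http.client.HTTPConnection(HOST)
      conn.request("GET", subscription_path)
      response = conn.getresponse()
      body = response.read()
      conn.close()
      if response.status == 404:
          return None       # the subscription has expired
      return body           # a representation including its lifetime

  def refresh(subscription_path):
      conn = http.client.HTTPConnection(HOST)
      conn.request("PUT", subscription_path,
                   headers={"Timeout": "Second-1800"})
      conn.getresponse().read()
      conn.close()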

Returning to the issue of how updates are propagated back to clients, I've harped on before about how I believe this needs to be a change to the HTTP protocol rather than just an overlay. I believe that a single request needs to be able to have multiple responses associated with it, which will arrive in the order they were sent down the same TCP/IP connection as the request was made on. Dropping the connection drops all associated subscriptions just as it aborts responses to any outstanding requests. I agree that this approach may not suit loosely-coupled subscribe scenarios that don't want the overhead of one TCP/IP connection for each client/server relationship, but the GENA authors appear to also have been thinking along these lines. The draft includes the following:

We need to add a "connect and flood" mechanism such that if you connect to a certain TCP port you will get events. There is no subscribe/unsubscribe. We also need to discuss this feature for multicasting. If you cut the connection then you won't get any more events.

To turn specific focus back on GENA, I think that the HTTP callback mechanism is still underspecified. In particular it isn't clear what the responsibilities of the server are in returning responses. The server could use HTTP pipelining to deliver a sequence of notifications down the same TCP/IP connection, but what should it do when the connection blocks? The server could try to make concurrent connections when multiple notifications need to be sent, but which will arrive first? Will out of order notifications cause the client to perform incorrect processing? Can the client assume that the latest notification represents the current state of the resource? Infinite buffering of events is certainly not an option, so what do you do when you exceed your buffer size? Do you utilise your bandwidth via pipelining, or do you limit your notification rate to the network latency by waiting for the last response before sending another? I don't see any mention in the protocol of an "Updates-Missed" header that might indicate to the client that buffering capabilities had been exceeded.

The specification also allows the server to silently drop subscriptions, a point of which clients may be unaware until it comes time to refresh the subscription. For this to work in practice the cases under which subscriptions could be dropped without notification would have to be well understood.

The actual content being delivered by GENA is unspecified, but GENA does include mechanisms for specifying event types. Personally, I think that the set of resources should be included in the definition of the subscribe URI rather than a special "NT" or "NTS" header. I think it's more RESTful to create separate resources for these separate things you might want to subscribe to than to alias the SUBSCRIBE for a single resource to mean different things depending on header metadata. If we were to take a RESTful view, we would probably want to assume that each update notification's body was a statement of the current representation of the resource. In some cases a kind of difference might also be appropriate. If caching is to be supported in this model the meaning of that content would have to be made as clear as possible, and may have to be explicitly specified in a header just as HTTP's chunked encoding is.

In conclusion, GENA is a good start but could do with some tweaking. I don't know whether the rfc is going anywhere, but if it ever does I think it would be interesting to view and refine it through REST goggles.

Benjamin

Sun, 2005-Jul-17

RDF Content

The intersection between RDF and REST is one I've had difficulty finding. RDF seems great on the surface, but problems crop up as soon as I try and think of anything to use it for. I think after my previous article on the purposes of REST verbs and content types I can finally put a finger on my unease.

When a client requests a document, it does so with a specific purpose and a firm idea of what it wants to do with it. The first step in processing the input is to try and transform it into an internal representation suitable for doing the subsequent work. If there's one thing that XML is good at, it is transformation. If there's one thing that RDF is good at, it's aggregation. I think that the reason RDF has not yet hit its mark is that transformation is a more important function than aggregation when it comes to most machine to machine interactions.

RDF can be expressed in XML, and many people will tell you what's wrong with the current standard and try to offer alternative solutions. Some will complain that it is overly verbose. My beef is simply that there are too many ways to say the same thing, and when you have multiple representations to deal with on your input side your transformation code must become more complicated. It strikes me that most of the document describing the current rdf/xml standard is taken up explaining the many ways you can say the same thing in an attempt to reduce verbosity.
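
As a small illustration of the "too many ways" problem, the two documents below express exactly the same triple, once as a property attribute and once as a property element. A transform has to cope with both shapes, while an RDF parser sees no difference at all. This is only a sketch, and it assumes the rdflib library is available.

    from rdflib import Graph

    doc_a = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                        xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://example.com/post/1" dc:title="RDF Content"/>
    </rdf:RDF>"""

    doc_b = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                        xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://example.com/post/1">
        <dc:title>RDF Content</dc:title>
      </rdf:Description>
    </rdf:RDF>"""

    g_a = Graph().parse(data=doc_a, format="xml")
    g_b = Graph().parse(data=doc_b, format="xml")
    assert set(g_a) == set(g_b)   # same triples, different XML shapes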

So while it is possible to create RDF-compatible XML that is easy to transform, it isn't possible to tell someone simply that your service returns rdf/xml of a particular rdf schema and hope that you're making things easier for them. You're much better off giving them RELAX-NG instead.

Despite this flaw, I still think RDF is useful. Despite it currently being harder than it needs to be to transform, it does make aggregation possible. XML doesn't support that itself at all. So like the previous article's "we use verbs so that caching works" theme, today's theme will be "we use RDF-compatible XML as our content type so that aggregators work". Aggregators are intermediaries like caches or databases that have to hang onto data and its meaning on behalf of clients who might come along later. Even this could have its problems. A pure RDF aggregator would require client software to still be quite complex in order to process (transform) the returned RDF. I suspect that specialised aggregators like those for rss and atom will be more fundamentally useful in the short term. The solution seems to be to improve the transformability of RDF generally, although I don't have a fundamentally good answer as to how. The use of rules engines like Jena may have some impact.

Of course, this is only about how RDF intersects with REST. I think RDF is proving itself mightily in the RDBMS sphere. Both client and server applications can back themselves with RDF triple stores that support ad hoc data insertion and query. No central authority has to be in charge of the schema, and this responsibility can be distributed amongst different groups. In fact, I would say that trying to design a database technology these days without considering RDF would be a bit of a waste of time. All the good sql-only databases have been written already, and most of these have also seen the RDF light.

Benjamin

Sat, 2005-Jul-16

File Tagging

Watching my mother trying to use Windows XP to locate her holiday snaps makes it clear to me that tagging is the right way to interact with personal documents. The traditional "one file, one location" filesystem is old and busted. The scenario begins with my mother learning how to take pictures from her camera and put them into folders. Unfortunately, my father is still the one managing short movie files. The two users have different mental models for the data. They have different filing systems. Mum wants to find files by date, or by major event. Dad thinks that movie files are different to static images and that they should end up in different places. The net result is that Mum needs to learn how to use the search feature in order to find her file, and is lucky to find what she is looking for.

Using tags we would have a largely unstructured collection of files. The operating system would be able to apply tags associated with type automatically, so "mpeg" and "video" might already appear. The operating system might even add tags for time and date. The user might add additional tags such as "21st Birthday" or "Yeppoon Trip". Tags are associated with the files themselves and can be added from anywhere you see the file. You'll then be able to find the file via the additional tag. This approach seems to work better than searching or querying does for non-expert computer users. A query has to be constructed from ideas that aren't in front of the user. Tags are already laid out before them.

Here is one attempt at achieving a tagging model in a UNIX filesystem. I'm not absolutely sure that soft links are the answer. Personally I wonder if we need better hard links. If you delete a hard link to a file from a particular tag set it would disappear from there but stay connected to the filesystem by other links to it. It shouldn't be possible to make these links invalid like it is with symbolic links. Unfortunately hard links can't be used across different filesystems. It would be nice if the operating system itself could manage a half-way point. I understand this would be tricky with the simplest implementation resulting in a copy of the file on each partition. Deciding which was the correct one when they dropped out of sync would be harmful. Perhaps tagging should simply always happen within a single filesystem. Hard links do still have the problem that a rename on one instance of the file doesn't trigger a rename to other tagged instances.
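
For what it's worth, a hard-link flavoured tagging scheme is easy to sketch in Python. The tag directory layout and function names below are my own assumptions, and the sketch inherits exactly the limitations discussed above: links can't cross filesystems, and renaming one link does nothing to the others.

    import os

    TAG_ROOT = os.path.expanduser("~/tags")      # assumed location for per-tag directories

    def tag(path, tag_name):
        """Attach a tag by hard-linking the file into a per-tag directory."""
        tag_dir = os.path.join(TAG_ROOT, tag_name)
        os.makedirs(tag_dir, exist_ok=True)
        # a hard link keeps the file reachable even if other links are deleted,
        # but it only works within a single filesystem
        os.link(path, os.path.join(tag_dir, os.path.basename(path)))

    def files_with_tag(tag_name):
        tag_dir = os.path.join(TAG_ROOT, tag_name)
        return sorted(os.listdir(tag_dir)) if os.path.isdir(tag_dir) else []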

Benjamin

Sat, 2005-Jul-16

The Visual Display Unit is not the User Interface

One thing I've noticed as I've gotten into user interface design concepts is that the best user interface is usually not something you see on your computer screen. When you look at an iPod it is clear how to use it, and what it will do. A well designed mobile phone such as the Nokia 6230 my wife carries makes it easy to both make phone calls and to navigate its menus for more sophisticated operations. A gaming console like the PS2, the Xbox, or the Game Boy is easy to use. Easier than a PC, by miles.

Joel Spolsky has a wonderful work online describing how user interfaces should be designed. He devotes a whole chapter to "Affordances and Metaphors", which very simply amounts to giving the user hints on when and how to click. In the same document he highlights the problem that makes desktop software so hard to work with generally:

Users can't control the mouse very well.

I've noticed that whenever I'm finding a device easy to use, it is because it has a separate physical control for everything I want to do. Up and down are different buttons, or are different ends of a tactile device. I don't have to imagine that the thing in the screen is a button. Instead, the thing in the screen obviously relates to a real button. I just press it with my stubby fingers and it works.

So maybe we should be thinking just as much about what hardware we might want to give to users as we think about how to make our software work with the hardware they have already. Wouldn't it be nice if instead of a workspace switcher you had four buttons on your keyboard that would switch for you? Wouldn't it be nice if instead of icons on a panel, minimised applications appeared somewhere obvious on your keyboard so you could press a button and make the applications appear? There wouldn't be any need for an expose feature. The user would always be able to see what was still running but not visible.

Leon Brooks points to a fascinating keyboard device that includes an LCD on each button. This can be controlled by the program with current focus to provide real buttons for what would normally only be "visual buttons". I think this could make the app with focus much more usable, both by freeing up screen real estate for things the user actually wants to see and by making buttons tangible instead of just hoping they look like something clickable. Personally I would have doubts about the durability of a keyboard like this, but if it could be made without a huge expense and programs could be designed to work with it effectively I think it could take off.

We can see this tactile approach working already with scrollwheel mice and keyboards. It is possible to get a tactile scroll bar onto either device without harming its utility and while making things much simpler to interact with. Ideally, good use of a keyboard with a wider range of functions would remove entirely the need for popup menus and for buttons on the screen. In a way this harks back to keyboard templates like those for WordPerfect. I wonder whether, now that the excitement over GUIs and mice has died down, this approach will turn out to be practical after all.

Update 18 October 2005:
United Keys has a keyboard model that is somewhat less radical in design and looks to be a little closer to market. Thanks to commenter Tobi on the Dutch site Usabilityweb for pointing it out. I like the colour LCD promised in the Optimus keyboard, but suspect that the United Keys approach of segregating regular typing keys from programmable function keys will wear better. Ultimately it has to both look good and be a reasonable value proposition to attract users.

Benjamin

Sat, 2005-Jul-16

Solaris C++ Internationalisation

One of my colleagues has been tasked with introducing infrastructure to internationalise some of our commercial software. We are currently running Solaris 9 and using the Forte 8 C++ compiler. We decided that the best way to perform the internationalisation would be to use a combination of boost::format and something gettext-like that we put together ourselves. That's where the trouble started.

The first problem was the compiler itself. It can't compile boost::format due to some usage of template metaprogramming techniques that were slightly beyond its reach. It quickly became clear that upgrading to version 10 of the compiler would be necessary, and even then patches are required to build the rest of boost.

That was problem number one, but the hard problem turned out to be in our use of the -library=stlport4 option. Stlport appears not to support locales under Solaris. We've been tracking Forte versions since the pre-standardisation 4.2 compiler, and that's just while I've been working there. We originally used stlport because there was no alternative, but when we did upgrade to a compiler with a (roguewave) STL we found problems changing over to it. When we got things building and our applications were fully loaded up with data we found they used twice as much memory as the stlport version. At the time we didn't have an opportunity to upgrade our hardware so that kind of change in memory profile would have really hurt us. With no impetus for change we decided to stick to the tried and true.

By the time Forte hit version 8 it had the -library=stlport4 option to use an inbuilt copy of the software and we stopped using our own controlled version. We found at the time that a number of STL-related problems being reported through sunsolve were being written off with "just use stlport" so weren't keen to try the default STL again. These days it looks like this inbuilt STL hasn't been modified for some years. It does support non-C locales, but moving our software over is a new world of pain.

Another alternative was to use gcc. Shockingly, the 3.4.2 version available from sunfreeware produced incorrect code for us when compiled with -O2 for sparc. This also occurred in the latest 3.4.4 version shipped by blastwave. I haven't looked into the problem personally to ensure it isn't something we're doing, but the people who did look into it know what they're doing. Funnily, although the sfw 3.4.2 version did support the full range of locales, blastwave's 3.4.4 did not. We would have been back to square one again.

So, the summary is this: If you want to internationalise C++ code under Solaris today you have very few good choices. You can run gcc, which seems to have some dodgy optimisation code for sparc... but make sure you get it from the right place or it won't work. You can use Forte 10, but you can't use the superior stlport for your standard library. C++ is essentially a dead language these days, so don't count on the situation improving. My guidance would be to drop sparc as quickly as you can, and use gcc on an intel platform where it should be producing consistently good code.

Benjamin

Sat, 2005-Jul-16

REST Content Types

So we have our REST triangle of nouns, verbs, and content types. REST is tipping us towards placing site to site and object to object variation in our nouns. Verbs and content types should be "standard", which means that they shouldn't vary needlessly but that we can support some reasonable levels of variation.

Verbs

If it were only the client and server involved in any exchange, REST verbs could be whittled down to a single "DoIt" operation. Differences between GET, PUT, POST, DELETE, COPY, LOCK or any of the verbs which HTTP in its various forms supports today could be managed in the noun-space instead of the verb space. After all, it's just as easy to create a https://example.com/object/resource/GET resource as it is to create https://example.com/object/resource with a GET verb on it. The server implementation is not going to be overly complicated by either implementation. Likewise, it should be just as easy to supply two hyperlinks to the client as it is to provide a single hyperlink with two verbs. Current HTTP "A" tags are unable to specify which verb to use in a transaction with the href resource. That has led to tool providers misusing the GET verb to perform user actions. Instead of creating a whole html form, they supply a simple hyperlink. This of course breaks the web, but why it does is not as straightforward as you may think.

Verbs vs Delegates

Delegates in C# and functions in python give away how useful a single "doIt" verb approach is. In a typical O-O observer pattern you need the observer to inherit from or otherwise match the specification available for a baseclass. When the subject of the pattern changes it looks through its list of observer objects and calls the same function on each one. It quickly becomes clear when we use this pattern that the one function may have to deal with several different scenarios. One observer may be watching several subjects, and it may be important to disambiguate between them. It may be important to name the function in a more observer-centric rather than subject-centric way. Rather than just "changed", the observer might want to call the method "openPopupWindow". Java tries to support this flexibility by making it easy to create inner classes which themselves inherit from Observer and call back your "real" object with the most appropriate function. C# and python don't bother with any of the baseclass nonsense (and the number of keystrokes required to implement them) and supply delegates and callable objects instead. Although Java's way allows for multiple verbs to be associated with each inner object, delegates are more "fun" to work with. Delegates are effectively hyperlinks provided by the observer to the subject that should be followed on change, issuing a "doIt" call on the observer object. Because we're now hyperlinking rather than trying to conceptualise a type hierarchy things turn out to be both simpler and more flexible.
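
A quick Python sketch of the delegate idea, since Python's callable objects behave the same way. The subject holds nothing but a list of callables handed to it by observers, and each observer gets to name its handler whatever makes sense from its own point of view; the class and method names here are invented for illustration.

    class Subject:
        def __init__(self):
            self._observers = []          # plain callables; no Observer baseclass in sight

        def subscribe(self, callback):
            self._observers.append(callback)

        def changed(self):
            for notify in self._observers:
                notify(self)              # the single "doIt" verb

    class Window:
        def open_popup_window(self, subject):
            print("popup triggered by", subject)

    subject = Subject()
    window = Window()
    subject.subscribe(window.open_popup_window)   # observer-centric name; the subject doesn't care
    subject.changed()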

The purpose of verbs

So if not for the server's benefit, and not for the client's benefit, why do we have all of these verbs? The answer for the web of today is caching, but the reasoning can be applied to any intermediary. When a user does a GET, the cache saves its result away. Other verbs either mark that cache entry dirty or may update the entry in some way. The cache is a third party to the conversation and should not be required to understand it in too much detail, so we expose the facets of the conversation that are important to the cache as verbs. This principle could apply any time we have a third party involved whose role is to manage the communication efficiently rather than to become involved in it directly.

Server Naivety and Client Omniscience

In a client/server relationship the server can be as naive as it likes. So long as it maintains the basic service constraints it is designed for, it doesn't care whether operations succeed or fail. It isn't responsible for making the system work. Clients are the ones who do that. Clients follow hyperlinks to their servers, and they do so for a reason. Whenever a client makes a request it already knows the effect its operation should have and what it plans to do with the returned content. To the extent necessary to do its job, the client already knows what kind of document will be returned to it.

A web browser doesn't know which content type it will receive. It may be HTML, or some form of XML, or a JPEG image. It could be anything within reason, and within reason is a precisely definable term in this context. The web browser expects a document that can be presented to its user in a human-readable form, and one that corresponds to one of the standard content types it supports for this purpose. If we take this view of how data is handled and transfer it into a financial setting where only machines are involved, it might read like this: "An account reconciler doesn't know which content type it will receive. It may be ebXML, or some form of OFX, or an XBRL report. It could be anything within reason, and within reason is a precisely definable term in this context. The reconciler expects a document that can be used to compare its own records to that of a supplier or customer and highlight any discrepancies. The document's content type must correspond to one of the standard content types it supports for this purpose."

REST allows for variations in content type, so long as the client understands how to extract the data out of the returned document and transform it into its own internal representation. Each form must carry sufficient information to construct this representation, or it is not useful for the task and the client must report an error. Different clients may have different internal representations, and the content types must reflect those differences. HTTP supports negotiation of content types to allow for clients with differing supported sets, but when new content types are required to handle different internal data models it is typically time to introduce a new noun as well.

Hyperlinking

So how does the client become this all-knowing entity it must be in every transaction it participates in? Firstly, it must be configured with a set of starting points or allow them to be entered at runtime. In some applications this may be completely sufficient, and the configuration of the client could refer to all URIs it will ever have to deal with. If that is not the case, it must use its configured and entered URIs to learn more about the world.

The HTML case is simple because its use cases are simple. It has two basic forms of hyperlink: the "A" and the "IMG" tags. When it comes across an "A" it knows that whenever that hyperlink is activated it should look for a whole document to present to its user in a human-readable form. It should replace any current document on the screen. When it comes across "IMG" it knows to go looking for something human-readable (probably an actual image) and embed it into the content of the document it is currently rendering. It doesn't have to be any more intelligent than that, because that is all the web browser needs to know to get its job done.

More sophisticated processes require more sophisticated hyperlinks. If they're not configured into the program, it must learn about them. You could look at this from one of two perspectives. Either you are extending the configuration of your client by telling it where to look to find further information, or the configuration itself is just another link document. Hyperlinks may be picked up indirectly as well, as the result of POST operations which return "303 See Other". As the omniscient client they must already know what to do when they see this response, just as a web browser knows to chase down that Location: URI and present its content to the user.

There is a danger in all things of introducing needless complexity. We can create new content types until we're blue in the face, but when it comes down to it we need to understand the client's requirements and internal data models. We must convey as much information as our clients require, and have some faith that they know enough to handle their end of the request processing. It's important not to over-explain things, or include a lot of redundant information. The same goes for types of hyperlinks. It may be possible to reduce the complexity of documents that describe relationships between resources by assuming that clients already know what kind of relationship they're looking for and what they can infer from a relationship's existence. I think we'll continue to find as we have found in recent times that untyped lists are most of what you want from a linking document, and that using RDF's ability to create new and arbitrary predicates is often overkill. My guide for deciding how much information to include is to think about those men in the middle who are neither client nor server. Think about which ones you'll support and how much they need to know. Don't dumb it down for the sake of anyone else. Server doesn't care, and Client already knows.

Benjamin

Mon, 2005-Jul-11

Object-Orientation Without Baseclasses

I've been working on the impedance mismatch between REST and Object-Orientation. It's a thorny issue, and I've come across several RESTafarians who believe the concepts don't mesh at all. I think that some approaches taken so far have tried too hard to make a resource into an object or an object into a resource. My approach is to match the O-O and RESTful world views via a less drastic (or more, depending on your point of view) overhaul. I want to change an object's model of abstraction from a set of functions into a set of resources. I want to drop the concepts of base classes and class hierarchies altogether. I want to talk about aspects of an object rather than its whole type or its constituent functions.

Queues and Stacks

Let's leave aside the network-oriented views of how REST works for now. Let's just think of a client object and a server object. To interact with an O-O queue, a client object might be presented with several functions: say, one to insert an entry at the back of the queue, and one each to examine and to remove the entry at the front.

You could also add other useful functions like checking the size of the queue or examining entries other than the first. You can do the same sort of thing with a REST queue. You just need to think in terms of resources instead of functions:

A POST to the insertion point resource creates a new queue entry. A GET to the beginning allows the client to examine its content. A DELETE to the beginning (or is that a POST, too?) clears the current value and replaces it with a new one. You might even be able to get away with a single resource representing both an insertion and an extraction point. Either way, with a minimum of verbs and a maximum of nameable resources it is possible to achieve the same kinds of abstractions as we're used to in O-O. It would be simple in either model to change the implementation to a stack while maintaining the same set of functions and type, or the same set of resources, for the use of client software.
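
A sketch of what that REST queue might look like as a plain Python object. The resource names and the verb-method spelling are my own guesses rather than anything standard:

    class RestQueue:
        """A queue exposed as two named resources rather than bespoke methods."""

        def __init__(self):
            self._entries = []

        def POST(self, resource, representation):
            if resource != "insertion-point":
                raise ValueError("POST not supported on " + resource)
            self._entries.append(representation)

        def GET(self, resource):
            if resource != "beginning":
                raise ValueError("GET not supported on " + resource)
            return self._entries[0]

        def DELETE(self, resource):
            if resource != "beginning":
                raise ValueError("DELETE not supported on " + resource)
            self._entries.pop(0)

    # Swapping in a stack would only change which end of the list POST touches;
    # the resources and verbs the client sees stay exactly the same.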

Representations, not Types

REST pushes the possibilities a little further forward by using only representations of objects rather than objects themselves as parameters to its verbs. This sets the type barrier that we often see in O-O much lower, so a representation on disk can be applied to an object in memory or be used to create a new object. Objects of different types but compatible representations can be applied to each other. Even incompatible objects which share only small aspects of their representations with other objects can be squeezed together so long as they provide that aspect of themselves as a resource. It is conceivable that both the stack and queue implementations share the same representation, making it possible to copy one into the other without the need for explicit conversion.

Modelling in an O-O language, and Object-Orientation Without Baseclasses

My picture for linking the O-O and REST world views within an application calls for the concept of a resource type. This type can be implemented explicitly as an interface or baseclass or might be implemented implicitly depending on your language of choice. It is important that this single definition capture all verbs that you might want to apply to resources. A simple implementation of this type would accept function pointers, or delegates, or callable objects, or whatever your language supports to map directly back into the O-O paradigm. For example, an object representing an XML document might be composed of one explicit resource and one family of resources. The main one is a representation of the whole XML document which can be PUT and GET with the usual outcomes. The second I call a family, because it is infinitely variable via a query. It represents an xpath-selectable part of the document which can also be PUT and GET, as well as POSTed to and DELETEd. These would map to one function per verb per resource on my XML object. Verbs on the main resource just become PUT() and GET(). Verbs on anything in the xpath family become xpathPUT(query), xpathGET(query), xpathPOST(query), and xpathDELETE(query).
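
Here is a rough Python rendering of that XML object using only the standard library. ElementTree's find() supports just a subset of XPath, and the method names simply mirror the verb-per-resource mapping described above; treat it as a sketch rather than the actual prototype.

    import xml.etree.ElementTree as ET

    class XmlResource:
        def __init__(self, text):
            self._root = ET.fromstring(text)

        # the main resource: the whole document
        def GET(self):
            return ET.tostring(self._root, encoding="unicode")

        def PUT(self, text):
            self._root = ET.fromstring(text)

        # the xpath family: one resource per query
        def xpathGET(self, query):
            node = self._root.find(query)
            return None if node is None else node.text

        def xpathPUT(self, query, value):
            node = self._root.find(query)
            if node is None:
                raise KeyError(query)
            node.text = value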

When dealing with the XML document object, you could pass the resource representing the whole object around and have code that can produce XML replace the document with a new one. If you passed around resources representing only a specific xpath-selectable part of the document it would only be that part that was replaced. If your xpath selected a text or attribute node within the XML you could pass that resource to anything that produces a string and have the node value replaced or used in further processing. In the end, you have a single self-consistent object that can expose any aspect of itself it chooses as a resource to be operated on by anything it likes.

When you've had type as a constant friend and enemy for so many years the thought of minimising it in this way can be daunting, but python took the first stab at this approach. It did so by dropping the type modelling for variables and parameters but keeping it for objects themselves. It is still possible to see TypeError exceptions raised in Python because you passed in a DOM attribute node instead of a string, yet the two are equivalent for GETs and mutable variants of string are equivalent for most purposes. By dropping the type barrier further I think we'll find that for the most part we don't really need type as often as we think we do.

URI-space

So as we remove and reduce the use of exotic verbs and types from the way objects interact with each other we also need to keep in mind the nouns side of things. It's ok to pass resource pointers around within a program until the first time you want to use the result of a resource's GET operation as a resource itself. That's when you need a URI space and a means of creating or finding resource objects via the space.

The simplest approach might still be to use your language's object identifiers. Python has its id() function. C++ has its pointers. Java no doubt has the same capabilities, as I'm sure I've seen them in the debugger often enough. object:0xfff3673 might be just the ticket to locate your resource quickly. On the other hand you might want to have names that are able to survive serialisation and process restarts. Whichever way you go, you'll also have to deal with that pesky query case that my XML object above makes use of.

So we've identified the resource objects. We also need a resolver object to find resources. To deal with queries, you need an additional object that will return you a resource object to use when provided a query. With a URI that looks like this: "object:path?query" path will resolve to the resource finder object. When you look up the found resource it will most likely call a function on the object that owns this resource that looks something like: object.GET(query). If path turns out to be hierarchical you may also wish to have objects or a class that represent levels within the hierarchy. In my current python prototype I'm using the objects themselves as steps along the path. A few well-named top-level objects are added to the resolver's namespace and python allows easy navigation down the object hierarchy.

What's left?

Well, obviously you need to find the right fit for the kind of language you're using today to make use of any new approach. For myself I'm reasonably comfortable with applying the model generally to Python, C#, Java, C++, and C. I'm more or less comfortable with everything except the POST and DELETE verbs. These verbs deal with resource creation and destruction, but because an object is composed of several resources such actions are bound to result in side-effects on other resources. In the XML example I've been using, it would be simple to POST to a factory resource in order to create our XML file. Once we've done that and the POST had returned us the address of the new main URI, how do we find the xpath URI? Several approaches are possible including,

The latter suggestions seem decreasingly restful, but the earlier ones require further thought about how to implement correctly. I'm a little uncomfortable generally thinking about verbs that alter the lifetime of URIs, especially after all of the popular O-O languages on earth have just managed to shrug most explicit lifetime management out of their objects. Refinement of the meaning clients can extract from these and other verbs will be important going forwards. Coming up with new design patterns and examples of how to model certain types of objects using resources will be equally important.

I see significant potential in the overall REST approach for objects. I think that it could supplement, or even replace, O-O design in many areas in the longer term. In the mean time there is a lot of growing and development to do. I think that seeing it work inside every-day apps could accelerate development of the underlying ideas and improve adoption when compared to the "it only works on the web" mantra. I think that if it really only works on the web then there is something fundamentally broken about it. If it only works on the web, we should be talking about the more Object-Oriented WS-* for web development instead. We at least know that paradigm works somewhere.

Benjamin

Fri, 2005-Jul-08

REST in an Object-Oriented Language

I have received interesting feedback on my last article from a number of sources. Sometimes it is just as interesting to see responses to what I say and what readers hear rather than what I mean. A Mr "FuManChu" took me to task over my treatment of POST and I responded with a comment to his blog. Mike Dierken is concerned about my advice to reduce the number of content types in play to be more RESTful. He writes:

REST actually drives toward increasing content-types. Not necessarily maximizing, but definitely opening up the space of possibilities. So I would say that REST seeks to push the balance away from verbs and as close to the edge of nouns and content-types.

I would respond firstly with this quote from Roy Fielding:

REST enables intermediate processing by constraining messages to be self-descriptive: interaction is stateless between requests, standard methods and media types are used to indicate semantics and exchange information, and responses explicitly indicate cacheability.

I will go on to fill out what I think is a simple difference in terms shortly.

Mike does make the excellent point that

REST does not have a heirarchical namespace. Neither does HTTP. They only have identifiers that are opaque to the client.

Even though the path space in URIs is explicitly hierarchical, with relative links forming an integral part of the specification, REST is based on hyperlinks. We should not be constructing one URI from another. We should be following hyperlinks instead. This was a weakness in my article first pointed out to me by Ken MacLeod via email.

Mark Baker says, and Henry Story echoes:

Nope, properties are properties, resources are the Beans themselves. Imagine a "Person" Bean which might expose the properties; name, birth date, birth city. Now imagine that Bean as a resource, with its own http URI; invoke GET on it and you receive, say, an XML document

You are right that the person may be a resource in their own right. I would say simply that if you are doing GET and PUT operations on the person object it is a resource. The reason I say that a bean's properties are resources (possibly as well) is that they do support GET (get) and PUT (set) operations in their own right. Depending on your application it may be appropriate to expose just the bean or just its properties as resources. It may be appropriate to expose both or neither. My contention is simply that uniform GET and PUT operations applied to any named entity make that entity a resource, or at worst a pseudo-resource. They highlight the natural tension that already exists in software between the O-O and REST world views.

Several other people have also commented using their blogs. Thanks to technorati at least some of the links have been easy for me to find. Please keep the feedback coming, and I recommend technorati membership to anyone whose blog is not currently being indexed by those kind folk. I've picked up several responses only through my site statistics, and many of these I'll have missed as I can only see the last twenty under my current hosting arrangements.

Content Types

I used the term "content type" deliberately in my previous article to evoke the content types identified in the http header. The most common one is surely text/html. The ubiquitous nature of html has been a major component of the success of the web. Instead of having each web site offer content in its own proprietary format, we could rely on web sites that wanted to serve hypermedia using something approximating html. While there are many variants as content providers have asked more of the format, the basics have remained the same and we've been able to settle on something that at worst displeases everyone equally.

When html is not well-formed, or is not completely understandable it may not render correctly. The human viewer can probably still make out everything they need to, and can drop back to the source if they get particularly desperate. The application/xml content type doesn't have such luxuries. As Sean McGrath noted recently in his blog:

XML is not - repeat NOT - a 'file format'

While it is not a file format, it is a content type. It seems that just when you can't stand to send or receive anything at all out of the ordinary, we get lax about what the actual content is going to be.

I think that in the end it doesn't matter, for the same reason that nouns are more important than content types. I think that type is the least of your worries when a client makes a request of a server. The server could reject the request for a huge variety of reasons. Perhaps a number that forms part of the content is greater than ten. Perhaps only chinese text is accepted. The semantics of what a particular URI accepts with its PUT verb are entirely up to the URI owner, and can only be construed from that name of that specific resource. Making sure you send XML instead of ASN.1 or that your XML matches a specific schema is the least of your worries in executing a successful transaction with the resource.

I spoke last time about pushing the balance of the (noun, verb, content type) triangle away from both verb and content type. Again, this is from the perspective of Object-Orientation's typing system where it is necessary to define conversion operations, the creation of new types requires the matching creation of new functions, and if you don't have an exact match things just won't work and may not even compile. Given that knowledge of content type is not sufficient to ensure acceptance by the resource you're talking to, and that it is always the named resource that performs the test for acceptance, I think that content types are probably irrelevant when it comes to mapping REST into an Object-Oriented environment. "Format" is more important. In my view, when something is irrelevant to the correctness of your program the bar should be set as low as possible.

Mapping REST into an Object-Oriented Environment

As I've stated before, I believe if REST actually works it will work in a regular (non-web) programming environment. Further, I think that if it fails outside the web it has a good chance of failing on the web itself for application development.

So, let's set a few rules for ourselves. What restrictions do we need to put on ourselves to avoid fooling ourselves that we're being restful?

  1. All object APIs should be expressed as resources
  2. All resource access should be via a consistent noun concept
  3. No exotic verbs (functions) should be allowed on resources
  4. Navigating through the resource space except via hyperlinks is illegal

Exceptions can be made for objects involved in the mechanics of the RESTful object system, and when data is not in motion (it is in some object's member variable) it can be in whatever form the object desires.

Nouns

So, what should our nouns look like? Our first thought should be that URIs are like pointers or references. We could make those our nouns. Each pointer could refer to an object of type "Resource" with the appropriate verbs on its surface. This model suffers, somewhat, when we think about exactly what we're going to use and return in the PUT, GET, and POST functions.

The second iteration for me is the KISS approach. Let everything be a string, from noun to content type. After all, a noun could and should be used as content from time to time. There is no reason to make artificial distinctions between the two concepts. In my prototype I've been using a scheme called "object" to refer to object-provided resources:

The resolver of object URIs starts with a set of objects with registered names at the top level. To assist in the mapping onto a pure O-O model it navigates down the object's hierarchy of properties until it meets one that is prepared to perform the standard verb operations for the remainder of the unconsumed part of the URI path, or until it reaches the end of its path. If no handler is found, it handles the verbs on the end node itself. This approach allows any object along a path to take over the handling of the rest of the URI text. In the example above, an object of type XMLResource handles its own path. It looks for an xpath entry immediately below it in the URI and parses any query part as an xpath expression. This kind of thing is profoundly useful. It can be a private hell to actually try and create one resource object for each actual resource.
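
A stripped-down resolver along these lines might look like the following Python sketch. The registration and navigation follow the description above, but the handover point where an object takes charge of the rest of the path is simplified to just returning the node and the query; the names are assumptions, not the prototype's actual API.

    class ObjectResolver:
        def __init__(self):
            self._top_level = {}

        def register(self, name, obj):
            self._top_level[name] = obj

        def find(self, uri):
            # e.g. "object:main_window/title" or "object:document/xpath?query"
            assert uri.startswith("object:")
            path, _, query = uri[len("object:"):].partition("?")
            segments = path.split("/")
            node = self._top_level[segments[0]]
            for segment in segments[1:]:
                node = getattr(node, segment)     # burrow down the property hierarchy
            return node, query                    # caller applies the verb, passing the query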

Verbs

I define the following global operations:

These map to operations on objects of:

Errors are currently reported via exceptions. All operations are currently synchronous, including access to remote services such as http.
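
Just to make the shape of those global operations concrete, here is a guess at how they might dispatch through a resolver like the one sketched earlier. The function names, and the assumption that every resource object accepts an optional query argument, are mine rather than the prototype's.

    resolver = ObjectResolver()          # from the earlier sketch

    def GET(uri):
        node, query = resolver.find(uri)
        return node.GET(query)

    def PUT(uri, content):
        node, query = resolver.find(uri)
        node.PUT(query, content)

    def POST(uri, content):
        node, query = resolver.find(uri)
        return node.POST(query, content)

    def DELETE(uri):
        node, query = resolver.find(uri)
        node.DELETE(query)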

Content Type

There is probably a performance hit associated with the "all string, all the time" approach to content types. Especially in cases where you have a complex data structure behind one resource and the same kind of structure behind another it may be a useful shortcut to avoid the conversion to string. I think this should be doable with a minimum of code. Caching of transformations may also be of use if performance becomes a problem. I think that overall the advantages of a REST approach should make these problems irrelevant, but you may not want to code any tight loops with GET and PUT operations inside them any time soon :)

Variables and parameters

Variables in a RESTful application become aliases for content or aliases for URIs. They aren't evil, but they don't work like O-O variables and parameters do today. You can't operate on them except by passing them to other functions, or using them in the set of standard verb operations. Instead of passing an object to a function, you'll end up passing in specific sub-resources of the object. Perhaps you'll even pass in structures that form collections of resources, to help things swing back towards the O-O way of doing things which has served us so well over the years. The main thing will be a switch away from the view of a type being a single monolith, and instead seeing the individual resources or aspects of that type as separate entities from the function's point of view.

Event Handling

I'm currently working with an event handling model based on a GET to one resource followed by a PUT, POST, or DELETE to another. Given the REST approach this technique can actually be quite powerful. It is possible to cause a button press to copy a sample piece of XML into a specific (xpath-located) part of an XML resource. It can copy the entire resource to or from the web or a file. It can update its attributes and text nodes with input from the user. It effectively gives you basic XForms capability. I think it is even simple enough to build IDE and GUI builder tools to create these handlers.
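
A sketch of what one of those handlers could boil down to in Python, where the verbs are whatever global or resolver-backed GET/PUT functions the application provides; the names and wiring below are illustrative assumptions.

    class CopyOnEvent:
        """When triggered, GET one resource and PUT the result to another."""

        def __init__(self, get, put, source_uri, target_uri):
            self._get = get
            self._put = put
            self._source = source_uri
            self._target = target_uri

        def __call__(self, *event_args):
            # e.g. connected to a button's "clicked" signal
            self._put(self._target, self._get(self._source))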

You can interact with other kinds of objects as well. It is possible to express a queue or stack as resources. You can POST to the top of a stack to create new entries. DELETE as necessary to remove them. You can interact with a wide variety of data structures and objects. You can copy that XML document onto an object which accepts an XML representation of that specific format to make it assume a particular state. From a declarative and very simple world view you can trigger effects in your application that are extremely powerful. The power doesn't come from the simple event handler. Instead it comes from exposing application functionality in a RESTful way. The uniformity of this interface makes it possible for software that was never designed for interoperability to interact... and without a base-class in sight.

Benjamin

Tue, 2005-Jul-05

REST versus Object-Orientation (and a little python)

Initial revision: 4 July 2005.
Edit: 5 July 2005. Since I wrote this somewhere between the hours of three and five in the morning I've decided to exercise my right to update its content slightly. I've added a few more points missed in the first publication. Thanks to Ken MacLeod for pointing out weakness in the original version's discussion of global namespaces. Actually, on second reading I've misconstrued his email slightly. His contention is that URIs are more like OIDs (pointers, references, or whatever you happen to call them in your language of choice). They are things you can dereference to get to the resource you're after, so aren't comparable to the ideas of a global namespace or of local variables. This is an important distinction that I haven't fully considered the consequences of, yet.

I think I'm at the stage where I can now compare REST and Object-Orientation from a practitioner's viewpoint. Now, when most people talk about REST they are referring to its use as a best-practice design guide for the web. When I talk about it I'm speaking from the viewpoint of a developer of modular and distributed software. I speak of web services, and typically not of the use of HTML and web browsers. I speak of software design, not web site design.

Similarities

From my perspective, the concepts of Object Orientation (OO) and REST are comparable. They both seek to identify "things" that correspond to something you can talk about and interact with as a unit. In OO we call it an object. In REST we call it a resource. Both OO and REST allow some form of abstraction. You can replace one object with another of the same class or of the same type without changing how client code interacts with the object. The type is a common abstraction that represents any suitable object equally well. You can replace one resource with another as well, almost arbitrarily so, without changing how clients interact with it. Resources are slightly finer-grained concepts than objects, though, so when you talk about resources acting as an abstraction you usually need to talk about the URI space (the resource namespace) being able to stay the same while the code behind it is completely replaced. Clients can interact with the new server software through its resources in the same way as they interacted with the old server software. The resources presented represent both equally well.

Differences

The main difference in my view is that of focus. Objects focus on type as the set of operations you can perform on a particular object. On the other hand REST says that the set of operations you can perform on resources should be almost completely uniform. Instead of defining new operations, REST emphasises the creation and naming of new resources. As well as limiting verbs, REST seeks to reduce the number of content types in play. You can picture REST as a triangle with its three vertices labelled "nouns", "verbs", and "content types". REST seeks to push the balance well away from both verbs and content types, as close as possible to the nouns vertex. Object orientation is balanced somewhere between verbs (functions) and content types (types, and differing parameter lists). I suspect as we understand and exercise the extremes of this triangle over time we'll learn more about where to put the balance for a particular problem space.

In OO object names are always relative to the current object or to the global namespace. We usually see all access restricted to this->something, param->something or SomeSingleton->something, where something is often a verb. It's hard to navigate more deeply than this level because the way Object-Orientation maintains its abstraction is to hide knowledge of these other objects from you. Instead, OO design would normally provide a function for you to call that may refer to the state or call functions on its own child objects.

REST says that the namespace should be king. Every object that should be contactable by another object should have a name. Not just any name, but a globally-accessible one. If you push this to the extreme, every object that should be accessible from another object should also be accessible from any place in the world by a single globally-unique identifier:

in principle, every object that someone might validly want or need to cite should have an unambiguous address.

-- Douglas Engelbart[1]

In his 1991 design document on naming, Berners-Lee wrote[1]:

This is probably the most crucial aspect of design and standardization in an open hypertext system. It concerns the syntax of a name by which a document or part of a document (an anchor) is referenced from anywhere else in the world.

REST provides each abstraction through its hierarchical namespace rather than trying to hide the namespace. Since all accessible objects participate in this single interface, the line between those objects blurs. Object-Orientation is fixed to the concept of one object behind one abstraction, but REST allows us to decouple knowledge even about which object is providing the services we request. You can see the desire to achieve something of this kind via the facade design pattern. REST is focused around achieving a facade pattern; a kind of mega-object.

History

The history I am about to describe is hearsay, and probably reflects more closely how I came to certain concepts rather than how they emerged chronologically. You can track object orientation's history back to best practice for structured programming. In structured programming you think a lot about while loops and for loops. You break a problem down by thinking about the steps involved in executing a solution. In those days it was hard to manage data structures, because it often meant keeping many different parts of the code-base that operated on your data structures in sync. A linked-list implementation often had its insert operation coded several times, so when you changed from a singly-linked list to a doubly-linked one it could be difficult to make sure all of your code still worked correctly. This need for abstraction led to the notion of Abstract Data Types (ADTs).

ADTs were a great hit. By defining the set of legal operations on a data structure and keeping all of the code in one place you could reduce your maintenance costs and manage complexity. The ADT became an abstraction that could represent different implementations equally well. The advantages were so important that the underlying implementation details such as member variables were hidden from client code. Avoiding lapses in programming discipline was a big focus.

Object-Orientation came about when we said "This works so well, why not apply it to other concepts?". Instead of applying the technique just to data structures, we found we could apply it whenever we needed an abstraction. We could apply it to algorithms. To abstract conceptual whosewhatsits. We developed Design Patterns to help explain to each other how to use objects to solve problems that still let us keep our privacy and abstractions.

And REST's history?

REST has an obvious history on the web, where abstraction is a fundamental concept. Resources across the web operate through protocols and other mechanisms that force us to hide the implementation of one object from that of another. I think the seeds of REST in "pure" software are there as well.

Let's take Java beans. The properties of a bean are as follows:

  1. Every java bean class should implement java.io.Serializable interface
  2. It should have a no-argument constructor
  3. Its properties should be accessed using get and set methods
  4. It should contain the required event handling methods

I see this as a significant step from an Object-Oriented model towards that of REST. I'll leave aside the implementation of serialisable that allows a representation of the object to be stored, transmitted, and unpacked at the other end. I'll also leave aside the default constructor that must be present to make this sort of thing happen as it should. The real meat in my pie is the use of properties, or as I would call them: Resources.

The use of properties in otherwise object-oriented design paradigms is flourishing. It is much simpler to deal declaratively with a set of ("real" or otherwise) properties than it is to deal with function calls. Graphical editors find it easier to deal with these objects, and I suspect that humans do as well. By increasing the namespace presence of the object and trimming down the set of operations that can be performed on each presence in that namespace we see that it is easier to deal with them overall. We don't lose any abstraction that we gained by moving to ADTs in the first place because now these properties aren't revealing the internal implementation of our object. They're forming part of the interface to it. When we set these properties or get them, code is still churning away behind the scenes to do whatever the object chooses. The set of properties can represent different types of object equally well.
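
Python's own property support makes the same point in miniature: reading the property is effectively a GET, assigning to it is a PUT, and the code behind it is free to do whatever the object chooses. The class below is only an illustration.

    class Person:
        def __init__(self, name):
            self._name = name

        @property
        def name(self):              # GET
            return self._name.title()

        @name.setter
        def name(self, value):       # PUT; the object still decides what actually happens
            self._name = value.strip()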

This is a tipping of the triangle away from verbs and content types towards nouns. I'm sure you can think of other examples. Tim Bray writes, referring to a battle between ASN.1 and XML:

it seems to be more important to know what something is called than what data type it is. This result is not obvious from first principles, and has to count as something of a surprise in the big picture.

Integrated Development Environments

IDEs have recently started to become good enough to use. That's strong praise coming from me, a VI man. Havoc Pennington recently wrote:

The most important language features are the ones that enable a great IDE.

So-called "dynamic" languages such as Python fall short on this mark, because it just ain't possible to for an IDE to examine your program an infer any but the most basic information from it. Python tries to come up with a less formal way of handling type, by essentially saying "if you don't use it, I won't check it". It's still strongly-typed under the covers, though. There are still basic expectations applied to objects you pass in to a function. A string, XML attribute node, and integer aren't all interchangable even if they are parsable to the same number. You just can't know what is expected programatically. Python tips the triangle slightly towards nouns with its intrinsic support for properties, but does not attempt to reduce the number of content types in play.

I think that a more RESTful approach could make things better. By explicitly restricting yourself to dealing with properties rather than function calls, you could create a namespace that represents the entire accessible functionality of your program. As I hinted at earlier, GUI toolkits are already heading the properties way. Once you have as much as possible of your functionality exposed through properties instead of regular function calls, it becomes possible to expose a single namespace that encapsulates this functionality. An IDE could easily work with such a namespace to allow automatic completion, and the thing we all really miss in python: earlier checking. If you use hyperlinking (complete names rather than constructed names) as much as possible, you can get these hyperlinks checked at construction time if you like. You don't have to wait until you dereference them. To my mind, the simple string should be the main common currency in this world. With garbage collection in play, immutable strings are cheap to pass around and handle. They only need to be processed when they are actually used for something.

How do we make our software more RESTful?

Try exposing your functions as properties or sub-objects instead of functions as you think of them now. Expose a simple set of functions on these properties. Good REST practice currently says "GET", "PUT", and "DELETE" are most of what you need. You should use "POST" as a "create and PUT" operation. Try giving all such resources globally-accessible names rather than boxing them up. The theory is that other objects will only access them if you hand them a hyperlink (the whole name), so privacy isn't a problem. Use types that are as simple and uniform as possible. I've been trying to get away with just a combination of strings and DOM nodes, although I'm not convinced the latter is a perfect fit.

Any code that accesses this name, verb, and content-type space operates simply using hyperlinks. You may choose to open up access to other URI spaces, such as the http and file schemes. In this way you can hyperlink to and from these spaces without altering the meaning or means of constructing your client code.

To be honest I don't really know whether this will work in the large, yet. I'm trying it out on a program that tries to work as a souped-up glade at present but I have a lot of thinking still to expend on it. I haven't covered the REST approach to event handling in this article, which I think is probably about as hard as it is to describe event handling in any design paradigm. Perhaps another time.

Benjamin

  1. Dan Connolly, "Untangle URIs, URLs, and URNs"

Sun, 2005-Jun-26

How do I comment?

I don't currently support a direct commenting mechanism, but I encourage and welcome feedback. The best way to comment is to respond in your own blog, and to let me know you've done it. If you are a member of technorati I should notice your entry within a few days.

Another alternative is email, and when I want to comment on blogs that don't support comments I will usually do both. A recent example is this entry which I first sent as an email to Rusty, then posted the content of my email into the blog. You can see my email address on the left-hand panel of my blog. You'll need to replace " at " with "@", which I hope is not too onerous.

The main reason I don't accept comments directly is history. I use static web space provided by my ISP and don't have much control over how the data is served and the kind of feedback that can be garnered. I also have a mild fear that comment spam will become a significant administration overhead as I have to keep up with the appropriate plugins to avoid too much hassle in this regard. Blogging is an easy and fun way to publish your thoughts, and I hope that a network of links between blogs will be as effective as a slashdot-style commenting system for the amount of interest and feedback my own blog attracts.

Thanks! :)

Benjamin

Sun, 2005-Jun-26

Redefine copying for Intellectual Property!

Rusty recently had an exchange with the Minister for Communications, Information Technology and the Arts. He is worried about the lack of online music stores in Australia, and that the big record companies may be stifling competition in ways that the law has not caught up with. Here is the email I sent to Rusty, reformatted for html.

Hello Rusty,

I wonder if a less radical suggestion from the Minister's point of view would be to try and redefine what copying means for an intellectual property distributor. Instead of "you can't make what I sell you available for download" it could essentially mean "you can't permit any more copies of what I sold you to be downloaded than you actually bought from me":

(Number of copies out (sold, or given away) = Number of copies in (bought)) -> No copying has taken place with respect to copyright law.

If this approach was applied to individuals as well as distribution companies then fair use may not need any further consideration. If there is a feeling that this can't be applied to individuals then the major hurdles, I think, would be:

  1. How do we define an IP distributor, as compared to a consumer of IP?
  2. Who is allowed to define the mechanism for measuring the equation? Is it a statutory mechanism, or do we leave it up to individual contracts?

Just a happy Sunday morning muse :)
I'm sure you've already thought along these lines before.

Mon, 2005-Jun-20

A RESTful non-web User Interface Model

I've spent most of today hacking python. The target has been to explore the possibility of REST in a non-web programming environment. I chose to target a glade replacement built from REST principles.

Glade is a useful bare bones tool for constructing widget trees that can be embedded into software as generated source code or read on the fly by libglade. Python interfaces are available for libglade and gtk, but there are a few things I wanted to resolve and improve.

  1. The event handling model, and
  2. The XML schema

The XML schema is bleugh. Awfully verbose and not much use. It is full of elements like "widget" and "child" and "parameter" that don't mean anything, leaving the real semantic stuff as magic constant values in various attribute fields. In my prototype I've collapsed each parameter into an attribute of its parent, and made all elements either widgets or event handling objects.

The real focus of my attention, though, was the event handling. Currently gtk widgets can emit certain notifications, and it is possible to register with the objects to have your own functions called when the associated event goes off. Because a function call is the only way to interact with these signals, you immediately have to start writing code for potentially trivial applications. I wanted to reduce the complexity of this first line of event handling so much that it could be included into the glade file instead. I wanted to reduce the complexity to a set of GET and PUT operations.

I've defined my only event handler for the prototype application with four fields: the name of the signal to listen for, a target url, a verb, and a data source.

To make this URL-heavy approach to interface design useful, I put in place a few schemes for internal use in the application. The "literal:" scheme just returns the percent-decoded data of its path component. It can't be PUT to. The "object:" scheme is handled by a resolver that allows objects to register with it by name. Once a parent has been identified, attributes of the parent can be burrowed into by adding path segments.
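
A rough sketch of how those two schemes could hang together, in modern python rather than the prototype code (the class and its layout are invented; the prototype differs in detail). Anticipating the worked example below:

from urllib.parse import unquote

class Resolver:
    """Dispatches GET and PUT on "literal:" and "object:" URIs."""
    def __init__(self):
        self._objects = {}                  # roots registered by name for object:

    def register(self, name, obj):
        self._objects[name] = obj

    def GET(self, uri):
        scheme, _, path = uri.partition(":")
        if scheme == "literal":
            return unquote(path)            # literal: just decodes its path
        if scheme == "object":
            return str(self._walk(path))
        raise ValueError("unknown scheme: " + scheme)

    def PUT(self, uri, value):
        scheme, _, path = uri.partition(":")
        if scheme != "object":
            raise ValueError(scheme + ": cannot be PUT to")
        *parents, leaf = path.strip("/").split("/")
        setattr(self._walk("/".join(parents)), leaf, value)

    def _walk(self, path):
        root, *segments = path.strip("/").split("/")
        target = self._objects[root]        # burrow into attributes segment by segment
        for segment in segments:
            target = getattr(target, segment)
        return target

class Label:
    def __init__(self, label):
        self.label = label

resolver = Resolver()
resolver.register("label1", Label("Hello, world"))
resolver.PUT("object:/label1/label", resolver.GET("literal:Goodbye,%20world"))
print(resolver.GET("object:/label1/label"))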

The prototype example was a simple one. I create a gtk window with a label in it. The label initially reads "Hello, world". On a mouse click, it becomes "Goodbye, world". It's not going to win any awards for originality, but it demonstrates a program that can be entirely defined in terms of generic URI-handling and XML loading code and an XML file that contains the program data. An abbreviated version of the XML file I used looks like this:

<gtk.Window id="window1">
        <gtk.Label id="label1" label="literal:Hello,%20world">
                <Event
                        name="button_press_event"
                        url="object:/label1/label"
                        verb="PUT"
                        data="literal:Goodbye,%20world"
                        />
        </gtk.Label>
</gtk.Window>

You can see that the literal URI scheme handling is a little clunky, and needs to have some unicode issues thought out. The Event node is triggered by a button_press_event on label1, and sends the literal "Goodbye, world" to label1's own "label" attribute. You can see some of the potential power emerging already, though. If you changed the initial label url to data extracted from a database or fetched over http, this file would immediately describe not a static snapshot but a dynamic beast. The data in our event handler could refer to an object or XML file that had been built up XForms-style via other event interactions. It is possible that even quite sophisticated applications could be written in this way as combinations of queries and updates to a generic store. Only genuinely complicated interactions would need to be modelled in code, and that code would not be clouded by the needless complexity of widget event handling.
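
The event handler itself can be tiny. This is a sketch rather than the prototype's actual 12-line handler, and it assumes a resolver object along the lines of the one sketched earlier:

class Event:
    def __init__(self, resolver, widget, name, url, verb, data):
        self.resolver, self.url, self.verb, self.data = resolver, url, verb, data
        widget.connect(name, self.fire)           # e.g. "button_press_event"

    def fire(self, *ignored):
        payload = self.resolver.GET(self.data)    # e.g. literal:Goodbye,%20world
        if self.verb == "PUT":
            self.resolver.PUT(self.url, payload)  # e.g. object:/label1/label
        return False                              # let gtk continue propagating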

It is my theory that if REST works at all, it will work in this kind of programming environment as well as it could on the web. I'm not trying to preclude the use of other languages or methods for controlling the logic of a single window's behaviour, but I am hoping to clearly define the level of complexity you can apply through a glade-like editor. I think it needs to be higher than it is presently to be really productive, but I don't think it should become a software IDE in itself.

All in all my prototype came to 251 lines of python, including whitespace and very minimal comments. That includes code for the event handler (12 lines), a gtk object loader (54 lines, and actually a generic python object loader), and URI resolution classes including http support through urllib2. The object resolver is the largest file, weighing in at 109 lines. It is made slightly more complex by having to deal with pygtk's use of setters and getters as well as the more generic python approaches to attribute assignment.

My example was trivial, but I do think that this kind of approach could be considerably more powerful than that of existing glade. The real test (either of the method or of gtk's design) will probably come in list and tree handling code. This will be especially true where column data comes from a dynamic source such as an sqlite database. I do anticipate some tree problems, where current event handlers often need to pass strange contexts around in order to correctly place popup windows. It may come together, though.

Benjamin

Mon, 2005-Jun-13

Protocols over libraries

As I've been moving through my web pilgrimage, I've come to have some definite thoughts about the use of libraries as opposed to the use of protocols. The classic API in C, or any other language provides a convenient language-friendly way of interacting with certain functionality. A protocol provides a convenient language-neutral way of delegating function to someone else. I've come to prefer the use of a protocol over a library when implementing new functionality.

Let's take a case I've been discussing recently: Name resolution.

Name resolution (mapping a name to an IP address, and possibly other information) has traditionally been handled by a combination of two methods. /etc/hosts contained a static list of hosts that acted as a bootstrap before relevant DNS services might be available. For various reasons people on smaller networks found /etc/hosts easier to manage than a bind configuration, so /etc/hosts grew.

As it grew, people found new problems and solutions.

Problem: We need this /etc/hosts file to be consistent everywhere.
Solution: Distribute it using proprietary mechanisms, NIS, or LDAP.

Problem: Our applications don't know how to talk to those services.
Solution: We'll add new functionality to our resolution libraries so every application can talk to them using the old API.

Problem: We don't know what this library should try first.
Solution: We'll create a file called /etc/nsswitch.conf. We'll use it to tell each application through our library whether to look first at the static files, or at the DNS, or at NIS, whatever.

So now there's a library that orchestrates this behaviour, and it works consistently so long as you can link against C. You can implement it natively in your own language if you like, but you had damn well better track the behaviour of the C library on each platform.

Another way to solve these problems might be:

Problem: We need this /etc/hosts file to be consistent everywhere.
Solution: We'll take a step back here. Let's write a new application that is as easy to configure as /etc/hosts. Maybe it reads /etc/hosts as its configuration.

Problem: Our applications don't know how to talk to those services.
Solution: We'll change our library to talk to our new application via a well-defined protocol. We might as well use the DNS protocol, as it is already dominant for this purpose. As we introduce new ways of deploying our configuration we change only our application.

Problem: We don't know what this library should try first.
Solution: Still create a file called /etc/nsswitch.conf. Just have the application (let's call it localDNS for fun) read that file. Don't make every program installed use the file. Just make sure they speak the protocol.

Because we have a clear language-neutral protocol we can vary the implementation of both client and server gratuitously. We can still have our library, but our library talks a protocol rather than implementing the functions of an application. Because this is a simple matter, we can produce native implementations in the various languages we use and craft the API according to the conventions of each language. We don't have to track the behaviour of our specific installed platform because we know they'll all speak the standard protocol.

localDNS should consume the existing name resolution services. The name resolution library should drop all non-DNS features, including looking up /etc/hosts and caching and the like.

Running extra processes to implement functionality can have both beneficial and deleterious effects. On the positive side, it can make individual applications smaller and their code easier to replace. It can act like an extra thread of the many processes it interacts with, running simultaneously with them on multiple-CPU hardware. It can be deployed and managed individually. It's a separate "Configuration Item". On the other hand, extra infrastructure needs to be put in place to make sure it is always available when required, and the process itself can become a bottleneck to performance if poorly implemented.

Benjamin

Sun, 2005-Jun-12

URI Ownership

A URI is a name controlled by an individual or organisation. They get to define a wide sweep of possible constructions and are allowed to map those constructions onto meanings. One meaning is that of the URL lookup mechanisms. Current technology allows a URI owner to vary the IP address that http connections are made to via DNS. DNS also has the capability to specify the port to connect to for a specific service name. It does not specify the protocol to use. That is fixed as part of the URI scheme.

This entry is my gripe list about the SRV records that permit ports to be decided dynamically by the URI owner rather than being encoded statically into the URI itself. I was introduced to this capability by Peter Hardy who was kind enough to respond to some of my earlier posts. Since then I've done some independent research I'd like to put onto the record for anyone still tuned in :)

SRV records were apparently first specified in rfc 2052 in October 1996, later updated by rfc 2782 in February 2000. Their introduction was driven by the idea that the stability and fault tolerance of the SMTP system could be applied to all protocols by enhancing MX records to deal with services more generically. Perhaps as a side-effect, or as a sweetener for content providers, the capability to specify ports other than the one normally allocated for a protocol was included. SRV promised to allow providers to move functionality between physical machines more easily, and to handle load balancing and redundancy issues consistently.

Fast forward to the year 2005, and SRV records are still struggling to find acceptance. Despite DNS server support, most client applications aren't coming on board[1]. Despite a bug being raised back in September 1999, Mozilla still does not support SRV. It would seem that content providers have little incentive to create SRV records for existing protocols. Big content providers don't need to have people http connect on ports other than 80, and would find it impractical if they did due to corporate firewalling rules. They aren't concerned about problems in moving functionality between hosts, about redundancy, or about load balancing via DNS. They have their own solutions already, and even if clients started supporting SRV records they would have to hold on to their old "A" records for compatibility. With content providers unlikely to provide the records, providers of client software seem unwilling to put effort into the feature either.

The client software question is an interesting and complex one. For starters, the classic name resolution interfaces are no good for SRV. The old gethostbyname(3) function does nothing with ports, and even the newer getaddrinfo(3) function typically doesn't support SRV, although the netbsd guys apparently believe it is appropriate to include SRV in this API. Nevertheless, there is rfc-generated confusion even in pro-SRV circles about when and how it should be used.
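
For the record, here is roughly what SRV-aware resolution looks like from python today. This assumes the third-party dnspython package (it certainly isn't what getaddrinfo(3) gives you), and falls back to an ordinary lookup on the well-known port:

import socket
import dns.resolver                      # third-party: dnspython 2.x

def resolve_service(service, proto, domain, default_port):
    name = "_%s._%s.%s" % (service, proto, domain)
    try:
        answers = dns.resolver.resolve(name, "SRV")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return domain, default_port      # no SRV record published; use the default
    best = min(answers, key=lambda r: (r.priority, -r.weight))
    return str(best.target).rstrip("."), best.port

host, port = resolve_service("http", "tcp", "example.com", 80)
connection = socket.create_connection((host, port))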

To add a little more confusion, we have lists of valid services and protocols for SRV that associate concepts of service and content type instead of service and protocol, separating http for html from http for xul. If you start down that track you might as well give up on REST entirely :)

So what is SRV good for? The big end of town seems to be faring well without it, and the small end of town (small home and corporate networks) often doesn't use DNS at all, preferring simple /etc/hosts and /etc/services file constructions distributed via NIS, LDAP, or a proprietary or manual method.

So... I guess I should put together a list of action items. In order to support resolution of port as well as IP as part of URL handling we need to

  1. Use an API that looks like getaddrinfo(3) consistently across our applications and protocols. It must include both domain name and service name
  2. Make sure we use service names that exactly match our URI scheme, eg http and ftp. Don't get into specifying content. That's not the role of this mechanism.
  3. Add support to getaddrinfo for SRV records
  4. Specify the use of SRV records as preferred for all protocols :) Don't wait for an update of the HTTP rfc!
  5. Add support to getaddrinfo for an alternative to our current /etc/hosts and /etc/services files, or an augmentation of them. This alternative must be in the form of a file itself and must be easily distributed via the same means
  6. Perhaps also add support for per-user resolution files

Interestingly, DNS already has mechanisms to allow dynamic update to its records. If it were used, an application started as a user could update part of a selected zone to announce its presence. There would definitely be some security implications, though. Unlike the typical network situation where a whole machine can be assumed to be controlled by a single individual, ports on localhost could be opened by malicious peer individuals. On seeing that a particular user has port 1337 in use, the attacker may open that port after the user logs out in the hope that the next login will trigger the user to access the same port. The trusted program may not be able to update resolution records quickly enough to prevent client applications from connecting to this apparently-valid port. As well as clients being validated to servers, servers must be conclusively validated to clients. This may require a cookie system different to the traditional one-way web cookies.

Back on the subject of resolution, it may be possible to set up a small DNS server on each host that was used in the default resolution process. It could support forwarding to other sources, and serving and updating of information relevant to the local host's services. It need not listen on external network ports, so would not be a massive security hole... but convincing all possible users to run such a service in a mode that allows ad hoc service starting and registration may still be a stretch. They may already have their own DNS setups to contend with, or may simply trust /etc/hosts more.

Benjamin

[1] Except for Microsoft products, strangely...

Sun, 2005-Jun-05

Naming Services for Ports

I've been going on ad nauseam about the handling of ad hoc user daemons for a few days now. I've been somewhat under the weather, even to the point of missing some of both work and HUMBUG, so some of it may just be the dreaded lurgie talking. On the premise that the line between genius and madness is thin and the line between madness and fever even thinner, I'll pen a few more words.

So I'm looking for a stable URI scheme for getting to servers on the local machine that are started by users. I thought a little about the use of UNIX sockets, but they're not widely enough implemented to gain traction and don't scale to thin-client architectures. It seems that IP is still the way forward, and because we're talking REST, that's HTTP/TCP/IP.

Now the question turns to the more practical "how?". We have a good abstraction for IP addresses that allows us to use a static URI even though we're talking to multiple machines. We call it DNS, or perhaps NIS or LDAP. Maybe we call it /etc/hosts, and perhaps we even call it gethostbyname(3) or some more recent incarnation of that API. These models allow us to keep the https://example.com/ URI, even if example.com decides to move from one IP address to another. It even works if multiple machines are sharing the load of handling this URI through various clever hacks. It scales to handling multiple machines per service nicely, but we're still left with this problem of handling multiple services per machine. You see, when we talk about https://example.com/ we're really referring to https://example.com:80/.

There's another way to refer to that URI, which may or may not be available on your particular platform setup. It's https://example.com:http/. It's obviously not something you want to type all of the time and is a little on the weird side with its multiple invocations of http. On the other hand, it might allow you to vary the port number associated with http without changing our URI. Because we don't change the URI we can gain benefits for long-term hyperlinking as well as short-term caching mechanisms. We just edit /etc/services on every client machine, and set the port number to 81 instead.
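
The lookup half of that is already available to programs, for what it's worth; the hard part is distributing the mapping:

import socket

# Consults /etc/services (or the platform equivalent): "http"/"tcp" -> 80.
port = socket.getservbyname("http", "tcp")
# Repointing http at port 81 would mean editing /etc/services on every client,
# which is exactly the distribution problem discussed below.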

Hrrm... slight problem there, obviously. Although DNS allows the meaning of URIs that contain domain names to be defined by the URI owner, port naming is handled in a less dynamic manner. With DNS, the owner can trust clients to discover the new meaning of the URI when the time comes to actually retrieve data. They bear extra expenses for doing so, but it is worth the benefit.

Let's assume that we'll be able to overcome this client discovery problem for the moment, and move over to the other side of the bridge. You have a process that gets started as part of your login operation to serve your own data to you via a more convenient interface than the underlying flat files would provide. Maybe you have a service that answers SQL queries for your sqlite files. You want to be able to enter https://localhost:myname.sqlite/home/myname/mydatabase?SELECT%20*%20FROM%20Foo into your web browser and get your results back, perhaps in an XML format. Maybe you yourself don't want to, but a program you use does. It doesn't want to link against the sqlite libraries itself, so it takes the distributed application approach and replaces an API with a protocol. Now it doesn't need to be upgraded when you change from v2.x to v3.x of sqlite. Put support in for https://localhost:postgres/*, and https://localhost:mysql/* and you're half-way towards never having to link to a database library again. So we have to start this application (or a stand-in for it) at start-up. What happens next?
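
As a sketch of the sort of service I mean, using only the python standard library. The path-for-database, query-string-for-SQL convention and the port number are my own inventions (a numeric port, because of the problem I'm about to run into with named ones):

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, unquote
from xml.sax.saxutils import escape
import sqlite3

class SqliteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        parsed = urlparse(self.path)
        database, query = parsed.path, unquote(parsed.query)
        rows = sqlite3.connect(database).execute(query).fetchall()
        body = "<rows>%s</rows>" % "".join(
            "<row>%s</row>" % "".join("<col>%s</col>" % escape(str(col)) for col in row)
            for row in rows)
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.end_headers()
        self.wfile.write(body.encode("utf-8"))

# Serve only on the loopback interface, on an arbitrary fixed port.
HTTPServer(("127.0.0.1", 8307), SqliteHandler).serve_forever()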

This is the divide I want to cross. Your application opens ports on the physical interfaces you ask it to, and implicitly names them. The trick is to associate these names with something a client can look up. On the local machine you can edit /etc/services directly, so producing an API a program can use to announce its existence in /etc/services might be a good way to start the ball rolling. Huhh. I just noticed something. When I went to check that full stop (.) characters were permitted in URI port names I found they weren't. In fact, I found that rfc3986 and the older rfc2396 omit the possibility of not only the full stop, but any non-digit character. Oh, bugger. I thought I might have actually been onto something... and if I type a port name into Firefox 1.0.4 it happily chops that part of the URL off for me.

Well, what I would have said if I hadn't come across that wonderful piece of news is that once you provide that API you can access your varied HTTP-protocol services on their dynamically allocated port numbers with a static URI because the URI only refers to names, not actual numbers. That would have been a leading off point into how you might share this port ownership information in the small-(virtual)-network and thin-client cases where it would matter.

That's a blow :-/

So it seems we can't virtualise port identification in a compliant URI scheme. Now that rfc3986 has come out you can't even come up with alternative authority components of the URI. DNS and explicit ports are all that are permitted, although rfc2396 allows for alternate naming authorities to that of DNS. The only way to virtualise would be to use some form of port forwarding, in which case we're back to the case of pre-allocating the little buggers for their purposes and making sure they're available for use by the chosen user's specific invocation.

Well, root owns ports less than 1024 under unix. Maybe it's time to allocate other port spaces to specific users. The user could keep a registry of the ports themselves, and would just have to live with the ugliness of seeing magic numbers in the URIs all of the time. It's that, or become resigned to all services that accept connections defaulting to a root-owned daemon mode that performs a setuid(2) after forking to handle the requests of a specific user. Modelling after ssh shouldn't be all that detrimental, although while ssh sessions are long-lived http ones are typically short. The best performance will always be gained by avoiding the bottleneck and allowing a long-lived server process to handle the entire conversation with its client processes. Starting the process when a connection is received is asking for a hit, just as using an intermediate process to pass data from one process to another will hurt things.

Another alternative would be to try and make these "splicing" processes more efficient. Perhaps a system call could save the day. Consider the case of processes A, B, and C. A connects to B, and B determines that A wants to talk to C. It could push bytes between the two ad nauseam, or it could tell the kernel that bytes from the file descriptor associated with A should be sent directly to the file descriptor associated with C. No extra context switches would be required, and theoretically B's interference could end up imposing no further performance hit.

Maybe a simple way of passing file descriptors between processes would be an easy solution to this kind of problem. I seem to recall a mechanism to do this in UNIX Network Programming, but that is currently at work and I am currently at home. Passing the file descriptor associated with A between B and C as required could reduce the bottleneck effect of B.
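
The mechanism in UNIX Network Programming is descriptor passing via SCM_RIGHTS ancillary data over a unix-domain socket. A sketch in modern python (socket.sendmsg and recvmsg only arrived in python 3.3, so this is a restatement rather than anything that would have run at the time):

import array, socket

def send_fd(sock, fd):
    # A byte of ordinary data must accompany the ancillary message.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                           array.array("i", [fd]))])

def recv_fd(sock):
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            fds.frombytes(data[:fds.itemsize])
    return fds[0]

# Demonstrated within a single process, just to show the mechanics:
ours, theirs = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
with open("/etc/hosts", "rb") as original:
    send_fd(ours, original.fileno())
    duplicate = recv_fd(theirs)
print(open(duplicate, "rb").read())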

Oh well, I'm fairly disillusioned as I tend to be at the end of my more ponderous blog entries.

Benjamin

Thu, 2005-Jun-02

User Daemons

The UNIX model of interprocess communication comes in two modes. You have your internal IPC, such as pipes and unix sockets. You have your external IPC, which pretty-much boils down to IP-based sockets.

The internal model works well for some kinds of interactions, but unix sockets (the internal ones) have pretty much faded into obscurity. On the other side of the fence IP-based sockets are now universal. So much so that IP sockets over the loopback interface are commonly used for what is known to be, and will only ever be, communication restricted to a single machine.

If IP sockets are so widely accepted, why aren't UNIX sockets?

I think the primary answer is the lack of a URI scheme for unix sockets. You can also add to that the unlikelihood that Microsoft will ever implement them, and so coding to that model may leave you high and dry when it comes time to port to the world's most common desktop operating system.

It's the desktop that should be benefiting most from local sockets. Anything involving multiple machines must and should use IP sockets. This leaves the web as a wide open playing field, but the desktop is still a place where communication is difficult to orchestrate.

Speaking of orchestras, how about a case study?

The symphony desktop is being developed by Ryan Quinn, under the guidance of Jason Spisak of Lycos fame. The general concept is that the desktop is a tweaked copy of Firefox. A "localhost-only" web server serves regular mozilla-consumable content such as Javascript-embedded HTML as the background. You run applications over the top of the background, and bob's your uncle. You can develop applications for the desktop as easily as you can put them together on the web.

A localhost-only web server is presumably one that just opens a port on the loopback interface, 127.0.0.1 in IPv4 terminology. This prevents intruders from attacking the web server and accessing any data the web server may hold... but what about other users?

It turns out that other users can access this data just fine, so despite the amount of promise this approach holds it is pipped at the post for any multi-user work. Not only can other users access your data (which admittedly you could solve via even a simple cookie authentication system), other users can't open the same port and therefore can't run their own desktop while you are running yours.

I've been bitching about this problem of accessing local application data in various forms lately. I'm concerned that if we can't bring the best web principles back to the desktop, then desktop solutions may start migrating out onto the web. This would be a shame (and is already happening with the "I'm not corba, honest" SOAP-based web services). I really want to see the URI concept work without having to put one machine on the network for each active user! :)

So what are the requirements when deploying a desktop solution for inter-process communication? Here's a quick list:

  1. The URI for a particular piece of user data must be stable
  2. The URIs for different pieces of user data and for the data of different users must not clash
  3. The URI scheme must be widely understood
  4. A user must be able to run multiple services, not just one
  5. It should be possible to refer to local files when communicating to the service
  6. The service should run as the user, and have access to the user's data
  7. The service must not grant access to the user's data when their own permissions would not allow access
  8. It should be possible to move data from the "private data" set to the "exposed to the internet" data set easily.

Using IP-based sockets fails on several counts. If you use a static port number for everyone the URIs clash between users. If you use dynamic port allocation the URI changes because the port number becomes part of the URI. If you pre-allocate ports for each user the users won't be able to run extra services without first consulting the super-user. If you don't preallocate them, you may find they're not available when you want them!

These are the kinds of problems unix sockets were really meant to address, using files as their keys rather than IP and port numbers. Unfortunately, without wide acceptance of this model and without a way to encode this access-point information into a URI that is also widely accepted we run aground just as quickly.

Consider the use cases. The first is that I want to be able to access data via another application, because it understands the data and I only understand the interface to that application. MIME gets us half-way, by telling us which application to run in order to understand the data... but it is currently a user-facing concept. It tells you how to start the program, but doesn't tell you how to talk to it.

The second is the kind of use that symphony is making, where you don't know about the files and all you care about is the interface. You think of the address of this interface as the data itself. It looks like a file, but there is a dynamic application behind it.

I don't think I have any answers right now.

Update: Google desktop search uses port 4664 for its own local web server. This leads to the same issues as the symphony desktop, with clashes between multiple users. Consider the case where Google ported the desktop search to symphonyos. Now you have two separate services from separate vendors that you want to connect to on different ports to prevent coupling... but in order to make them available to multiple users you have to pre-allocate two ports per user. Urggh.

On the other hand, using an IP-based socket solution rather than local sockets does allow you to move to a thin-client architecture where the desktop and Google search are served on the server while clients just run a web browser. Perhaps the only answer is to serve the data of all users from a single port for a specific application, and to use internal mechanisms to cleanly separate the data of each user.

Benjamin

Sun, 2005-May-29

Symphony Operating System

For those who read planet.gnome.org even less frequently than I do[1], Nat Friedman has spotted a very neat little desktop mockup by the name of Symphony.

For those who are into user interface design, this looks quite neat. They've done away with pop-up menus on the desktop, replacing them with a desktop that becomes the menu. I suspect this will greatly improve the ability to navigate such menus. Users won't have to hold the mouse down while they do it. They also use the corners of the desktop as buttons to choose which menu is currently in play. Very neat. I'd like to see it running in practice.

Oooh, their development environment Orchestra sounds interesting, too. I wonder how well it will support REST principles, and how it will isolate the data of one user from that of another user...

Benjamin

[1] Actually, Nat is someone I currently have a subscription to.

Sun, 2005-May-29

Dia

I've spent some of this afternoon playing with Dia. I have played with it before and found it wanting, but that was coming from a particular use case.

At work I've used Visio extensively, starting with the version created before Microsoft purchased the program and began integrating it with their office suite. As I've mentioned previously I use a public domain stencil set for authoring UML2 that I find useful in producing high-quality print documentation. When I used Dia coming from this perspective I found it very difficult to put together diagrams that were visually appealing in any reasonable amount of time.

Today I started from the perspective of using Dia as a software authoring tool, much like Visio's standard UML stencils are supposed to support but with my own flavour to it. Dia is able to do basic UML editing and because it saves to an XML file (compressed with gzip) it is possible to actually use the information you've created. Yay!

I created a couple of xsl stylesheets to transform a tiny restricted subset of Dia UML diagrams into a tiny restricted subset of RDF Schema. I intend to add to the supported set as I find a use for it, but for now I only support statements that indicate the existence of certain classes and of certain properties. I don't currently describe range, domain, or multiplicity information in the RDFS, but this is only meant to be a rough scribble. Here's what I did:

  1. First, uncompress the dia diagram:
    $ gzip -dc foo.dia > foo.dia1
  2. Urrgh. That XML format looks terrible:

        <dia:object type="UML - Class" version="0" id="O9">
          <dia:attribute name="obj_pos">
            <dia:point val="20.6,3.55"/>
          </dia:attribute>
          <dia:attribute name="obj_bb">
            <dia:rectangle val="20.55,3.5;27,5.8"/>
          </dia:attribute>
    

    It's almost as bad as the one used by gnome's glade! I'm particularly averse to seeing "dia:attribute" elements when you could have used actual XML attributes and saved everyone a lot of typing. The other classic mistake they make is to assume that a consumer of the XML needs to be told what type to use for each attribute. The fact is that the type of a piece of data is the least of a consumer's worries. They have to decide where to put it on the screen, or which field to insert it into in their database. Seriously, if they know enough to use a particular attribute they'll know its type. Just drop it and save the bandwidth. Finally (and for no apparent reason) strings are bounded by hash (#) characters. I don't understand that at all :) Here's part of the xsl stylesheet I used to clean it up:

      <xsl:for-each select="@*"><xsl:copy/></xsl:for-each>
      <xsl:for-each select="dia:attribute[not(dia:composite)]">
        <xsl:choose>
          <xsl:when test="dia:string">
            <xsl:attribute name="{@name}">
              <xsl:value-of select="substring(*,2,string-length(*)-2)"/>
            </xsl:attribute>
          </xsl:when>
          <xsl:otherwise>
            <xsl:attribute name="{@name}">
              <xsl:value-of select="*/@val"/>
            </xsl:attribute>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
      <xsl:apply-templates select="node()">
        <xsl:with-param name="parent" select="$parent"/>
      </xsl:apply-templates>
    

    Ahh, greatly beautified:
    $ xsltproc normaliseDia.xsl foo.dia1 > foo.dia2
    <dia:object type="UML - Class" version="0" id="O9" obj_pos="20.6,3.55" obj_bb="20.55,3.5;27,5.8" elem_corner="20.6,3.55"...
    This brings the uncompressed byte count for my particular input file from in excess of 37k down to a little over 9k, although it only reduces the size of the compressed file by 30%. Most importantly, it is now much simpler to write the final stylesheet, because now I can get at all of those juicy attributes just by saying @obj_pos, and @obj_bb. If I had really been a cool kid I would probably have folded the "original" attributes of the object (type, version, id, etc) into the dia namespace while allowing other attributes to live in the null namespace.

  3. So now that is complete, the final stylesheet is nice and simple (I've only cut the actual stylesheet declaration, including namespace declaration):

    <xsl:template match="/">
    <rdf:RDF>
            <xsl:for-each select="//dia:object[@type='UML - Class']">
                    <xsl:variable name="classname" select="@name"/>
                    <rdfs:Class rdf:ID="{$classname}"/>
                    <xsl:for-each select="dia:object.attributes">
                    <rdfs:Property rdf:ID="{concat($classname,'.',@name)}"/>
                    </xsl:for-each>
            </xsl:for-each>
            <xsl:for-each select="//dia:object[@type='UML - Association']">
                    <rdfs:Property rdf:ID="{@name}"/>
            </xsl:for-each>
    </rdf:RDF>
    </xsl:template>
    

    Of course, it only does a simple job so far:
    $ xsltproc diaUMLtoRDFS.xsl foo.dia2 > foo.rdfs

    <rdf:RDF xmlns...>
      <rdfs:Class rdf:ID="Account"/>
      <rdfs:Property rdf:ID="Account.name"/>
      <rdfs:Class rdf:ID="NumericContext"/>
      <rdfs:Property rdf:ID="NumericContext.amountDenominator"/>
      <rdfs:Property rdf:ID="NumericContext.commodity"/>
    ...
    

My only problem now is that I don't really seem to be able to do anything much useful with the RDF schema, other than describe the structure of the data to humans, which the original diagram does more intuitively. I do have a script which constructs an sqlite schema from rdfs, but I really don't have anything to validate the rdfs against. I'm not aware of any program that will validate RDF data against the schema. Perhaps there's something in the Java sphere I should look into.
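
For the curious, the schema construction doesn't need to be anything sophisticated. This is a sketch rather than the script I actually use; it treats every Class.property ID as a TEXT column of a table named after the class:

import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS = "{http://www.w3.org/2000/01/rdf-schema#}"

def rdfs_to_sql(rdfs_file):
    root = ET.parse(rdfs_file).getroot()
    tables = {cls.get(RDF + "ID"): [] for cls in root.findall(RDFS + "Class")}
    for prop in root.findall(RDFS + "Property"):
        name = prop.get(RDF + "ID")
        if "." in name:
            cls, column = name.split(".", 1)
            tables.setdefault(cls, []).append(column)
    for cls, columns in tables.items():
        cols = ", ".join(["id INTEGER PRIMARY KEY"] + [c + " TEXT" for c in columns])
        yield "CREATE TABLE %s (%s);" % (cls, cols)

for statement in rdfs_to_sql("foo.rdfs"):
    print(statement)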

The main point, though, is that Dia has proven a useful tool for a small class of problems. Schema information that can be most simply described in a graphical format and is compatible with Dia's way of doing things can viably be part of a software process.

I think this is important. I have already been heading down this path lately with XML files. Rather than trying to write code to describe a constrained problem space, I've been focusing on nailing down the characteristics of the space and putting them into a form that is human and machine readable (XML) but is also information-dense. The sparsity of actual information in some forms of code (particularly those dealing with processing of certain types of data) can lead to confusion as to what the actual pass/fail behaviour is. It can be hard to verify the coded form against a specification, and hard to reverse-engineer a specification from existing code. The XML approach allows a clear specification, from which I would typically generate rather than write the processing code. After that, hand-written code can pass that information on or process it in any appropriate way. That hand-written code is improved in density because the irrelevant rote parts have been removed out into the XML file.
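
A deliberately trivial example of the pattern, with a spec format invented on the spot, just to show the shape of it:

import xml.etree.ElementTree as ET

SPEC = """
<record name="Transaction">
  <field name="date"   type="str"/>
  <field name="payee"  type="str"/>
  <field name="amount" type="float"/>
</record>
"""

def generate_parser(spec_xml):
    record = ET.fromstring(spec_xml)
    fields = [(f.get("name"), f.get("type")) for f in record.findall("field")]
    lines = ["def parse_%s(values):" % record.get("name").lower(),
             "    # Generated from the XML spec; do not edit by hand."]
    for position, (name, kind) in enumerate(fields):
        lines.append("    %s = %s(values[%d])" % (name, kind, position))
    lines.append("    return {%s}" % ", ".join("'%s': %s" % (n, n) for n, _ in fields))
    return "\n".join(lines)

print(generate_parser(SPEC))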

So what this experiment with Dia means to me is that I have a second human- and machine-readable form to work with. This time it is in the form of a diagram, and part of a tool that appears to support some level of extension. I think this could improve the software process even more for these classes of problem.

Benjamin

Fri, 2005-May-20

Project Harmony

Everyone else has gotten their two cents in, so I had better too. I think Project Harmony as proposed is a good thing.

Novell won't back Java[1]. Redhat won't ship Mono. So where does this leave open source?

I think a lot of the current interest in open source didn't come from "Linux" per se, but tools such as Apache and Perl. They were free of cost. They fit with the needs of users. They filled the needs of users better than commercial equivalents, and it was clear that a commercial clone of either would fail.

Apache is still "The number one HTTP server on the Internet"[2], but I think open source is losing the language war. It's been years since exciting things were really happening that commercial software wasn't doing better.

C and C++ are dead languages[3], but they continue to be the foundation of most of the free software out there. The effort that's going into supporting that system of libraries is wasted, because the programmers of today don't really want to talk to you if you don't have garbage collection. They don't want to talk to you if you have a language as complicated as C++[4]. They don't want to talk to you if you can't make DOM and SAX XML models talk to each other. They don't want to talk to you if you can't tell them straight to their face which HTTP client code to use, or which interface specification to code to when they want to write a servlet. We're playing silly buggers with ideas that should be dead to computer science, and the commercial players are light years ahead in delivering actual user functionality.

Perl's CPAN was a killer app when C++ was big. If you wanted an implementation of a particular feature you just picked it up. Nowadays most such features are part of the standard library of your language. Why would you want another source? That's what we're missing.

We're still arguing about which languages we need to use to build core desktop components! Heavens' sakes! When a programmer opens up the open source toolkit for the first time we can't give them an adequate place to start.

You could code in C, but that's like using punchcards for getting user functions in place. You could code in C++, but that's far too complicated. It won't give you access to basic functionality, nor will it provide an upgrade path for when the next big thing does arrive. Also, if you're not coding in C you have to think very hard about how your code will be reused. Chances are, if it's not in C you can forget about it... but if you do use C you'll spend the rest of your life writing and maintaining bindings for every language under the sun.

We used to be able to solve all our problems with the open source tools we had, but they were smaller problems back then and the productivity we demanded from ourselves was less. Now it's hard to know where to start when implementing something as simple as an accounting system.

There are many reasons to get behind Mono. It has the makings of a great piece of technology for delivering user functions. Unfortunately, it has the legal stink surrounding it that comes from being a clone of commercial software from a company that makes money off IP lawyering. It seems that Java is currently unencumbered. Despite Sun's recent cosiness with Microsoft, they seem to be good citizens for now.

From my perspective we either have to pick one of the two horses, or invent a new language again. If we pick that third option we'd better be ready to make it fill needs that can't be filled by the current solutions, or it simply won't be swallowed. If we can't pick Mono, then Java has to be our launching point. Once we have a useful Java platform, then we can think about whether we want to stick with Java longer term or whether we want to fork in the same way Microsoft forked to create C#. Any way we go it's a technical and legal minefield out there and we don't really have a 500 pound gorilla to clear the way for us.

I suppose I had better clearly state my position on python, perl and friends going forwards. To be frank I don't hold out much hope. Parrot seems to be stagnant. It has been "in the early phases of its implementation" for as long as I can recall, having had its 0.1 release in December 2001. Without a common virtual machine, I don't see Perl and Python talking to each other freely any time soon. If that virtual machine doesn't also run the code in your mythical compiler-checked uber-language replacement for C# and Java then they won't talk to each other at the library level either. As far as I am concerned it is the lack of ability to write a library and know it will be able to be used everywhere it's needed that is at the core of what I hate about working with open source tools right now. I don't care how easy it is to write in your personal favourite language, or how elegant it is. I just want to know whether I'll be spending the rest of my life writing and rewriting it just to end up with the same amount of functionality I started with. It's just as frustrating from the library user perspective, too. I'm forever looking at libraries and thinking "I need that functionality, I wonder if it's available in a python binding?"[5]. I don't care how easy such a binding would be to create. I don't want to maintain it! The code's author probably doesn't want to maintain it either! There has to be a better way!

So... what's left?

I think we end up with three options for making open source software authoring bearable again:

  1. Pick a language, and have everyone use it
  2. Pick a calling convention (including contracts regarding garbage collection, mutability and the like), and have everyone use it
  3. Start from scratch and find another way for pieces of software to communicate with each other (such as http over the local desktop), and have everyone use it.

Given these options, I applaud project Harmony for making the attempt at positioning us with a reasonable starting point. I hope they succeed. Organisationally, I trust Apache more than GNU with their existing classpath infrastructure to do the job and get it right. Apache have a lot of Java experience, and that includes dealing through the Java Community Process with Sun and other Java interests. On the technical level they seem to be taking a bigger-picture approach than classpath, also. I think they can get a wider base of people onto the same page than classpath has managed. Simultaneously, I think we should continue experimenting with alternative ways for software to communicate. Maybe we'll meet in the middle somewhere.

Benjamin

[1] Ok, that's a huge oversimplification

[2] According to their website

[3] ... or dead languages walking, on their way to the electric chair

[4] As a C++ programmer I thank the Java luminaries for removing features like operator overloading!

[5] I consider python the most productive language presently for writing software in the open source sphere. I don't think it's a good language once you hit a few thousand lines of code... I need compile-time checks to function as a programmer.

Thu, 2005-May-19

How about local CGI?

Sharing libraries between applications so they can access common functionality is the old black. Talking to an application that embodies that functionality is the new black. It's failed before with ideas like CORBA, and looks to fail again with SOAP, but REST is so simple it almost has to work.

REST can work on the web in well-understood ways, but what about the desktop? I've pondered about this before, but I have a few more concrete points this time around.

The http URI scheme describes clearly how to interact with network services given a particular URL. The file scheme is similarly obvious, assuming you want to simply operate on the file you find. HTTP is currently more capable, because you can extend it with CGI or servlets to serve non-file data in GET operations and to accept and respond to other HTTP commands in useful ways. File is not so extensible.

This is silly, really, because files are actually much more capable than http interfaces to other machines. You can decide locally which application to use to operate on a file. You can perform arbitrary PUT and GET operations with the file URI, but you know you'll always be dealing with static data rather than interacting with a program behind the URI.

How's this for a thought? Use the mime system to tell you which application to run for a particular URI pointing to the local file system. Drop the "net" part of a http URI for this purpose, so https://home/benc/foo.zip becomes our URL. To look it up, we start the program we associate with HTTP on zip files and start a HTTP conversation with it. Using REST principles this URI probably tells us something about the ZIP file, perhaps its content.

Now comes the interesting part. Let's extend the URI to allow the HTTP zip program to return URI results from within the file. Let https://home/benc/foo.zip/bar.txt refer to the bar.txt inside foo.zip's root directory. We download it via HTTP.

Now everyone who can speak this "local" dialect of HTTP can read zip files. I can read zip files from my web browser, or my email client. So long as they all know how to look up the mime type, start the responsible application, and connect to that application. A completely independent server-side application that offers the same interface for arj files can be written without having to share code with the zip implementation, and they can be plugged in without reference to each other to increase the functionality of programs they've never heard of.
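
The zip side of that server is almost embarrassingly small in python. A sketch, where the rule that everything after ".zip/" names a member of the archive is my own invention:

import zipfile

def get_from_zip_uri(path):
    """Resolve e.g. /home/benc/foo.zip/bar.txt to the bytes of bar.txt."""
    archive_path, separator, member = path.partition(".zip/")
    if separator:
        archive_path += ".zip"
    with zipfile.ZipFile(archive_path) as archive:
        if not member:                    # no member named: list the contents
            return "\n".join(archive.namelist()).encode("utf-8")
        return archive.read(member)

print(get_from_zip_uri("/home/benc/foo.zip/bar.txt"))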

The tricky bits are

  1. Deciding on a URI scheme.
    The use of http may cause unforeseen problems
  2. Deciding how to connect to the program.
    Hopefully looking up the mime rules will be simple and standard for a given platform, but you also need to connect to the app you start. It could be done with pipes (in which case you would need a read pipe and a write pipe rather than the traditional bi-directional network socket), or via some other local IPC mechanism.
  3. Deciding which program to connect to.
    Maybe the actual mime side of things is more controversial than I'm making out. To be honest, my suggestion is not 100% in line with how web servers themselves would handle the situation. There are probably suggestions to be made and incorporated into the model if it is to be quite as functional as these beasties.
  4. Implementing it everywhere.
    If the functionality provided by this interface was so killer compared to what you could achieve across the board by reusing libraries, I think it would catch on pretty quickly. Just think about the hours wasted in building language bindings for C libraries these days. Microsoft want to get rid of the problem by having everyone agree on a common virtual machine and calling convention. Others see a problem with using Microsoft's common virtual machine ;) Perhaps HTTP is so universal that it would see adoption on the desktop, just as it has seen universal adoption on the Internet itself.

Oh, and don't forget to include HTTP subscription in the implementation ;)

Benjamin

Wed, 2005-May-18

We're hiring

The company I work for[1] is hiring

Benjamin

[1] I don't know. You tell me which page to link to...

Sun, 2005-May-15

Coding for the compiler

In my last column I wrote about deliberately architecting your code in such a way that the compiler makes your code more likely to be correct. I used an example lifted from Joel Spolsky regarding "safe" and "unsafe" strings. I indicated they should be made different types so that the compiler is aware of, and can verify, the rules associated with assigning one to the other or using one in place of the other. I thought I'd bring up a few more examples of coding conventions I follow to use the compiler to best effect.

Don't initialise variables unless you have a real value for them

Junior programmers are often tempted to initialise every variable that they declare, and do so immediately. They do so to avoid messy compiler warnings, or because they think they are making the code cleaner and safer by providing a value where there would otherwise be no value.

That's a bad idea. Whether it be initialising an integer to zero or a pointer to null, whenever you give a variable a value it doesn't really have you are lying to your tools. You're giving the variable a value that is of no use to your program, but at the same time you are quelling the concerns of both your compiler and any memory analyser you might be using (such as purify).

Don't declare variables until you have a real value for them

This rule answers the question about how you deal with the warnings that would come from not assigning values to your variables. The simple pseudo-code you should always follow when declaring a variable is either one of:

MyType myVariable(myValue);

or

MyType myVariable;
if (condition)
{
	myVariable = myValue1;
}
else
{
	myVariable = myValue2;
}

Any other form is pretty-well flawed. In the first case, the variable is always assigned a useful value, and it happens immediately. In the second case, every branch must make a conscious decision about what to set your variable to. Sometimes variables come in pairs, or other tuples that need to be initialised together. Sometimes your conditions and branching are much more complicated than in this case. By not initialising the variable up front and instead initialising it in every branch, you allow the compiler to find any cases you might have missed. If you set an initial value for the variable before a conditional that may assign values to it, you stop the compiler complaining about branches where the variable is not assigned to, and you allow the introduction of bugs through unconsidered branches.

Always declare your else branch

Every "if" has an implicit "else". You need to consider what happens in that case. The variable assignment structure I've laid out forces you to declare your else branch(es) with the content you might otherwise have placed in the variable's declaration. This is a good thing. Considering all branches is a critial part of writing good code, and it is better to be in the habit of doing it than in the habit of not. Even if you don't have content for one side of the branch or the other, place a comment to that effect in the branch. Explain why. This point won't help the compiler find bugs in your code, but it is good to keep in mind.

Only assign to "external" variables once per loop iteration

This is a bit hard to write down as a hard and fast rule, but it is another one of those variable initialisation issues. Sometimes you'll have a loop which changes a variable declared outside of the loop. You'll be tempted to add one or subtract one at various points in the loop, but it's the wrong approach if you want help from the compiler. Instead, declare a variable which embodies the effect on the external variable. Use the same technique as I've outlined above to allow the compiler to check you've considered all cases. At the end of the conditionals that make up the body of your loop, apply the effect by assignment, addition, or whatever operation you require.

string::size_type pos(0);	// a real value: begin the search at the start of the string
do
{
	string::size_type comma(myString.find(',',pos));
	string::size_type newpos;
	if (comma == string::npos)
	{
		// None found
		newpos = string::npos;
	}
	else
	{
		myString[comma] = ';';
		newpos = comma+1;
	}
	pos = newpos;
}
while (pos != string::npos);

On the subject of loops...

Use a single exit point for loops and functions

When you have a single exit point, your code is more symmetrical and that tends to support the treatment of variables and cases that I'm suggesting. What you're looking for is...

  1. To make sure you clearly enumerate and consider every case your function has to deal with, and
  2. To give the compiler every opportunity to spot effects you may not have specified for each case

It is these principles that are at the core of my general mistrust of exceptions as a general unusual-case-handling mechanism. I prefer explicit handling of those cases. I won't go over my position on exceptions again just yet... suffice it to say they're certainly evil under C++ and I'm still forming an opinion as to whether the compiler support exceptions receive under Java is sufficient to make them non-evil in that context :). Give me a few more years to make up my mind.

Under C++, I consider the following loop constructs to be useful:

Pre-tested loops

while (condition) { /* body */ }
for (
	iterator it=begin();
	it!=end();
	++it
	) { /* body */ }
for (
	iterator it=begin();
	it!=end() && other-condition;
	++it
	) { /* body */ }

Mid-tested loops

for (;;) { /* body */; if (condition) break; /* body*/ }

Post-tested loops

do { /* body */ } while (condition);
for (;;) { /* body */; if (loop-scoped-condition) break; }

Don't use default branches in your switch statements

This goes back to the general principle of making all your cases explicit. When you change the list of conditions (i.e. the enumeration behind a switch statement) the compiler will be able to warn you of cases you haven't covered. This isn't the case when you have a default. In those cases you'll have to find a way to search for each switch statement based on the enumeration and assess whether or not to create a new branch for it.

I'm happy for multiple enumeration values to fold down to a single piece of code, for example:

switch (myVariable)
{
case A:
	DoSomething1();
	break;
case B:
	DoSomething2();
	break;
case C:
case D:
case E:
case F:
	// Do nothing!
	break;
}

When you add "G" to the condition the compiler will prompt you with a warning that you need to consider that case for this switch statement.

You have a typing and classing system, so use it!

This is the point of my previous entry. Just because two things have the same bit representation doesn't mean they're compatible. Declare different types for the two and state exactly what operations you can do to each. Don't get caught assigning a 64-bit integer into a 32-bit slot without range checking and explicit code ok-ing the conversion. Wrap each up in a class that stops you doing stupid things with them.

Use class inheritance sparingly

By class inheritance I mean inheritance of class B from class A, where A has some code. In Java this is called extension. It is usually safe to say that class B inherits interface A, but not class A. When you inherit from a class instead of an interface you are aliasing two concepts:

  1. What you can do with it
  2. How it works

Sharing what you can do with an object between multiple classes is the cornerstone of Object-Oriented design. Sharing how classes work is something I don't think we have a syntactically simple solution to, even after all these years. Class inheritance is one idea that requires little typing, but it leads to finding out late in the piece that what you want to do with two classes is the same, but their implementations need to be more different than you expected. At that point it takes a lot more typing to fix the problem.

So declare interfaces and inherit from interfaces where possible, instead of from classes. Contain objects, and delegate your functions to them explicitly ahead of inheriting from other classes. Sometimes class inheritance will be the right thing, but in my experience that's more like five cases out of every hundred.
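
A minimal sketch of the containment-and-delegation shape I have in mind (the class names are invented):

#include <string>

// Share "what you can do" through an abstract interface...
class Logger
{
public:
	virtual ~Logger() {}
	virtual void log(const std::string &message) = 0;
};

class FileLogger : public Logger
{
public:
	virtual void log(const std::string &message) { /* write to a file */ }
};

// ...and share "how it works" by containing and delegating, not extending.
class TimestampingLogger : public Logger
{
public:
	virtual void log(const std::string &message)
	{
		myDelegate.log(timestamp() + " " + message);
	}
private:
	std::string timestamp() const { return "2005-05-14T12:00:00"; }
	FileLogger myDelegate;
};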

I've strayed off the compiler-bug-detection path a little, so I'll finish up with two points.

Use the maximum useful warning settings

They're there for a reason

Treat warnings as errors

There's nothing worse than seeing 100 warnings when you compile a single piece of code and having to judge whether one of them is due to your change or not. Warnings become useless if they are not addressed immediately, and they are your compiler's best tool for telling you you've missed something. Don't think you're smarter than the machine. Don't write clever code. Dumb your code down to a point where the compiler really does understand what you're writing. It will help you. It will help other coders. Think about it.
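
With GCC, for example (other compilers have equivalent switches), both settings are a single command-line change:

g++ -Wall -Wextra -Werror -o myprogram myprogram.cpp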

Benjamin

Sat, 2005-May-14

Hungarian Notation Revisited

I don't subscribe to Joel Spolsky's blog Joel On Software, but I am subscribed to the blog of Sean McGrath. He has interesting things to say that differ from my opinions often enough to be a source of interest. This week he linked to Joel's article revisiting Hungarian notation, a very interesting read in which he dispels the notion that hungarian was ever meant to convey ideas like "int" and "double". Instead, he revisits the early incarnations of this idea and uses it to convey concepts such as "safe", "unsafe", "row identifier", and "column identifier".

I like Sean's python-centric view of the world, but I disagree with it. Joel's article focuses on a few examples, one of which is ensuring that all web form input is properly escaped before presenting it back to the user as part of dynamic content. He talks about using a "us" prefix for unsafe string variables and functions that deal with strings that haven't yet been escaped. He uses a "s" prefix for safe strings that have been escaped. Sean takes this to mean that all typing is poorly conceived and that python's way is right. He ponders that maybe we should be using this "classic" hungarian to specify the types of variables and functions so that programmers can verify they are correct on a single line without referring to other pieces of code.

Personally, I take the opposing view. I don't think that programmers should be the ones who have to check every line to make sure it looks right. That is the job of a compiler.

My view is that Joel is wrong to use a naming convention to separate objects of the same "type" (string) into two "kinds" (safe, and unsafe). That is exactly the reason strongly-typed object-oriented languages exist. By creating a safe-string and an unsafe-string class it is possible for the compiler to do all of the manual checking Joel implies in his article.
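
A rough sketch of what I mean (the class and function names are mine, not Joel's): give the two kinds distinct types, and make the escaping function the only way to produce the safe one:

#include <string>

class UnsafeString
{
public:
	explicit UnsafeString(const std::string &raw) : myRaw(raw) {}
	const std::string &raw() const { return myRaw; }
private:
	std::string myRaw;
};

// A SafeString can only be produced by escape(), so the compiler rejects
// any attempt to emit raw request input where a SafeString is required.
class SafeString
{
public:
	const std::string &text() const { return myText; }
	friend SafeString escape(const UnsafeString &input);
private:
	explicit SafeString(const std::string &text) : myText(text) {}
	std::string myText;
};

SafeString escape(const UnsafeString &input)
{
	std::string escaped;
	for (std::string::size_type i = 0; i < input.raw().size(); ++i)
	{
		const char c(input.raw()[i]);
		if (c == '&') escaped += "&amp;";
		else if (c == '<') escaped += "&lt;";
		else if (c == '>') escaped += "&gt;";
		else escaped += c;
	}
	return SafeString(escaped);
}

void writeToPage(const SafeString &s); // only escaped strings get written out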

I see python's inability to state the expectations a particular function has of its parameters as a flaw for "real work". It may be that the classic strongly-typed OO language constructs aren't the right way to do it, but it is clear that expectations of a function's parameters do exist. The function expects to be able to perform certain operations on them, and as yet python doesn't have a way for a compiler or interpreter to analyse those expectations and see that they are met.

Joel lists four levels of maturity for programmers in a particular target language.

  1. You don't know clean from unclean.
  2. You have a superficial idea of cleanliness, mostly at the level of conformance to coding conventions.
  3. You start to smell subtle hints of uncleanliness beneath the surface and they bug you enough to reach out and fix the code.
  4. You deliberately architect your code in such a way that your nose for uncleanliness makes your code more likely to be correct.

I would add a fifth:
You deliberately architect your code in such a way that the compiler makes your code more likely to be correct.

Joel uses hungarian to train his level-four nose for code smells. I use types to do the same thing. I think my way is better when the capability exists in the language. In python, it seems like the use of Hungarian Notation fills a real need.

Benjamin

Sun, 2005-May-08

Gnome 2.10 Sticky Notes

According to the Gnome 2.10 "what's new":

sticky notes now stay on top of other windows, so you can't lose them. To hide the notes, simply use the applet's right-click menu.

So now sticky notes have two modes:

  1. In my way, or
  2. Out of sight, out of mind

I do not consider this a feature. If you are using sticky notes with Gnome 2.8 or earlier, I do not recommend upgrading!

Update: I've placed a comment on the existing bug in Gnome bugzilla.

Benjamin

Sun, 2005-May-08

Describing REST Web Services

There has been some activity lately on trying to describe REST web services. I don't mean explain, people seem to be getting the REST message... but to describe them so that software can understand how to use them. I saw this piece during early April. The idea is to replace the SOAP-centric WSDL with a REST-centric language. Essentially it describes how to take a number of parameters and construct a URI from them. Here is their example:

<bookmark-service xmlns="http://example.com/documentation/service">
	<recent>https://example.com/{user}/bookmarks/</recent>
	<all>https://example.com/{user}/bookmarks/all</all>
	<by-tag>https://example.com/{user}/bookmarks/tags/{tag}/</by-tag>
	<by-date>https://example.com/{user}/bookmarks/date/{Y}/{M}/</by-date>
	<tags>https://example.com/{user}/config/</tags>
</bookmark-service>

Frankly, I don't agree. I've had some time to think about this, and I am now prepared to call it "damn foolish". This is not the way URIs are supposed to be handled. Note the by-date URI, which has an infinite number of possible combinations (of which some finite number will be legal, for the pedants). Look at the implicit knowledge required of which user parameters and which tag parameters are legal to use. It's rubbish.

There are two things you need to do to make things right. First off, there is a part of the URI which is intended to be constructed by the user agent rather than the server. This part is unremarkably called the query part, and exists after any question mark and before any fragment part of a URI reference. I consider this part to be the only appropriate space for creating URIs that are subject to possibly infinite variation. The {Y} (year) and {M} (month) parameters must follow a question-mark, as must {tag}.

The second thing you need to do is use RDF to describe the remaining (finite) set of URIs. Somewhere at the top of the URI path hierarchy you create a resource that lists or allows query over the set of users. If that's not appropriate, you give your user an explicit URI to their representation in the URI space. I, for example, might reside at "https://example.com/fuzzyBSc". At that location you place a resource that explains the rest of the hierarchy, namely:

<bookmark-service
	xmlns="http://example.com/documentation/service"
	rdf:about="https://example.com/fuzzyBSc"
	>
	<recent rdf:resource="https://example.com/fuzzyBSc/bookmarks/"/>
	<all rdf:resource="https://example.com/fuzzyBSc/bookmarks/all"/>
	<by-tag rdf:resource="https://example.com/fuzzyBSc/bookmarks/tags/"/>
	<by-date rdf:resource="https://example.com/fuzzyBSc/bookmarks/date/"/>
	<tags rdf:resource="https://example.com/fuzzyBSc/config/"/>
</bookmark-service>

Separately, and tied to the rdf vocabulary at "https://example.com/documentation/service" you describe how to construct queries for the by-tag and by-date resources. I don't even mind if you use the braces notation to do that (although clearly more information than just the name this method provides will be required, for example a date type wouldn't go astray!):

<URIQueryConstruction>
	<!-- tag={tag} -->
	<Param name="tag" type="xsd:string"/>
</URIQueryConstruction>
<URIQueryConstruction>
	<!-- Y={Y}&M={M} -->
	<Param name="Y" type="xsd:integer"/>
	<Param name="M" type="xsd:integer"/>
</URIQueryConstruction>

Alternately, you make https://example.com/fuzzyBSc/bookmarks/tags/ contain a resource that represents all possible sub-URIs. Again, this would be a simple RDF model presumably encoded as an RDF/XML document. Another alternative again would be to incorporate the tags model into the model at "https://example.com/fuzzyBSc/".

So I know that https://example.com/fuzzyBSc/bookmarks/tags/?tag=shopping (or https://example.com/fuzzyBSc/bookmarks/tags/shopping at the implementor's discretion) will lead me to shopping-related bookmarks owned by me, and that https://example.com/fuzzyBSc/bookmarks/date/?Y=2005&M=05 will fetch me this month's bookmarks.

Personally, I think that some REST proponents are confusing (finite) regular URI and (infinite) query URI spaces too much. There are important distinctions between them, particularly that you can deal with finite spaces without needing client applications to construct URIs using implicit knowledge that might be compiled into a program based on an old service description or explicit knowledge gleaned from the site.

The RDF approach (when available) allows sites to vary the construction of these finite URIs from each other so long as they maintain the same RDF vocabularies to explain to clients which resources serve their specific purposes. The substitution approach must always follow the same construction, or a superset of a common construction. This is probably not so important for a very specific case like this bookmark service, but may be important for topics of wider interest such as "how do I find the best air fare deal?". Using the RDF approach allows you to navigate using a very general scheme with few constraints on URI construction to sites that require detailed knowledge to be shared between client and server to complete an operation.

Update: A link summarising various current attempts

Benjamin

Sat, 2005-Apr-30

On URI Parsing

Whenever you find you have to write code that parses a URI, DON'T start with the BNF grammar in appendix A of the RFC. A character-by-character recursive descent parser will not work. If you are silly enough to keep pushing the barrow and translate the whole thing into a yacc grammar, your computer will explain why. URIs are not LALR(1), or LALR(k) for that matter. They would be if you could tokenise the pieces before feeding them into the parser generator, but you can't, because the same sequence of characters (e.g. "foo") could appear as easily in the scheme as in the authority part.

Instead, skip to appendix B and take the regular expression as your starting point. If you'd been smart enough to look at it closely the first time you would have realised you need to apply some kind of greedy consumption of characters to it. Start at the beginning, and look for a colon. Next, look for a slash, question-mark or hash. Finally look for hashes and question-marks to delimit the final pieces. Much easier, and no need to waste a whole day on it like I did! :)

I actually spent three days all up. The first was churning parsing techniques. The second was churning design and API. Finally, it was cleaning up and adding support for things like relative URIs. Phew, and minimal documentation update required.
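
For the record, here is a minimal sketch of that greedy splitting approach (the struct and function names are mine, and it ignores percent-encoding and validation entirely):

#include <string>

struct UriParts
{
	std::string scheme, authority, path, query, fragment;
};

UriParts splitUri(const std::string &uri)
{
	UriParts parts;
	std::string rest(uri);

	// Scheme: everything before the first ':', provided no '/', '?' or '#'
	// appears earlier (otherwise there is no scheme).
	std::string::size_type delim(rest.find_first_of(":/?#"));
	if (delim != std::string::npos && rest[delim] == ':')
	{
		parts.scheme = rest.substr(0, delim);
		rest = rest.substr(delim + 1);
	}

	// Authority: introduced by "//", terminated by '/', '?' or '#'.
	if (rest.compare(0, 2, "//") == 0)
	{
		std::string::size_type end(rest.find_first_of("/?#", 2));
		parts.authority = rest.substr(2,
			end == std::string::npos ? std::string::npos : end - 2);
		rest = (end == std::string::npos) ? std::string() : rest.substr(end);
	}

	// Fragment: everything after the first '#'.
	std::string::size_type hash(rest.find('#'));
	if (hash != std::string::npos)
	{
		parts.fragment = rest.substr(hash + 1);
		rest = rest.substr(0, hash);
	}

	// Query: everything after the first '?' (but before the fragment).
	std::string::size_type question(rest.find('?'));
	if (question != std::string::npos)
	{
		parts.query = rest.substr(question + 1);
		rest = rest.substr(0, question);
	}

	// Whatever remains is the path.
	parts.path = rest;
	return parts;
}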

I put my library together as part of RDF handling/generating work I've been assembling. My hope is that by putting my data in an RDF-compatible XML form I can increase its utility to other applications that might cache and aggregate results of the program I'm writing. Working this long on URIs specifically, though, has firmed an opinion I have.

A URI is generally defined as scheme:scheme-specific-part. The scheme determines how the namespace under the scheme is laid out, and has implications on the kinds of things you might be able to do with the resource. A "generic" URI is defined as scheme://authority/path?query#fragment (fragment is actually a bit special). If the generic URI is IP-based, then the authority section looks like userinfo@host:port. A http URI implies an authority (the DNS name and port), a path, and an optional query. When you give someone a http URI you are telling them that you own the right to use that DNS name and port, and to structure the paths and queries beneath that DNS name. Up until http was used for non-URL URNs, you were also giving them a reasonable surety that they could put the URI into a web browser to find out more.

http is obviously -not- the only scheme. Ftp is almost identical. It is IP-based and generic also, so uses the same DNS (or IP) authority scheme. It is perhaps more likely to include a userinfo part to its authority than that of http. When you give someone an FTP URI they can almost certainly plug it into a web browser, and what they get back will depend on the path, the DNS name, and the userinfo (they may be looking at files relative to userinfo's home area). You could define any number of other schemes. You could define a mysql scheme which identified a database in the same way and used (escaped) SQL in the query part. You could define your own scheme to break away from DNS to utilise the technically much better ENS (Eric's Naming Service? :)).

As far as I'm concerned, non-dereferenceable URNs should follow the same pattern as any IP-based generic URI. That's why the generic URI idea exists in the rfc, so we don't have to reinvent the wheel every time we come up with a new hierarchical namespace. We especially don't have to reinvent the wheel whenever we come up with a new IP-based hierarchical namespace, but we should use different schemes when we're implying different resources. When I give someone a URN that I know is not a URL (it doesn't point to anything) I should be flagging that to them with an appropriate scheme identifier. When I give them a "urn" URI with an authority component I want them to know that I am identifying something, that I do own the DNS part and the right to create paths within that DNS namespace and scheme, and that I am also transferring my knowledge that there's nothing at the other end. That's useful information, particularly if they're some kind of automated crawler bot or for that matter some kind of web browser. The crawler bot would know not to look up the URI as if it were a http URL, and the web browser would know to explain the problem to their user or to start searching for information "about" the URI instead of "at" the URI.

I think that RDF URIs that aren't URLs but don't tell anyone that they aren't URLs are broken and wrong. I really don't see the point of aliasing the different concepts together in such a confusing and misleading way. Is it for future expansion? "Maybe I'll want to put something at the URI some day". It doesn't sound very convincing to me. If you really want to do that, you're changing its meaning slightly anyway, and you can always tell everyone that the new URI is owl:sameAs the old one. I really don't understand it, I'm afraid.

Benjamin

Tue, 2005-Apr-12

A considered approach to REST with RDF content

I wrote recently about beginning to implement a REST infrastructure, and had some thoughts at the time about using RDF as content to enable browsing and crawling as part of the model. This is a quick note to tell you what I've come up with so far.

URIs

URIs in RDF files can be dereferenced by default, and you will usually find RDF content at the end point. Other content can also be stored, and you can tell the difference by looking at the returned Content-Type header.

GET

GET to a URI returns an RDF document describing all outgoing arcs from the URI. The limits of the returned graph are either the first named node encountered, or the first cycle. The graph's root is "about" itself. A simple object with non-repeating properties is represented something like:

GET /some-rdf HTTP/1.1
Host: example.com

<MyClass rdf:about="">
<MyClass.myProperty>myValue</MyClass.myProperty>
<MyClass.myReference rdf:resource="https://example.com/more-rdf"/>
</MyClass>

From this input it is possible to crawl over an entire RDF data model one object at a time. The model itself may refer to external resources which can be crawled or left uncrawled. From this perspective, the RDF model becomes an infinitely-extensible and linkable backbone of the web and can replace HTML in this role.

PUT

A PUT should seek to set the content of a URI without respect to prior state. In this case, I believe this should mean its semantics should attempt to create the subject URI. It should remove all existing assertions with this URI as the subject, and it should establish a new set of assertions in their place:

PUT /some-rdf HTTP/1.1
Host: example.com

<MyClass rdf:about="">
<MyClass.myProperty>myNewValue</MyClass.myProperty>
<MyClass.myReference rdf:resource="https://example.com/more-rdf"/>
</MyClass>

If PUT was successful, subsequent GET operations should return the same logical RDF graph as was PUT.

POST

While PUT should seek to set the content without respect to prior state, POST should seek to supplement prior state. To this end, it should perform a merge with the existing URI graph. The definition of this merge should be left to the server side, but in general all arcs in the POST graph should be preserved in the merged graph. Where it is not valid to supply multiple values for a property, or where conflict exists between new and existing arcs, the new arcs should take precedence.

POST /some-rdf HTTP/1.1
Host: example.com

<MyClass rdf:about="">
<MyClass.myProperty>myNewValue</MyClass.myProperty>
</MyClass>

DELETE

DELETE should seek to remove all outgoing arcs of the URI, possibly erasing server knowledge of the URI as a side-effect. Incoming arcs should not automatically be affected, though... except as the server chooses to enforce referential integrity constraints.
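
For completeness, and by analogy with the examples above, the request itself is trivial:

DELETE /some-rdf HTTP/1.1
Host: example.com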

Where we are now

I believe this gives us a fairly simple mechanism for uniformly exposing object properties such as the properties of a Java bean. Beans can then behave as they consider appropriate in response to changes. They themselves may propagate changes internally or may make further http requests. Internally, there should probably be an abstraction layer in place that makes the difference invisible (by hiding the fact that some objects are internal, rather than hiding the fact that some are external). Under this model objects can interact freely with each other across platforms and across language barriers.

Now, the big question is whether or not I actually use it!

So far I've been using an XAML encoding for object properties. It looks something like this:

<MyClass xmlns="http://example.com/ns" attr1="foo" attr2="bar">
<MyClass.myWhitespacePreservingProperty> baz </MyClass.myWhitespacePreservingProperty>
<MyClass.myObjectSubProperty><MyOtherClass/></MyClass.myObjectSubProperty>
</MyClass>

With RDF it would look like this:

<MyClass rdf:about="" xmlns="http://example.com/ns">
<MyClass.attr1>foo</MyClass.attr1>
<MyClass.attr2>bar</MyClass.attr2>
<MyClass.myWhitespacePreservingProperty> baz </MyClass.myWhitespacePreservingProperty>
<MyClass.myObjectSubProperty><MyOtherClass/></MyClass.myObjectSubProperty>
</MyClass>

or this:

<ex:MyClass rdf:about="" xmlns:ex="http://example.com/ns" ex:MyClass.attr1="foo" ex:MyClass.attr2="bar">
<ex:MyClass.myWhitespacePreservingProperty> baz </ex:MyClass.myWhitespacePreservingProperty>
<ex:MyClass.myObjectSubProperty><MyOtherClass/></ex:MyClass.myObjectSubProperty>
</ex:MyClass>

I have two problems with the RDF representation. The first is simple aesthetics. I like my XAML-based notation with object attributes in XML attributes. It's simple and easy to read, and I don't even have to refer to an external RDF namespace. The fact that XML attributes don't fall into the default namespace of their enclosing elements does not help things, meaning each must be prefixed explicitly. The second is transformation. Throwing all of those namespaces into attributes means I have to seed my XPath engine with QName prefixes like "ex:". I'd really rather deal with no namespaces, or just one. Internally I'm actually merging this information with a DOM that contains data from many other sources as well. Some are gathered over the network. Some are from user input. In merging these document fragments together I want to avoid removing existing elements (processing may be triggered from their events), and so my merge algorithm attempts to accumulate attributes for a single conceptual object together to be applied to beans or other objects.

Hrrm... I'm not making a very strong argument. I'd just feel happier if the XML rdf encoding treated @attr1 as equivalent to the sub-element ClassName.attr1 in the parent node's namespace. That would fit better to object encoding. Leave the references to external namespaces to explicit child elements and let the attributes stay simple.

Oh, well...

Benjamin

Sat, 2005-Apr-09

What do you store in your REST URIs?

I have been tinkering away on my HTTP-related work project, and have a second draft together of an interface to a process starting and monitoring application that we built and use internally. Each process has a name, and that is simple to match to a URI which contains read-only information about its state.

You can picture it in a URI like this: https://my.server/processes/myprocess, returning the XML equivalent of "the process is running, and is currently the main in a redundant pair of processes". It actually gets a little more complicated than that, with a single HTTP access point able to return the status of various processes across various hosts, to report on the status of the hosts themselves, and also to arrange the processes in various other ways that are useful to us and make statements about those collections.

My next experimental step was to allow enabling and disabling of processes by adding a */enabled uri for each process. When enabled it would return text/plain "true". When disabled it would return text/plain "false". A PUT operation would change this state and cause the process to be started or stopped. I was hoping I'd be able to access this via an HTML form, but urg... no luck there. I had to add a POST method to the process itself with an "enabled=true" uri-encoding. Not nice, but together they're workable for now.
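
To sketch the two styles (the URIs and values here are only illustrative), disabling a process by PUT looks something like this, with the form-friendly POST fallback after it:

PUT /processes/myprocess/enabled HTTP/1.1
Host: my.server
Content-Type: text/plain

false

POST /processes/myprocess HTTP/1.1
Host: my.server
Content-Type: application/x-www-form-urlencoded

enabled=false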

Now we're at the point where I ask the question: How do I find and represent the list of processes? I ask, "How do I navigate to the Host URI associated with this process?". I ask, "How do I know what to append to find the enabled URI?".

I have been returning pretty basic stuff. If my data is effectively a string (i.e. a datum encoded using XSD data type rules) I've been returning a text/plain body containing that data. If the data is more complicated and needs an XML representation, I've been returning application/xml with the XML document as the body. In my HMI application I typically map those strings and XML elements onto the properties of java beans, or onto my own constructions that map their data eventually onto beans. The expected data format is therefore pretty well known to my application and doesn't need much in the way of explicit schema declaration. The URIs are also explicitly encoded into the XML page definitions that go into the HMI. If I start to look outside the box, though, particularly to debugging- or browsing-style applications that might exist in the future, I want to be able to find my data.

As I was working through the problem, I started to understand for the first time where the XLink designers were coming from. My fingers were aching to just type something like

<Processes type="linkset"><a href="foo"/><a href="bar"/></Processes>

and be done with it. XLink is dead, though, and apparently with good reason... so it starts to look like RDF is the "right way to do it".

Rethinking the REST web service model in terms of RDF is an interesting approach, and one I feel could work fairly nicely. I'm still thinking in terms of properties. If I had an object of class foo with property bar, then I could write the following fairly easily:

<foo rdf:about=".">
<foo.bar>myvalue</foo.bar>
</foo>

That's almost identical to the verbose form of the XML structures I'm using right now (I would currently put myvalue into foo/@bar to reduce the verbosity). In this way, the content of each URI would be the rdf about that URI. If this were backed by a triplestore, you might simply list all relationships that the URI has directly in this response body.

It seems simple to produce an rdf-compatible hyperlinking solution for the GET side of things, so what about the PUT?

On first glance this looks simple, but in fact the PUT now needs more data than it did previously. What I really want to do is to PUT an enabled assertion between my process and the literal "false". What do I put, exactly? Perhaps something like this:

PUT /processes/myprocess/https://my.company/myNamespace/myClass.myProperty HTTP/1.1
Host: my.server

<Literal>true</Literal>

You can see the difficulty. I need to encode the URI of the subject (https://my.server/processes/myProcess) and the URI of the predicate (https://my.company/myNamespace/myClass.myProperty). Finally I need to encode the object, clearly identified as a URI, a literal, or a new RDF instance with its own set of properties.

Another thing you need to do is work out the semantics of the PUT operation as well as the POST operation. In the truest HTTP sense it is probably sensible for PUT to attempt to overwrite any existing assertions on the object with the same predicate, while POST would seek to accumulate a set of assertions by adding to rather than overwriting earlier statements.

There is another question unanswered in all of this. If I have a piece of RDF relating to a specific URI, what do I have to do to get more information about it? Sometimes you'll be able to dereference the URI and find more RDF. Sometimes you'll get a web page or some other resource, and if you're lucky you'll find rdf at a ".rdf"-extension variant of the filename. Sometimes you'll find nothing at the link. Shouldn't these options be formalised somewhere? I don't think it's possible to write an rdf "crawler" otherwise... or the source RDF document must point to both the target rdf and the related object. In other questions arising from this line of thought: "Is there a standard way to update RDF in a REST way? If so, does it work from a web browser with simple web forms?"

The web browser is becoming my benchmark of how complicated a thing is allowed to be. If you can't generate that PUT from a web form, maybe you're overthinking the problem. If you can browse what you've created happily in mozilla, perhaps you need to simplify it.

Links:

Benjamin

Sun, 2005-Apr-03

Wouldn't it be nice to have one architecture?

We seem to be evolving towards a more open model of interaction between software components. Client and server are probably speaking HTTP to each other to exchange data in the form of XML, or some other well-understood content. Under the REST architectural style the server is filled with objects with well-known names as URIs. Clients GET, and sometimes PUT but will probably POST to the server.

The URI determines five things. The protocol looks something like https://. The host and port number look something like my.host:80, or maybe my.host:http. The object name looks like /foo/bar/baz, and the query part looks like ?myquery. That's fine for http over a network with a well-known host and port name. I think it might fall down a little in more tightly-coupled environments.

Let's take it down a notch from something you talk to over the internet to something on the local LAN. A single server might offer multiple services: perhaps it could provide not just regular web data to clients but also information about system status, such as the state of various running processes. Perhaps it has an application that provides access to time data to replace the old unix time service. Perhaps it has an application to provide a quote of the day, or a REST version of SMTP. The server is left with an unpleasant set of options. It can let the programs run independently, each opening distinct ports in the classic UNIX style. The client must then know the port numbers, and needs to negotiate them out of band (IANA has helped us do this in the past). If that's no good, and you want to have a single port open to all of these applications, you start to introduce coupling. You either operate like inetd or a cgi script and exec the relevant process after opening the connection, or you make all of your processes literally part of the one master process using servlets.

Not so bad, you say. There are still options there, even if the traditional web approach and the traditional unix approaches differ. You can even argue that they don't differ and that unix only ever intended to open different ports when different protocols are in use. We've now agreed on a simple standard protocol that everyone can understand the gist of, even if you need to know the XML format being transported intimately to actually extract useful data out of the exchange.

In a way, the REST concept introduces many more protocols than we are used to dealing with. Like other advances in architecture development it takes out the icky bits and says: "Right! This is how we'll do message exchange". It then leaves the content of the messages down to individual implementors of problem domains to work out for sure. It builds an ecosystem with a common foundation rather than trying to design a city all in one go.

Anyway, back to the options. When you have multiple applications within a single server the uncoupled options look grim. How do I let my client know that the service they want is available on port 8081? DNS allows me to map my.host to an IP address, but does not cover the resolution of port identifiers. That's left to explicit client-side knowledge, so a client can only reasonably query https://my.host:dict/ if we have previously agreed that dict should appear in their /etc/services file. It's much more likely that we can agree to a URI of https://my.host/dict on the standard HTTP port of 80.

This leaves us with the options of either having an inetd-equivalent process starting a new process for each connection made to the server, or making the application a servlet. The first option is unsatisfying because it doesn't allow a single long-running program to serve the data, and we need to introduce other interprocess communication mechanisms such as shared memory if forked instances of the same process want to share or distribute processing. You can see this conflict in an application like SAMBA. You get a choice between executing via inetd for simplicity and ease of administration or executing as standalone processes for improved performance. The second option is to me fairly unsatisfying because it introduces coupling between otherwise unrelated applications. In fact, there's a third option. You could have the server process answer the queries by itself, querying back-end applications in weird and wonderful ways. That approach is limited because the server may become both a bottleneck and a single point of failure. When all of the data in your system flows through a single process... well... you get the point.

You can see where I'm headed. If I'm uncomfortable with how you would offer a range of different services in a small LAN scenario, imagine my disquiet over how applications should talk to each other within the desktop environment!

I think the REST architecture remains sound. You really want to be able to identify objects, some of which may be applications... others of which may represent your files or other data. You want to be able to send a request that reads something like local://mydesktop/mytaxreturn.rdf?respondwith=taxable-income. There's some sensitive data in this space, so you may feel as I do that opening network sockets to access it is a bit of a worry. Even opening a port on 127.0.0.1 may allow other users of the current machine to access your data. A unix named pipe might work, but may not be portable outside of the unix sphere and may be hard to specify in the URL. After all, how do you say "speak http to file:///home/me/desktopdata, and request the tax return uri you find there"? You also start running into the set of options for serving your data that you had with the small LAN server. How do you decouple all of the services behind the access-point name in your URI?

So, let's start again and try to abstract that REST architecture. To me it appears decomposable into the following applications:

  1. A client with a request verb and a URI including protocol, access point, and object identifier
  2. An access point broker that can interpret the access point specification and return a file descriptor
  3. A server with a matching URI

It seems that DNS is a fine access point broker for web servers that all live on the same port. An additional mechanism might still be useful for discovering the port number to connect to by name when multiple uncoupled services are on offer. A new access point broker would be needed for the desktop. A new URI encoding scheme might be avoidable if the access broker is able to associate a particular named pipe with a simpler name such as "desktop:kowari", making a whole address look like https://desktop:kowari/mydatabase. Clients would need to be updated to talk to the appropriate access point provider, which I suggest would have to be provided through a shared library like the one we currently use with DNS. Servers would need to open named pipes instead of network sockets, and may need additional protocol to ensure one file descriptor is created per local "connection".

The definition of the access point is interesting in and of itself. What happens when access point data changes? Can that information be propagated to clients so they know to talk to the new location rather than the old? Can you run the same service redundantly, so that when one fails the information of the second instance is propagated and clients fail over without spending more than a few seconds thinking about it?

REST is an interesting model for coordinating application interactions. It seems to work well in the loosely-coupled, large-scale environments it was developed for. I'd like to see it work on the smaller scale just as well, and to see the difference made transparent to both client and server.

Benjamin

P.S. Is it just me, or is there no difference between SOAP over HTTP and REST POST? In fact, it seems to me that an ideal format for messages to and from the POST request could be SOAP. Am I missing some things about REST? I think I understand the GET side fine, but the POST I'm really not sure about...

Sun, 2005-Mar-27

A second stab at HTTP subscription

A few weeks back I put forward my first attempt at doing HTTP subscription for use in fast-moving REST environments. I wasn't altogether happy with it at the time, and the result was a bit half-arsed... but this is blogging after all.

I thought I'd give it another go, based on the same set of requirements as I used last time. This time I thought I'd try and benefit clients and servers that don't use subscription as well as those that do.

One of the major simplifying factors in HTTP has been the concept of a simple request-response pair. Subscription breaks this model by sending multiple responses for a single request. I have called this concept in the past a "multi-callback", which usually indicates a single response with a marker that it is not the last one the client should expect to receive. In its original incarnation HTTP performed each exchange over its own TCP/IP connection, increasing overhead but again promoting simplicity. In HTTP/1.1 the default behaviour became to "hang on" to a connection and to allow pipelining (multiple requests sent before any response is received) to reduce dependence on a low-latency connection for high performance. Without pipelining it takes at least n times the latency to make n requests.

One restriction that can still affect HTTP pipelining performance is the requirement HTTP has to return all responses in the order the requests were made. This may be fine when you're serving static data such as files, but if you are operating as a proxy to a legacy system you may have to make further requests to that system in order to fulfil the HTTP request. In the meantime, other requests that could be made in parallel to the legacy system could either be backing up in the request pipe, or could have been completed but be waiting on a particularly slow legacy system response to be returned via HTTP before they themselves can be.

This brings me to my first recommendation: Add a request id header. Specifically,

14.xx Request-Id

The Request-Id field is used both as a request-header and a response-header. When used as a request-header it provides a token the server SHOULD return in its response using the same header name. If a Request-Id is supplied by a client the client MUST be able to receive the response out of order, and the server MAY return responses to identified requests out of order. Fairness algorithms SHOULD be used on the server to ensure every identified request is eventually dealt with.

A client MUST NOT reuse a Request-Id until its transaction with the server is complete. A SUBSCRIBE request MUST include a Request-Id field.
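
A sketch of how that might play out on the wire (the resources are invented): two pipelined requests, answered out of order because the first is slow to service:

GET /slow-legacy-report HTTP/1.1
Host: example.com
Request-Id: 1

GET /static-page HTTP/1.1
Host: example.com
Request-Id: 2

HTTP/1.1 200 OK
Request-Id: 2
...

HTTP/1.1 200 OK
Request-Id: 1
...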

I've been toing and froing over this next point. That is, how do we identify that a particular response is not the end of a transaction? Initially I said that a 1xx series response code should be used, but that has its problems. For one, the current HTTP/1.1 standard says that 1xx series responses can't ever have bodies. That's maybe not the final nail in the coffin, but it doesn't help. The main reason I'm wavering from the point, though, is that a 1xx series response just isn't very informative.

Consider the case where a resource is temporarily 404 (Not Found), but the server is still able to notify a client when it comes back into existence as a 200 (OK). The subscription should be able to convey that kind of information. I've therefore decided to reverse my previous decision and make the subscription indicate its non-completeness through a header. This has some precedent with the "Connection: close" header used to indicate that an HTTP/1.1 server will close the connection after the current response rather than hold it open.

Therefore, I would add something like the following:

14.xx Request

The Request general-header field allows the sender to specify options that are desired for a particular request over a specific connection and MUST NOT be communicated by proxies over further connections.

The Request header has the following grammar:

       Request = "Request" ":" 1#(request-token)
       request-token  = token

HTTP/1.1 proxies MUST parse the Request header field before a message is forwarded and, for each request-token in this field, remove any header field(s) from the message with the same name as the request-token. Request options are signaled by the presence of a request-token in the Request header field, not by any corresponding additional header field(s), since the additional header field may not be sent if there are no parameters associated with that request option.

Message headers listed in the Request header MUST NOT include end-to-end headers, such as Cache-Control.

HTTP/1.1 defines the "end" request option for the sender to signal that no further responses to the request will be sent after completion of the response. For example,

       Request: end

in either the request or response header fields indicates that the SUBSCRIBE request transaction is complete. It acts both as a means for a server to indicate SUBSCRIBE transaction completion and for a client to indicate a subscription is no longer required.

A system receiving an HTTP/1.0 (or lower-version) message that includes a Request header MUST, for each request-token in this field, remove and ignore any header field(s) from the message with the same name as the request-token. This protects against mistaken forwarding of such header fields by pre-HTTP/1.1 proxies.
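
Putting the two headers together, a subscription under this draft might run something like the following (the URI and values are illustrative). Each response carries the subscription's Request-Id, and the final one carries the "end" request option:

SUBSCRIBE /processes/myprocess/enabled HTTP/1.1
Host: example.com
Request-Id: 7

HTTP/1.1 200 OK
Request-Id: 7
Content-Type: text/plain

true

HTTP/1.1 200 OK
Request-Id: 7
Content-Type: text/plain

false

HTTP/1.1 200 OK
Request-Id: 7
Request: end
Content-Type: text/plain

false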

Benjamin

Sun, 2005-Mar-13

Where does rdf fit into broader architecture?

I've been pondering this question lately. RDF is undoubtedly novel and interesting. Its possibilities are cool... but where and how will it be used if it is ever to escape the lab in a real way? What are its use cases?

I like to think that sysadmins are fairly level-headed people who don't tend to get carried away by the hype, so when I read this post by David Jericho I did so with interest. David wants to use RDF, with the Kowari engine as a back end. In particular, he wants to use it as a kind of data warehousing application which he can draw graphs from and add ad hoc data to as time goes on. As Andrew Newman points out in a late 2004 blog entry, RDF when used in this way can be an agile database.

That doesn't seem to be RDF's main purpose. The extensive use of URIs gives it away pretty quickly that RDF is designed for the web, not the desktop or the database server. It's designed as essentially a hyperlinking system between web resources. RDF is not meant to actually contain anything more complicated than elementary data along with its named reversible hyperlinks. RDF schema takes this up a notch by stating equivalences and relationships between the predicate types (hyperlink types) so that more interesting relationships can be drawn up.

But how does this fit into the web? The web is a place of resources and hyperlinks already. The hyperlinks don't have a type. They always mean "look here for more detailed information". Where does RDF fit?

Is RDF meant to help users browse the web? Will a browser extract this information and build some kind of navigation toolbar to related resources? Wouldn't the site's author normally prefer to define that toolbar themselves? Will it help us to write web applications in a way that snippets of domain-specific XML wouldn't? Is it designed to help search engines infer knowledge about sites, knowledge that has no direct bearing on users' use of the site itself? One suspects that technology that doesn't directly benefit the user will never be used.

So let's look back at the database side. The RDF Primer says "The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web." So maybe it isn't really meant to be part of the web at all. The web seems to have gotten along fine without it so far. Perhaps it's just a database about the web's resources. Such databases might be useful to share around, and in that respect it is useful to serialise them and put them onto the web but basically we're talking about rdf being consumed by data mining tools rather than by tools individuals use to browse the web. It's not about consuming a single RDF document and making sense of it. It's more about pulling a dozen RDF documents together and seeing what you can infer from the greater whole. Feed aggregators such as planet humbug might be the best example so far of how this might be useful to end users, although this use is also very application-specific and relies more on XML than RDF technology.

So, we're at the point where we understand that RDF is a good serialisation for databases about the web, or more specifically about resources. It's a good model for those things as well, or will be once the various competing query languages finally coalesce. It has its niche between the web and the regular database... but how good would it be at replacing traditional databases, just as David wants to do?

You may be aware that I've been wanting to do this too. My vision isn't as grand in scope as "a database about the web". I just want "a database about the desktop", and "a database about my finances". What I would like to do is to put evolution's calendar data alongside my account data, and alongside my share price data. Then I'd like to mine all three for useful information. I'd like to be able to record important events and deadlines from the accounting system into the evolution data model. I'd like to be able to pull them both together and correlate a huge expense to that diary entry that says "rewire the house". RDF seems to be the ideal tool.

RDF will allow me to refer to web resources. Using relative URIs with an RDF/XML serialisation seems to allow me to refer to files on my desktop that are outside of the RDF document itself, although they might not be readily transferable. Using blank nodes or URIs underneath that of the RDF document's URI we can uniquely identify things like transactions and accounts (or perhaps it's better to externalise those)... but what about the backing store? Also, how do we identify the type of a resource when we're not working through http to get the appropriate mime type? How does the rdf data relate to other user data? How does this relate to REST, and is REST a concept applicable to desktop applications just as it is applicable to the web?

In the desktop environment we have applications, some of which may or may not be running at various times. These applications manage files (resources) that are locally accessible to the user. The files can be backed up, copied, and moved around. That's what users do with files. The files themselves contain data. The data may be in the form of essentially opaque documents that only the original authoring application can read or write. They may be in a common shared format. They may even be in a format representable as RDF, and thus accessible to an RDF knowledge engine. Maybe that's not necessary, with applications like Beagle that seem to be able to support a reasonable level of knowledge management without any explicit need for data homogeneity. Instead, Beagle uses drivers to pull various files apart and throw them into its index. Beagle is focused on text searching, which is not really RDF's core concern... I wonder what will happen when those two worlds collide.

Anyway. I kind of see a desktop knowledge engine working the same way. Applications provide their data as files with well-known and easy to discern formats. A knowledge management engine has references to each of the files, and whenever it is asked for data pertaining to a file it first checks to see if its index is up to date. If so, it uses the cached data. If not, it purges all data associated with the file and replaces it with current file content. Alternatively, the knowledge manager becomes the primary store of that file and allows users to copy, backup, and modify the file in a similar way to that supported by the existing filesystem.

I think it does remain important for the data itself to be grouped into resources, and that it not be seen as outside the resources (files) model. Things just start to get very abstract when you have a pile of data pulled in from who-knows-where and are trying to infer knowledge from it. Which parts are reliable? Which are current? Does that "Your bank account balance is $xxx" statement refer to the current time, or is it old data? I think people understand files and can work with them. I think a file or collection paradigm is important. At the same time, I think it's important to be able to transcend file boundaries for query and possibly for update operations. After all, it's that transcending of file boundaries by defining a common data model that is really at the heart of what RDF is about.

<sigh>, I started to write about the sad state of triplestores available today. My comments were largely sniping from the position of not having an rdf sqlite equivalent. I'm not sure that's true after some further reading. It's clear that Java has the most mature infrastructure around for dealing with rdf, but it also seems that the query languages still haven't been agreed on and that there are still a number of different ways people are thinking about rdf and its uses.

Perhaps I'll write more later, but for now I think I'll just leave a few links lying around to resources I came across in my browsing:

I've been writing a bit of commercial Java code lately (in fact, my first line of Java was written about a month ago now). That's the reason I'm looking at Kowari again. I'm still not satisfied with the closed-source nature of existing java base technology. kaffe doesn't yet run Kowari, as it seems to be missing relevant nio features, and it can't even run the iTQL command-line given that it is missing swing components. I don't really want to work with something like Kowari until that is ironed out, but if I'm ever going to get back to writing my accounting app the first step will be to throw the current prototype out and start again with a more RDF-oriented back end. I'm concerned about the network-facing (rather than desktop-facing) nature of Kowari and am still not convinced that it will be appropriate for what I want. I would prefer a set of libraries that allow me to talk to the file system, rather than a set that allows me to talk to a server. Opening network sockets to talk to your own applications on a multi-user machine is asking for trouble, and not just security trouble.

Given what I've heard about Kowari, though, I'm willing to keep looking at it for a while longer. I've even installed the non-free java sdk on my debian PC for the purpose of research into this area. If the non-free status of the supporting infrastructure can be cleaned up, perhaps Kowari could still do something for me in my quest to build desktop knowledge infrastructure. On the other hand, I may still just head back to python and Redland with an sqlite back-end.

Benjamin

Sat, 2005-Mar-05

A RESTful subscription specification

Further to my previous entry on the subject, this blog entry documents a first cut at how I would update rfc2616 to support restful subscription. This is a quick hack update, and not a thorough workthrough of the original document.

The summary:

The detail:

  1. Add SUBSCRIBE and UNSUBSCRIBE methods to section 5.1.1
  2. Add SUBSCRIBE and UNSUBSCRIBE methods to the list of "Safe Methods" in 9.1.1
  3. Add section "9.10 SUBSCRIBE" with the following text:

    The SUBSCRIBE method means retrieve and subscribe to whatever information (in the form of an entity) is identified by the Request-URI. If the Request-URI refers to a data-producing process, it is the produced data which shall be returned as the entity in the response and not the source text of the process, unless that text happens to be the output of the process.

    A response to SUBSCRIBE SHOULD match the semantics of GET. In addition to the GET semantics, a successful subscription MUST establish a valid subscription. The subscription remains valid until an UNSUBSCRIBE request matching the successful SUBSCRIBE url is successfully made, until the server returns a 103 (SUBSCRIBE cancelled) response, or until the connection is terminated. A server with a valid subscription SHOULD return changes to URL content immediately using a 102 (SUBSCRIBE update) response, but may delay responses according to flow control or server-side decisions about the priority of subscription updates as compared to regular response messages. Whenever a 102 (SUBSCRIBE update) response is returned it SHOULD represent the most recent URL data state. Data MAY be returned as a difference between the current and previously-returned URL state if client and server can agree to do this out of band. An Updates-Missed header MAY be returned to indicate the number of undelivered subscription updates.

    A SUBSCRIBE request made to a URL for which a subscription is already valid SHOULD match the semantics of GET, but MUST NOT establish a new valid subscription.

    The response to a SUBSCRIBE request is cacheable if and only if the subscription is still valid. Updates to the subscription MUST either update the cache entry or cause the client to treat the cache entry as stale.

  4. Add section "9.11 UNSUBSCRIBE" with the following text:

    The UNSUBSCRIBE method means cancel a valid subscription. A server MUST set the state of the selected subscription to invalid. A client MUST either continue to process 102 (SUBSCRIBE update) responses for the URL as per a valid subscription, or ignore 102 (SUBSCRIBE update) responses. A successful unsubscription (one that marks a valid subscription invalid) SHOULD return a 200 (OK) response.

  5. Add section "10.1.3 102 SUBSCRIBE update" with the following text:

    A valid subscription existed at the time this response was generated on the server side, and the resource identified by the subscription URL may have a new value. The new value is returned as part of this response.

    This response should not be assumed to be associated with an in-sequence request, and may be returned when no request is outstanding.

  6. Add section "10.1.4 103 SUBSCRIBE cancelled" with the following text:

    A valid subscription existed at the time this response was generated on the server side, but the server is no longer able or willing to maintain the subscription. The subscription MUST be marked invalid on the client side.

    This response should not be assumed to be associated with an in-sequence request, and may be returned when no request is outstanding.

  7. Add section "14.48 Updates-Missed" with the following text:

    The Updates-Missed header MAY be included in 102 (SUBSCRIBE update) response messages. If included, it MUST contain a numeric count of the missed updates.

           Updates-Missed = "Updates-Missed" ":" 1*DIGIT
    

    An example is

           Updates-Missed: 34
    

    This response should not be assumed to be associated with an in-sequence request, and may be returned when no request is outstanding.
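
To illustrate how I picture the pieces fitting together (the URI and values are invented), a subscription under this scheme might look like:

SUBSCRIBE /processes/myprocess/enabled HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Content-Type: text/plain

true

HTTP/1.1 102 SUBSCRIBE update
Content-Type: text/plain

false

HTTP/1.1 102 SUBSCRIBE update
Updates-Missed: 2
Content-Type: text/plain

true

UNSUBSCRIBE /processes/myprocess/enabled HTTP/1.1
Host: example.com

HTTP/1.1 200 OK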

I guess the question to ask is whether or not subscription is a compatible concept with REST. I say it is. We're still using URLs and resources. We still have a limited number of request commands that can be used for a wide variety of media types. Essentially, all I'm doing is making a more efficient meta-refresh tag part of the protocol. It's part of what makes some of the proprietary protocols I've used in the past efficient and scalable. It's particularly something that helps servers deal gracefully with killer load conditions. You have a fixed number of clients making requests down a fixed number of TCP/IP connections. When the rate of requests goes up and the server starts to hurt, it simply increases the response time. No extra work is put on the system in times of stress. The server works as fast as it can, and any missed updates simply get recorded as such.

In a polling-based system things tend to degrade. You don't know whether the new TCP/IP connection is an old client or a new one, so you can't decide how to allocate resources between your clients. Even if they are using persistent connections, they keep asking you for changes when they haven't happened yet! If we're to see a highly-dynamic web I'm of the opinion that subscription needs to become part of it at some point.

Why not head for the WS-* stack? Well, I think the question answers itself... but in particular the separation of a client request from the connection it's using to maintain that subscription makes the whole question of whether the subscription is still valid hard to assess. When it's not clear whether a subscription is up or down, time is wasted on both sides. My approach is simple and the broad conceptual framework has been proven in-house (although it hasn't been applied to http in-house just yet).

On another note, I was surprised to see the lack of high-performance http client interfaces available in Java. I was hoping to be able to make use of request pipelining to improve throughput where I'm gathering little pieces of data from a wide variety of URL sources on a display. There's just very little around that doesn't require a separate TCP/IP connection for each request, and usually a separate thread as well. When you're talking about a thousand requests on a display, and possibly dozens of HMIs that want to do this simultaneously... well, the server starts to look sick pretty quickly...

Benjamin

Sun, 2005-Feb-27

My wishlist RESTful subscription protocol

As part of the HMI prototyping work I've been doing lately I've also put together what I've called the "System Interface" prototype. Essentially, it's a restful way to get at the system's data.

It's probably going to be the most enduring part of any HMI development that actually takes place, as currently the only way to talk to our system is pretty much by linking against our C++ libraries. That's not impossible, but not straightforward for a Java HMI. If you want to write code from another perspective, you're pretty much toast.

So, what does a system interface look like? The principle I've used so far is REST. It keeps it simple, and leaves me in control of the vocabulary that is spoken back and forth between the system and any clients. My rough picture is that you do an HTTP GET (possibly with a query part in the URL) to access the current value of data, an HTTP PUT to set the value of some data, and an HTTP POST to ask the piece of data you've identified to do something for you.
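
To make that a little more concrete, here is the sort of exchange I have in mind, sketched with Python's standard http.client module. The host, port, resource paths and payloads are all invented for illustration; the real vocabulary is whatever the system interface ends up defining.

    import http.client

    conn = http.client.HTTPConnection("scada-server", 8080)

    # GET: read the current value of a piece of system data
    conn.request("GET", "/points/pump7/state")
    print(conn.getresponse().read())

    # PUT: set the value of some data
    conn.request("PUT", "/points/pump7/setpoint", body="42",
                 headers={"Content-Type": "text/plain"})
    conn.getresponse().read()

    # POST: ask the identified piece of data to do something for us
    conn.request("POST", "/points/pump7/commands",
                 body="<command>reset</command>",
                 headers={"Content-Type": "application/xml"})
    conn.getresponse().read()

    conn.close()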

What this doesn't give me is subscription.

When system data changes, it is important that operators are made aware of the change quickly so they can react to it. Our current system uses a proprietary protocol to perform this function, and I want to know whether any existing protocol is going to help me do this in a more standards-compliant way. If none does exist, then perhaps some hints on what I should use as a basis would be useful.

So... here is my model of how the protocol should work:

My list is a little prescriptive, but it is born of experience. TCP/IP connections between client and server can be an expensive commodity once you get to a few hundred HMI stations, each asking a single server for subscriptions to a thousand values to display. They should be reused. The only thing I've missed is a keep-alive, which the client end should probably be able to specify to ensure the current value is returned at least every n seconds. That way the client also knows when the server has gone silent.

One picture I have is that this extra protocol capability becomes a proprietary extension to HTTP, where a web browser can connect to our server and use the fallback "regular" GET operation. The response codes would have to be expanded. Each one must indicate the request it is associated with, and the traditional 200 series won't be sufficient for saying "This is a complete response you must process, but hang on the line and I'll give you further updates". Using the 100 series seems equally dodgy.

Another possibility is trying to get something going with the XMPP protocol, but that quickly heads into the land of "no web browser will ever go here". I really would like this to be backwards compatible with HTTP, at least until web browsers commonly support retrieval of documents by more message-oriented protocols.

Benjamin

Sat, 2005-Feb-26

RDF encoding for UML?

Paul Gearon asks

Is there a standard for serialised UML into RDF?

The standard encoding for UML 2.0 is XMI, but rdf schema already does a nice job at modelling some equivalent concepts to those of UML. The w3c has this to say:

Web-based and non-Web based applications can be described by a number of schema specification mechanisms, including RDF-Schema. The RDF-Schema model itself is equivalent to a subset of the class model in UML.

Here is an attempt to overcome the limitations of XMI by mapping UML to rdf more generally than is supported by rdf schema. Since rdfs is a subset of UML, this is "similar to defining an alternative RDF Schema specification".

Benjamin

Sat, 2005-Feb-26

CM Synergy

Martin Pool has begun a venture into constructing a new version control system called bazaar-NG. At first glance I can't distinguish it from CVS, or the wide variety of CM tools that have come about recently. It has roughly the same set of commands, and similar concepts of how working files are updated and maintained.

This is not a criticism, in fact at first glance it looks like we could be seeing a nice refinement of the general concepts. Martin himself notes:

I don't know if there will end up being any truly novel ideas, but perhaps the combination and presentation will appeal.

To the end of hopefully contributing something useful to the mix, I thought I would describe the CM system I use at work. When we initially started using the product it was called Continuus Change Management (CCM), and has since been bought by Telelogic and rebadged as CM Synergy. Since our earliest use of the product it has been shipped with a capability called Distributed Change Management (DCM), which has since been rebadged Distributed CM Synergy.

Before I start, I should note that I have seen no CM Synergy source code and have only user-level knowledge. On the other hand, my user-level knowledge is pretty in-depth given that I was build manager for over a year before working in actual software development for my company's product (it's considered penance in my business ;). At that time Telelogic's predecessor-in-interest, Continuus, had not yet entered Australia and we were being supported by another firm. This firm was not very familiar with the product, and for many years the CCM expertise in my company exceeded that of the support firm in many areas. Some of my detailed knowledge may be out of date, as I've been back in the software domain for a number of years.

CCM is built on an Informix database which contains objects, object attributes, and relationships. Above this level is the archive, which uses gzip to store object versions of binary types and a modified gnu rcs to store object versions of text types. Above this level is the cache, which contains extracted versions of all working-state objects and static (archived) objects in use within work areas. Working state objects only exist within the cache. The final level is the work area. Each user will have at least one, and that is where software is built. Under unix, the controlled files within the work area are usually symlinks to cache versions. Under windows, the controlled files must be copies. Object versions that are no longer in use can be removed from the cache by using an explicit cache clean command. A work area can be completely deleted at any time and recreated from cache and database information with the sync command. Arbitrary objects (including tasks and projects, which we'll get to shortly) can be transferred between CCM databases using the DCM object version transfer mechanism.

CCM is a task-based CM environment. That means that it distinguishes between the concept of a work area, and what is currently being worked on. The work area content is decided by the reconfigure activity, which uses reconfigure properties on a project as its source data. These properties name a baseline project and a set of tasks to apply (including working state and "checked-in" (static) tasks). The task set is usually determined by a set of task folders, which can be configured to match the content of arbitrary object queries.

Once the baseline project and set of tasks are determined by updating any folder content, the tasks themselves and the baseline project are examined. Each one is equivalent to a list of specific object versions. Starting at the root directory of the project, the most-recently-created version of that directory object within the task and baseline sets is selected. The directory itself specifies not object versions, but file-ids. The slots that these ids identify are filled out in the same way, by finding the most-recently-created version of the object within the task and baseline sets.
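
Sketched in Python, my user-level mental model of that selection looks something like the following. I haven't seen the real code, and the attribute names here are invented, so treat it as an illustration of the rule rather than a description of the implementation.

    def select(object_id, candidates):
        """Pick the most recently created version of object_id from the
        baseline-plus-tasks candidate set."""
        versions = [v for v in candidates if v.object_id == object_id]
        return max(versions, key=lambda v: v.create_time)

    def reconfigure(dir_object_id, candidates):
        """Walk the project from a directory object, filling each slot."""
        directory = select(dir_object_id, candidates)
        selected = {dir_object_id: directory}
        for slot_id in directory.member_ids:   # directories name ids, not versions
            member = select(slot_id, candidates)
            if member.is_directory:
                selected.update(reconfigure(slot_id, candidates))
            else:
                selected[slot_id] = member
        return selected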

So, this allows you to be working on multiple tasks within the same work area. It allows you to pick up tasks that have been completed by other developers but not yet integrated into any baseline and include them in your work area for further changes. The final and perhaps most important thing it allows you to do is perform a conflicts check.

The conflicts check is a more rigorous version of the reconfigure process. Instead of just selecting the most-recently-created object version for a particular slot, it actively searches the object history graph. This graph is maintained as "successor" relationships in the Informix database. If the graph analysis shows that any of the objects named by the baseline or task set are not predecessors of the selected objects then a conflict is declared. The user typically resolves this conflict by performing a merge between the two branched versions using a three-way diff tool. Conflicts are also declared if part of a task is included "accidentally" in a reconfigure. This can occur if you have a task A and a task B where B builds on A. When B is included but A is not, some of A's objects will be pulled into the reconfigure by virtue of being predecessors of "B" object versions. This is detected, and the resolution is typically either to pull A in as well or to remove B from the reconfigure properties.
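
The predecessor test itself is just reachability in the history graph. Again, the attribute names below are invented and this is only my reading of the behaviour, not the real implementation:

    def is_predecessor(candidate, version):
        """True if candidate appears somewhere in version's history."""
        seen, stack = set(), [version]
        while stack:
            v = stack.pop()
            for parent in v.predecessors:      # the successor graph, walked backwards
                if parent is candidate:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return False

    def conflicts(selected, named_versions):
        """Versions named by the baseline or tasks that lost out in the
        selection and are not in the selected version's history."""
        return [v for v in named_versions
                if v is not selected and not is_predecessor(v, selected)]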

The conflicts check is probably the most important feature of CCM from a user perspective. Not only can you see that someone else has clobbered the file you're working on, but you can see how it was clobbered and how you should fix it. On the other side, though, is the build manager perspective. Task-based CM makes the build manager role somewhat more flexible, if not actually easier.

The standard CCM model assumes you will have user work areas, an integration work area, and a software quality assurance work area. User work areas feed into integration on a continuous or daily basis, and every so often a cut of the integration work area is taken as a release candidate to be formally assessed in the slower-moving software quality assurance work area. Each fast-moving work area can use one of the slower-moving baselines as its baseline project (work area, baseline, and project are roughly interchangeable terms in CCM). Personally, I only used an SQA build within the last few months or weeks of a release. The means of delivering software to be tested by QA is usually a build, and you often don't need an explicit baseline to track what you gave them in earlier project phases.

One way we're using the CCM task and projects system at my place of employment is to delay integration of unreviewed changes. Review is probably the most useful method for validating design and code changes as they occur, whether it be document review or code review. Anything that hasn't been reviewed isn't worth its salt, yet. It certainly shouldn't be built on top of by other team members. So what we do is add an approved_by attribute to each task. While approved_by is None, it can be explicitly picked up by developers if they really need to build upon it before the review cycle is done... but it doesn't get into the integration build (it's excluded from the folder query). When review is done, the authority who accepts the change puts their name in the approved_by field, and either that person or the original developer does a final conflicts check and merge before the nightly build occurs. That means that work is not included until it is accepted, and not accepted until it passes the conflicts check (as well as other checks such as developer testing rigour). In the meantime other developers can work on it if they are prepared to have their own work depend on the acceptance of the earlier work. In fact, users can see and compare the content of all objects, even working state objects that have not yet been checked in. That's part of the beauty of the cache concept, and the idea of checking out objects (and having a new version number assigned to the new version) before working on them.

I should note a few final things before closing out this blog entry. Firstly, I do have to use a customised gnu make to ensure that changes to a work area symlink (ie, selection of a different file version) always cause a rebuild. It's only a one-line change, though. Also, CCM is both a command-line utility and a graphical one. The graphical version makes merging and understanding object histories much easier. There is also a set of Java GUIs which I've never gotten around to trying. Telelogic's Change Synergy (a change request tracking system similar in scope to bugzilla) is designed to work with CCM, and should reach a reasonable level of capability in the next few months or years, but is currently a bit snafued. I haven't gone into any detail about the CCM object typing system or other aspects for which there are probably better solutions these days anyway. I also haven't covered project hierarchies, or controlled products, which have a few interesting twists of their own.

Benjamin

Sat, 2005-Feb-19

Aliens don't use XML

According to Sean McGrath, aliens don't use XML. He says they have separate technology stacks for dealing with tabular data, written text data, and relational data. I wonder, then, what the aliens do when they want to mix their data? :)

Perhaps a non-XML alternative for data representation will re-emerge at the cutting edge some time in the future, but the new format will still have to address the same issues of mixing data. CSV++ would have to find a way to embed or uniformly refer to XHTML++ and N3++ data. XHTML++ and N3++ would need similar embedding.

For the time being, XML with namespaces looks like it will hold the top spot in being able to both define the structure of content and identify its correct interpretation.

Sun, 2005-Feb-13

To disable Internet Explorer

The HUMBUG mailing lists have recently been a-buzz with talk of Suncorp-Metway's apparent pro-IE+Windows and anti-Mozilla+Unix stance in its online terms and conditions. These conditions were later clarified in this follow-up, in a form that I hope will stand up in court but personally have my doubts about.

Greg Black linked to this article which contained a comment by Bill Godfrey:

I keep IE hanging around, but I have the proxy server set to 0.0.0.0 and I make exceptions in the no-proxy-for list.

I've now adopted this policy on my work machine, although I've set my proxy to localhost (127.0.0.1) instead of 0.0.0.0 as I consider this a touch safer. Since IE and Mozilla have distinct proxy settings this prevents IE and its variants from accessing remote sites while allowing me to roam freely under Firefox. This is particularly important for me because Lotus Notes explicitly embeds IE for its internal web browsing and has a habit of doing things like looking up IMG tags from spam. Hopefully this will put a stop to that.

Sat, 2005-Feb-12

XP and Test Reports

Adrian Sutton writes about wanting to use tests as specifications for new work to be done:

You need to plan the set of features to include in a release ahead of time so you wind up with a whole heap of new unit tests which are acting as this specification that at some point in the future you will make pass.  At the moment though, those tests are okay to fail.

It isn't crazy to want to do this. It's part of the eXtreme Programming (XP) model for rapid development. In that model, documentation is thrown away at the end of a unit of work or not produced at all. The focus is on making the code tell the implementation story and making the tests tell the specification story. In XP terms, what you're trying to do is not unit testing, but acceptance testing:

Customers are responsible for verifying the correctness of the acceptance tests and reviewing test scores to decide which failed tests are of highest priority. Acceptance tests are also used as regression tests prior to a production release.

Implicit in this model is the test score. In my line of work we call this the test report, and it must be produced at least once per release but preferably once per build. A simple test report might be "all known tests pass". A more complicated one would list the available tests and their pass/fail status.

Adrian continues,

The solution I'm trying out is to create a list of tests that are okay to have fail for whatever reason and a custom ant task that reads that list and the JUnit reports and fails the build if any of the tests that should pass didn't but lets it pass even if some of the to do tests fail.

If you start from the assumption that you'll produce a test report the problem of changes to the set of passed or failed tests can become a configuration management one. If you commit the status of your last test report and perform a diff with the built one during the make process you can break the build on any unexpected behaviour in the test passes and fails. In addition to ensuring only known changes occur to the report it is possible to track (and review) positive and negative impacts on the report numbers. All the developer has to do is check in a new version of the report to acknowledge the effect their changes have had (they're triggered to do this by the build breakage). Reports can be produced one per module (or one per Makefile) for a fine-grained approach. As a bonus you get to see exactly which tests were added/removed/broken/fixed at exactly which time, by whom, and who put their name against acceptance of the changed test report and associated code. You have a complete history.
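
The check itself can be tiny. Here is one way it might look as a Python script called from the Makefile; the file names are invented and your report format will dictate the details:

    import sys
    import difflib

    def read_lines(path):
        with open(path) as f:
            return f.readlines()

    expected = read_lines("test_report.committed")   # the version-controlled report
    actual   = read_lines("test_report.generated")   # produced by this build

    diff = list(difflib.unified_diff(expected, actual,
                                     fromfile="committed", tofile="generated"))
    if diff:
        sys.stdout.writelines(diff)
        sys.exit("Test report changed; check in the new report to acknowledge it.")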

This approach can also benefit other things that are generated during the build process. Keep a controlled version of your object dump schema and forever after no unexpected or accidental changes will occur to the schema. Keep a controlled version of your last test output and you can put an even finer grain on the usual pass/fail criteria (sometimes it's important to know two lines in your output have swapped positions).

Sat, 2005-Feb-12

On awk for tabular data processing

My recent post on awk one-liners raised a little more controversy than I had intended.

It was a response to Bradley Marshall's earlier post on how to do the same things in perl, and led him to respond:

I'm glad you can do it in awk - I never suspected you couldn't... I deal with a few operating systems at work, and they don't always have the same version of awk or sed installed... it was no easier or harder for me to read. Plus, with Perl I get the advantage of moving into more fully fledged scripts if I need to...

Byron Ellacott also piped up, with:

As shown by Brad and Fuzzy, you can do similar things with different tools that often serve similar purposes. So, here's the same one-liners using sed(1) and a bit of bash(1)... (Brad, I know you were just demonstrating an extra feature of perl for those who use it. :)

I knew that also, and perhaps should have been more explicit in describing the subtleties of why I responded in the first place. Firstly, I have a long and only semi-serious association with awk vs perl advocacy. My position was always that perl was filling a gap that didn't exist for my personal requirements. I seem to recall that several early jousts on the subject were with Brad.

To my mind, awk was desperately simple and suited most tabular data processing problems you could throw at it. My devil's advocate position was that anything too complicated to do in awk was also too complicated to do legibly in perl. Clearly the weight of actual perl users made this position shaky (if not untenable), but I stuck to my guns, and for the entire time I was at university and for several years afterwards I found no use for perl that couldn't be more appropriately implemented in another way.

Perl has advanced over the years, and while I still have no love for perl as a language, the existence of CPAN does make perl a real "killer app". Awk, with its lack of even basic "#include" functionality, will never stack up to the range of capabilities available to perl. On the other hand, bigger and better solutions are again appearing in other domains such as python, .NET, JVM-based language implementations and the like. I've had to learn small amounts of perl for various reasons over the years (primarily for maintaining perl programs) but I'll still work principally in awk for the things awk is good at.

So, when I saw Brad's post I couldn't resist. The one-liners he presented were absolutely fundamental awk capabilities. They were the exact use case awk was developed for. To present them in perl is like telling a lisp programmer that you need to do highly recursive list handling, so you've chosen python. It's a reasonable language choice, especially if you're already a user of that language. It's just that you have to push that language just a little harder to make it happen. It's not what the language was meant to do, it's just something the language can do.

I absolutely understand that you can do those things in other languages. I sincerely sympathise with Brad's "Awk? Which version of awk was that, again?" problem. I don't believe everyone should be using awk.

On the other hand, if you were looking for a language to do exactly those things in I would be happy to guide you in awk's direction. Given all the alternatives I still maintain that for those exact use-cases awk is the language that is most suitable. As for Brad's "with Perl I get the advantage of moving into more fully fledged scripts" quip, awk is better for writing full-fledged scripts than most people assume. So long as your script fits within the awk use case (tabular data handling) you won't have to bend over backwards to make fairly complicated transformations fly with awk. If you step outside that use-case, for example you want to run a bunch of programs and behave differently based on their return codes... well awk can still do that, but it's no longer what awk is designed for.

Benjamin

Sun, 2005-Feb-06

Awk one-liners

I felt I had to respond to this post by Bradley Marshall on perl one-liners. My position as an awk advocate is a long-suffering one, and one that could do with some updating :)

Given the following data set in a file:

foo:bar:baz

The following one-liner will pull it out in a useful fashion:

$ awk -F: '{print $2}' filea
bar

A neat extension is, given a dataset like:

1:2:3:4
5:6:7:8

You can use a one liner like the following:

$ awk -F: '{tot += $1};END{print tot}' fileb
6

I, for one, feel cleansed. Now wasn't that a touch easier and more legible? :) As always, see the awk manpage for more information...

Benjamin

Sat, 2005-Feb-05

The Common Development and Distribution License

OpenSolaris has been launched with the release of dtrace under the Common Development and Distribution License (CDDL), pronounced "cuddle" (although I always think "ciddle" when I see it). CDDL is an OSI-approved license. This should be a good thing, and is.

This blog entry covers some of the scope of controversy over CDDL, and Sun's decision to create it and to use it. I'll be reporting on discussions within the Groklaw, HUMBUG, and Debian organisations, but mostly I'll be linking to things I was interested to hear or things I agree with ;)

Groklaw was looking into the CDDL quite early in the piece. During December 2004 PJ wrote this article requesting feedback on the license as submitted to OSI for approval. After noting the license was not going to be GPL-compatible she put this message forward, front and center:

So what, you say? Other licenses are not (GPL-compatible) either. But the whole idea of Open Source is that it's, well, open. For GNU/Linux and Solaris to benefit each other, for example, they'd need to choose a licence that allows that cross-pollination. So Sun is letting us know that it is erecting a Keep Out sign as far as GNU/Linux is concerned with this license...

She goes on to quote Linus Torvalds who had earlier spoken to eWeek.com about CDDL to support her view.

By the 18th of December 2004 Sun had responded to Groklaw concerns regarding some parts of the license. PJ doesn't comment further on the pros and cons of the license at this time.

By the 26th of January 2005 OSI approval had been granted. It was time for PJ to get back on the bandwagon with this sort of sentiment:

Yes, they are freeing up 1600 patents, but not for Linux, not for the GPL world. I'm a GPL girl myself. So it's hard for me to write about this story on its own terms. I am also hindered by the fact that I've yet to meet a Sun employee I didn't like personally. But, despite being pulled confusingly in both those directions at once, in the end, I have to be truthful. And the truth is Sun is now competing with Linux. That's not the same as trying to kill it, but it's not altogether friendly either. Yet, at the teleconference, Sun said they want to be a better friend to the community. I feel a bit like a mom whose toddler has written "I LUV MOMMY" on the wall with crayons. Now what do I say?

A further 28th of January article highlights another possible technical issue with the CDDL arrangement, but expects the problem will be solved when the Contributors Agreement is drawn up.

After reading the Groklaw articles I had a number of things to think about, and wrote three emails to the humbug general mailing list. The first just pointed to the 26th of January 2005 Groklaw article. The second two were a bit more exploratory of the problem of GPL-incompatibility, and of what it means to create a new open source license. On the 30th of January 2005, I wrote:

I, for one, am not surprised by this release initially being under the CDDL only. It does seem like a reasonable license given the circumstances, just as the MPL did in the early days of mozilla. I think (and hope) that over time the open source experiment will prove beneficial to all parties and that dual-licensing under the GPL or LGPL will one day be possible. It does seem unlikely that the GPL camp will move too far from its position regarding compatibility after all this time. As the newcomer to open source, sun.com will eventually have to expose itself to the GPL if it is to maximise its community support and exposure. Eventually, I hope that this open source experiment leads to benefits to open source operating system development everywhere.

After some interesting followup by a resident Sun employee (who was not representing Sun in the conversation), I wrote a more concrete exploration piece covering the topics of whether opening Solaris would benefit Linux and of general open source license compatibility. I wrote:

In order to make CDDL and GPL compatible we have to look at both directions of travel. CDDL->GPL could be achieved by dual licensing of the software or dropping the CDDL in favour of something like LGPL or newBSD. Both options are probably unacceptable to Sun who wrote this license in order to protect itself and do other lawyerly things. On the flipside, GPL->CDDL is equally hair-raising. Linux code would similarly have to dual-license or use a weaker license. Would the CDDL terms be an acceptable release condition for Linux kernel software? Probably not, because CDDL allows combinations with closed-source code. That would allow the linux kernel to be combined with closed-source code also. The two licenses exist under two different ideologies and two different commercial realities. The licenses are reflective of that.

and this:

I suspect developers need to be careful about looking at Solaris code with a view to repeating what they've seen in Linux, and likewise linux developers may have to be careful about what they would contribute to OpenSolaris. Sure, it's fine to recontribute your own work but if you're copying someone else's work there may be issues. Like closed-source software, open-source software is not public domain and can't be copied from one place to the other without reference to the relevant licence agreements. When Microsoft code has leaked onto the internet in days past I seem to recall fairly strong statements that anyone who had seen such code would not be able to work on the same kinds of features in Linux as they had seen in the leaked code. There's too much of a risk that subconscious copying (or the perception of it) could lead to future legal difficulties.
...
Still, even if no copying or cross-pollination can occur at the code level the open sourcing of Solaris should bring the developers closer at the collaborative community level. From that perspective even with the GPL/OSI fracture we should all see some benefits from Sun's undeniably generous and positive actions.

More recently, I've read up on what debian-legal had to say about CDDL. Mostly it was "We shouldn't be spending any time thinking about this until someone submits CDDL code for inclusion into Debian", but some interesting opinions did come up. Personally, I trust Debian's take on what free software is more than I trust OSI's take on what open source software is. Despite striking similarity between Debian's Free Software Guidelines and OSI's Open Source Definition (the OSD is based on the DFSG) the two organisations seem to put different emphasis on who they serve. Debian appears very conservative in making sure that user and redistributor freedoms are preserved. I've never quite worked out whose freedoms OSI is intended to preserve, but I believe they have a more business-oriented slant. From my reading it seems that Debian's (notional) list of acceptable licenses is shorter than OSI's (official) list.

Two threads appeared on the debian-legal mailing list. One commented on the draft license, and the other on the OSI-approved license. I think the most pertinent entry from the former thread was this one, by Juhapekka Tolvanen, which states:

It probably fails the Chinese Dissident test, but I don't think that's a problem. The requirement to not modify "descriptive text" that provides attributions /may/ be a problem, but that'll depend on specific code rather than being a general problem...

Andrew Suffield elaborates, saying:

Is that license free according to DFSG?
Not intrinsically. Individual applications of it may be, with a liberal interpretation, or may not be, with a lawyer one. Notably it's capable of failing the Chinese Dissident test, and of containing a choice-of-venue provision. It also has a number of weasel-worded lawyer clauses that could be used in nasty ways...
Yeah, it's another of those irritating buggers. We'll have to analyse each license declaration that invokes this thing.

Followups in the later thread reinforce that none of the problems debian-legal had with the original draft appears to have shifted.

To close out this entry I'd like to bring the sagely words of Stuart Yeates from debian-legal to bear:

The CDDL is almost certainly better from pretty much every point of view (including that of the DFSG) than the current licences for Solaris. If you had no ethical problems with the old licences for Solaris, you're unlikely to have ethical problems with the CDDL.

As for the free software world's general acceptance of and participation in the CDDL, it is probably no worse than the Mozilla Public License or any number of other licenses that have appeared over time and been declared open source. Personally I won't be trusting any license that Debian doesn't support, but we won't find out whether that test is passed for quite some time yet (unless someone wants to try a dtrace port...)

New licenses are created because the developer can't accept the protections, guarantees, and restrictions of existing licenses under which to release their code. Lawyers who write these licenses deliberately make their license incompatible with other licenses in order to prevent their code being distributed under such unacceptable terms. In doing so, they prevent cooperation at the source level between themselves and anyone else. They create a barrier.

If I were Sun I'd want to be pretty damn sure that other people had the same view about existing licenses and saw their license as the perfect alternative before shutting out so much of the existing developer community. Regardless of your attitudes about what represents open source or free software, these barriers are not good. Every time a new license is written, somewhere a fairy dies. Please, think of the fairies.

Benjamin

Sat, 2005-Jan-29

XML. XForms. Schema.

Over the last month or so my colleagues and I have been going through a small modernisation process at work. This involves picking up a few new pieces of software that themselves have no dependencies and including them in our software base (and our safety cases). Much of the work is still at a prototype stage, and the goals are varied (one is the production of a new HMI).

We don't pick up new technology very often, and whenever we do it tends to be what others might consider old technology. I spent the last two days adding a custom plugin interface to thttpd in order to provide RESTful (polling-based) XML data from our system without resorting to using CGI or third-party software containing more than a few thousand lines of C-equivalent code. If this part of the prototype pans out it should free me up in selecting solutions for the actual HMI side (it's always easier to say "use HTTP" instead of "link against this library, make sure you call the right init functions...."). As to whether it will pan out, that's a little up in the air. Even if it doesn't, it's gratifying to get SCADA and train control data into a form you can view easily in mozilla. A little XInclude and XSLT should be all that's required to put together a web page for simple management functions. Performance is reasonable, too. My initial CGI hack took .19 seconds for wget to retrieve a single datum. The plugin interface does it in approximately .01 seconds.

XML is making my life easier. We've already had an arms-length association with gnome's libxml2 and libxslt through the xsltproc program we use to generate some of our source code from XSLT templates. The relationship was close enough to require coverage in our last safety case, the success of which adds weight for its inclusion in our software proper. I've also been looking a little higher up the tool-chain and am definitely considering the use of SVG to define shapes on schematic displays.

XForms is also a technology that's piqued my interest. Byron briefly compared and contrasted XForms 1.0 and WebForms 2.0 in a recent blog entry from the perspective of a web content provider. He mentions the promised inclusion of XForms in Mozilla, but also of interest to me is the recent announcement of inclusion in openoffice 2.0. These are positive steps for the standard and seem to imply that it is not too complicated to be usefully implemented.

In my business, though, I think the best I can hope for is to develop a forms capability "inspired by" XForms. There's no way I could pull in the mozilla technology stack as part of my HMI. The connection between XML Schema and XForms puts it even further from my roadmap, since libxml2 only supports XML Schema in limited ways. It seems that outside the windows world and the rarified air of w3.com many prefer Relax NG.

XForms itself obviously wouldn't solve my HMI problem. The problem is not well-defined, but does include aspects of forms entry, list handling, schematic (and iconic) data representations, and fast (non-polling subscription) updates. That, combined with infinite configurability for project needs, starts to look a little daunting if you stare it straight in the eye. The development of the HMI will continue to be an interesting and poorly-defined problem for the foreseeable future, with pressures and expectations beyond our capabilities, but I'm sure I'll get there with something useful in the end.

Benjamin

Wed, 2005-Jan-26

Dead Code

I just read this entry from one of the planet freedesktop bloggers.

It reminds me of an email sent around my place of work with a title something like "Can anyone tell me what this function does?". It came from real source code in our system at the time and contained a function at least a few hundred lines long. The very first line of the function was an unconditional "return 0;". Just as the author had, most readers spent a few minutes delving through the complex control flow of the function before looking at the code in a linear fashion and seeing that first "the function does nothing" line.

Dead code could very well be the worst sin in software. If you want to keep past code around for future reference, use a revision control system. Don't clutter up the current codebase with historical oddities, especially when they appear at first to still be used.

Sun, 2005-Jan-09

TransactionSafe turns 0.4

After more than four months of very occasional half-hearted hacking TransactionSafe has reached version 0.4. This version is titled "Basic Register" or alternatively "I was too embarrassed to release yet another version which still didn't support basic transaction entry!".

This version includes a few interesting design points, although they may be obscured by a fairly rushed and stream-of-consciousness flow of python programming. As with previous versions I've based the user Load/Save paradigm around a commit-as-soon-as-you-can model. There is no save button because the file is always saved. Pity there is no undo. Well, that can come later ;)

Another feature is my attempt to use "processes" sort-of-modelled after the Erlang message passing system. Essentially, each object gets told when to execute by an evaluate scheduler. At that time its input is already there (having been passed on by another process) and ready to be processed into output. I think I've been able to achieve a reasonable amount of decoupling between components using this model, although it is still highly experimental (meaning non-obvious and hard to read).

Third on my list of novelties is the way the program interacts with the sqlite database. The database itself (with some transformations applied) is considered the model in my MVC, and I have walked a path of updating everything using the same simple, inefficient algorithm rather than custom-coding update features. You can see this in the difference between the account tree (which is based on the micro-update model) and the register (which is the newer code based on a single update mechanism). The round-trip to the database occurs often (every time a single field is updated), and I even went so far as to use the python difflib to determine which parts of the gtk tree model that makes up my register to update. That is to say, the gtk tree model likes to be updated minimally so that the user is still in the same place in the tree after the update as they were before it. I've taken the raw tree model from disk each time and passed it through a diff algorithm filter to determine what that minimal update is, without needing custom code to infer it.
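
Stripped of the gtk details, the difflib trick amounts to something like the sketch below, which works on a flat list of row keys rather than the real tree model. The insert_row and remove_row callbacks stand in for the tree model's own insert and remove operations.

    import difflib

    def minimal_update(old_rows, new_rows, insert_row, remove_row):
        """Apply only the inserts/deletes needed to turn old_rows into new_rows."""
        matcher = difflib.SequenceMatcher(None, old_rows, new_rows)
        offset = 0                              # how far indexes have shifted so far
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            pos = i1 + offset                   # where this block now sits
            if tag in ("delete", "replace"):
                for _ in range(i2 - i1):
                    remove_row(pos)             # rows below shuffle up each time
                offset -= i2 - i1
            if tag in ("insert", "replace"):
                for k, row in enumerate(new_rows[j1:j2]):
                    insert_row(pos + k, row)
                offset += j2 - j1
            # "equal" ranges are left untouched, which is the whole point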

This version makes use of triggers for database updates, particularly to maintain certain database invariants such as "all transactions balance". I've waited this long to do it because for a long time I wasn't sure exactly how I was going to enforce that constraint. Now I maintain a Balance account that must always exist as transactionEntryId 0, and is referred to in every transaction at least once (multiple currencies each get their own balance entry). This meshes with the current user interface/database interaction paradigm nicely.
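
As an illustration only (this is a toy schema, not the real TransactionSafe one), the kind of trigger I mean looks something like this:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
    CREATE TABLE entries (
        transactionId INTEGER,
        accountId     INTEGER,   -- account 0 is the special Balance account
        amount        INTEGER    -- in cents
    );

    CREATE TRIGGER rebalance AFTER INSERT ON entries
    WHEN NEW.accountId != 0
    BEGIN
        UPDATE entries
           SET amount = (SELECT COALESCE(-SUM(amount), 0) FROM entries
                          WHERE transactionId = NEW.transactionId
                            AND accountId != 0)
         WHERE transactionId = NEW.transactionId AND accountId = 0;
    END;
    """)

    db.execute("INSERT INTO entries VALUES (1, 0, 0)")      # the Balance slot
    db.execute("INSERT INTO entries VALUES (1, 5, 1250)")   # e.g. groceries
    db.execute("INSERT INTO entries VALUES (1, 9, 300)")    # e.g. fuel
    print(db.execute("SELECT amount FROM entries WHERE accountId = 0").fetchone())
    # prints (-1550,) -- the trigger has kept the transaction balanced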

Now that the triggers are in place, it should be possible for other applications to touch the database without too much risk of trashing the underlying conceptual model. This is core to my intentions for this work, and is a step up from the gnucash line of a closed file format under an open (but subject-to-change) C API deep inside the gnucash codebase.

This version is still based on a "core" financial data model. As yet it doesn't even include a default currency for accounts, let alone account types that might direct the register to call the Debit and Credit columns something friendlier. I'm hoping that I'll get around to designing and demonstrating the use of multiple rdf schemas in the same SQL database at some point, to allow expansion for various application needs (share price download could use another schema integrated into the same database, for example).

In the end this version is significant because it represents my second motivation point in the development of this software. My current list is:

  1. Actually contribute some open source software, instead of just bitching (complete, v0.1)
  2. Try not to be completely full of hot air by getting transaction entry working (complete, v0.4)
  3. Make this program a usable alternative to Quicken for my own finances (todo)
  4. Establish a small user-base (todo)
  5. Establish a developer community (todo)
  6. Profit! (todo. damn.)

Hopefully the energy required to overcome these peaks of resistance will not be too great :) At present I'm still just a little pissant who can't manage to play with any of the big boys and who has therefore unreasonably set out on his own. The best I could hope for at present is to fracture an already anemic community of developers and users of home and small business financial software. <sigh/>

Benjamin.