Perhaps the most important media type in an enterprise-scale or world-scale semantic web or REST architecture is text/plain. The text/plain type is essentially schema free, and allows a representation to be retrieved or PUT with little to no jargon or domain-specific knowledge required by server or client. It is applicable to a wide range of problems and contexts, and is easily consumed by tools and humans alike.
Uses of text/plain
In essence, this type conveys a string. However, we can also think about embedding numbers or other simple data types. The modern dynamic language approach to looking at strings is to allow implicit conversion between the information inserted by the sender and the type expected by the consumer. These values can easily be incorporated into programming language data types, inserted into databases, spreadsheets, reports, or other structures.
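As a minimal sketch of that implicit conversion, a consumer might coerce a text/plain body as follows. The function name `coerce_plain_text` and the choice of int, then Decimal, then string are my own assumptions for illustration rather than any standard behaviour.

```python
from decimal import Decimal, InvalidOperation


def coerce_plain_text(body: str):
    """Interpret a text/plain body as an int, a Decimal, or a plain string.

    Hypothetical helper: the order of attempts is an assumption.
    """
    text = body.strip()
    try:
        return int(text)
    except ValueError:
        pass
    try:
        value = Decimal(text)
    except InvalidOperation:
        # Not numeric at all, so hand back the string itself
        return text
    # Decimal accepts NaN and Infinity; treat those as ordinary strings
    return value if value.is_finite() else text
```

A consumer that knows which type it expects would normally call the specific conversion directly and treat a failure as an error. The fallback chain is only useful for generic tools such as spreadsheets or report generators.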
To outline a few potential uses of text/plain, consider the following interactions:
- A client samples http://dod.example.com/defcon, retrieving "4". The defense readiness condition is at four. The client puts a value of "3", attempting to raise the condition... but this is either not implemented or requires authentication and authorization checks. Checks may also be in place to avoid movements of more than one, or multiple movements within a defined period of time.
- The CPU and memory usage of a server are sampled periodically, and the values returned as numeric text/plain.
- The balance of each account in an accounting system is available as a numeric value for incorporating into a global profit and loss report.
- A spreadsheet service models the performance of a business process. Individual cells are linked in real-time to resources that provide input to its calculations. The report itself exposes calculation results that can be utilised in other spreadsheets.
- The time remaining until an important event occurs can be sampled by a splash of JavaScript in order to provide a useful countdown timer. This could be in the form of a period remaining that is sampled on an interval, or an absolute time that relies on client and server clock synchronisation to provide an accurate timer.
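The first interaction above can be sketched as an in-memory resource, with no real HTTP involved. The class name `ReadinessResource`, the `authorised` flag, and the particular status codes are assumptions chosen for illustration. A real server would pick its own checks and responses.

```python
from typing import Tuple


class ReadinessResource:
    """In-memory stand-in for a text/plain resource such as /defcon."""

    def __init__(self, level: int = 4):
        self.level = level

    def get(self) -> Tuple[int, str]:
        # The representation is simply the number rendered as text
        return 200, str(self.level)

    def put(self, body: str, authorised: bool = False) -> Tuple[int, str]:
        if not authorised:
            return 403, "forbidden"
        try:
            requested = int(body.strip())
        except ValueError:
            return 400, "expected an integer"
        if not 1 <= requested <= 5:
            return 400, "level out of range"
        # Assumed policy: refuse movements of more than one level
        if abs(requested - self.level) > 1:
            return 409, "may only move one level at a time"
        self.level = requested
        return 200, str(self.level)
```

Neither side needs a schema here: the client sends "3" and the server decides whether that is acceptable. A limit on multiple movements within a defined period could be added in the same place as the one-level check.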
Standards and compatibility
While formatting of numbers and other types may seem natural enough, it is important that this be done consistently if the information is to remain legible when it is processed. To my mind, the best resource for formatting and processing simple text-compatible data types is the specification for XML Schema. Part 2 contains a section on built-in datatypes that covers a range of string, numeric, URI, date and time, and other simple types. Any data that can be formatted according to the rules in this section absolutely should be.
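To make this concrete, here is a rough sketch of a sender rendering a few Python values in XSD lexical form. The function `to_xsd_text` is hypothetical and covers only a handful of the built-in types. Requiring a timezone on date-times and dropping fractional seconds are my own simplifications, not rules from the specification.

```python
from datetime import datetime, timezone
from decimal import Decimal


def to_xsd_text(value) -> str:
    """Render a value using XSD-style lexical forms (partial sketch)."""
    # bool must be tested before int because bool is a subclass of int
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, Decimal):
        if not value.is_finite():
            raise ValueError("xsd:decimal has no NaN or infinity")
        # xsd:decimal does not allow exponent notation
        return format(value, "f")
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumed policy: never emit an ambiguous local time
            raise ValueError("refusing to emit a dateTime without a timezone")
        return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    if isinstance(value, str):
        return value
    raise TypeError("no XSD mapping for " + type(value).__name__)
```

The point is simply that "true" rather than "True", and "1000" rather than "1E+3", is what a consumer following the same conventions will expect to see.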
However, this leads to a dilemma. What do we do with types that are not found in this set? Should a geo-location become a structured XML document, or should it too be coded as text/plain? RFC 2426 defines a semicolon-separated standard format for geo-location, which could certainly be coded as text/plain. However, it is not clear at this stage that this is or will be the canonical way of encoding this information as a text/plain document. Without reference to applicable and universal standards we bear a significant risk that the partially-formatted content we transfer will in fact not be understood.
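For illustration, a consumer that has agreed out of band to use the RFC 2426 form (latitude, then longitude, separated by a semicolon) might parse it like this. The function name `parse_geo` and the range checks are my own additions.

```python
from typing import Tuple


def parse_geo(body: str) -> Tuple[float, float]:
    """Parse 'latitude;longitude' as used by the vCard GEO type."""
    parts = body.strip().split(";")
    if len(parts) != 2:
        raise ValueError("expected exactly two semicolon-separated values")
    latitude, longitude = (float(part) for part in parts)
    # Range checks are an added sanity test; they also reject NaN
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        raise ValueError("coordinates out of range")
    return latitude, longitude
```

Note that nothing in the text/plain label tells the consumer to apply this parser. That knowledge has to come from somewhere else, which is exactly the risk described above.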
Applicability of text/plain MIME type
Part of the problem that emerges is that text/plain is not specific enough. It doesn't have sub-types that are clearly tied to a specification document or standards body. This makes interoperability a potential nightmare of heuristic detection.
Unfortunately, while XSD provides an excellent catalogue of basic types, it is neither comprehensive nor sufficiently connected to MIME usage. Another problem with using text/plain in its bare form is its default assumption of a US-ASCII character set. This can lead to obvious problems in a modern internationalised world.
Without being backed by some kind of standards body, the advice I give in this regard is merely that. Standards may emerge later that contradict what I have to say here. That said, my advice is this:
- Treat text/plain content as being formatted according to XSD conventions when you receive it. Take care to process character encoding directives correctly and support at least a UTF-8 encoding.
- Consider using a text/xsd+plain document type when transmitting XSD-formatted simple content. This will hopefully indicate that the document can be understood as text/plain, but provide additional context if more complex processing is applied to the document.
- Make use of other specialised types that indicate the standard being applied when types outside of the XSD set are employed. For example, the geo coordinates above might be described as text/vcard+plain.
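The character encoding advice in the first point can be sketched with Python's standard library. The function name `decode_text_body` is hypothetical. Accepting any text/* type, so that the suggested text/xsd+plain would also pass, is an assumption on my part.

```python
from email.message import Message


def decode_text_body(content_type: str, payload: bytes) -> str:
    """Decode a text/* payload using the charset named in Content-Type."""
    message = Message()
    message["Content-Type"] = content_type
    if message.get_content_maintype() != "text":
        raise ValueError("not a text media type: " + content_type)
    # Fall back to US-ASCII when no charset parameter is supplied
    charset = message.get_content_charset("us-ascii")
    return payload.decode(charset)
```

With no charset parameter, any byte outside the ASCII range makes the decode fail rather than guess. That is deliberately strict, and a more forgiving consumer might try UTF-8 first.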
Again, ideally we would be making use of a well-defined standards body to own and maintain the media types used to communicate very basic information. Making up your own can only take the state of the art so far. However, standards sometimes emerge out of common best practice... so it is not a complete waste of time to be heading down this particular path.
When not to use text/plain
It should be clear that text/plain is not a tool for every occasion. It is often important to sample or send an atomic set of data that would require additional schema. Plain text, when overused, can lead to performance problems as individual values are sampled one by one instead of as a consistent and coherent document.
Perhaps the clearest indication that you are overusing text/plain is that you are experiencing an explosion in hyperlinks. When you start to need a document to provide links for consumers to find these text/plain-centric resources, you should probably consider incorporating the information directly into these documents themselves.
Used appropriately to transfer information to and from well-known and stable resources, text/plain or its variants can be an efficient way to communicate simple data without introducing unnecessary jargon. The URI of the resource and the implementation of client and server will provide sufficient context to format and process these simple data types.
The low barrier to entry to these types makes them universally applicable and easy to work with. However, the lack of standardisation around matching encodings to media types is an inhibitor to their potential uptake. Used well, especially in combination with link headers and/or text/uri-list, these types can provide an effective way to make your protocols get out of the way of communication and let clients and servers interoperate with minimal complexity for simple use cases.
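As a small sketch of the text/uri-list side of that combination, a client might extract the URIs like this before sampling each one as text/plain. The function name `parse_uri_list` is hypothetical. The format itself is one URI per line, with lines beginning with "#" treated as comments.

```python
from typing import List


def parse_uri_list(body: str) -> List[str]:
    """Return the URIs from a text/uri-list body, skipping comment lines."""
    uris = []
    for line in body.splitlines():
        # Lines beginning with '#' are comments; blank lines carry no URI
        if line.startswith("#") or not line.strip():
            continue
        uris.append(line.strip())
    return uris
```

Each URI returned could then be fetched and interpreted as a simple value, so the list document only ever needs to carry links.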
Benjamin

