I've just come across this May article by Michael Ellerman. Michael responds to some inflammatory language I used in my own response to a Joel Spolsky article, in which I obtusely referred to "real work" and my doubt that python is suited to it. Both my article and Michael's are centred around the concepts of type and of Design By Contract. I have since written about type in the context of the REST architectural style.
Compile time or runtime?
My feelings are still mixed about the use of explicit type hierarchies. On one hand, I have years of experience in various languages showing me how type can be used by the compiler to detect bugs even when no test code has been written, or when the test code doesn't achieve 100% coverage of the input space. That capability to detect errors is important because complete coverage is theoretically impossible for most functions. The gap between what is tested and what is possible must be covered to give any confidence that the code itself is correct. Stop-gaps have traditionally come either in the form of manual code review or automatic type-supported compiler checks. On the other hand, it is clear that even a carefully-constructed type hierarchy isn't capable of detecting all contract breaches at compile time. Function preconditions must still be asserted, or functions must be redefined to operate even in cases of peculiar input. The assertions are runtime checks in the languages I'm familiar with, and it may not be possible to convert all of them to compile-time checks.
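A rough python sketch of the distinction, using names of my own rather than anything from either article: the first assertion corresponds to the kind of check a C++ or Java compiler can do before the program runs, while the second is the kind of precondition that stays a runtime check in every language I know.

    def safe_divide(numerator, denominator):
        # Both checks below run at runtime in python. A C++ or Java compiler
        # could verify the first kind (the types) before the program runs,
        # but not the second (the value-level precondition).
        assert isinstance(numerator, float) and isinstance(denominator, float)
        assert denominator != 0.0, "precondition violated: denominator is zero"
        return numerator / denominator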
Valid and invalid function domains
It should always be possible to define the valid input space of a function. Proving that the actual input is a member of the set of valid input is typically harder. Type hierarchies are a tool to tell the compiler that the appropriate reasoning has been done by the time it reaches the function. Unfortunately, the complexity of a type system that could effectively define the domain of every function in your program is so great that implementing things this way may cripple your programming style. You would definitely be taking a step away from traditional procedural approaches and moving firmly towards a mathematical or functional view of your application. These world views have traditionally been more difficult for developers to grasp and work effectively in, except perhaps for the especially clever among us.
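A small python sketch of what I mean by encoding a function's domain in a type (NonEmptyList and head are hypothetical names of mine): validation happens once, at construction, and any function that accepts only this type can assume the "at least one element" part of its domain has already been established.

    class NonEmptyList:
        def __init__(self, items):
            items = list(items)
            if not items:
                raise ValueError("NonEmptyList requires at least one element")
            self._items = items

        def first(self):
            return self._items[0]

    def head(xs):
        # No precondition check needed: by accepting only NonEmptyList, the
        # reasoning about the domain was done before this function was reached.
        return xs.first()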
The other thorn in the side of type systems is actual input to the program. It typically comes from a user, or perhaps a client application. The process of converting the data from raw bytes or keystrokes into useful input to a complex typing system can be quite tricky. Your program has to examine the data to determine which type best reflects its structure, and this must take into account both what it is and how it will eventually be used. It is trivially provable that any function that accepts unconstrained input and tries to construct objects whose types match the preconditions of another function must be runtime-checked, insofar as the unconstrained input is a superset of the legal constrained input. Because it is important to know how data will be used before using it, these runtime-checked functions are likely to appear throughout your actual program. The more I/O-connected your program is, the more you will have to deal with runtime checking.
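A python sketch of that boundary, again with names of my own: the raw text a user types is a superset of the legal values, so turning it into something a typed function can accept is unavoidably a runtime check.

    class Percentage:
        def __init__(self, value):
            # The set of ints is a superset of the legal 0..100 range, so the
            # constructor is necessarily a runtime check.
            if not isinstance(value, int) or not 0 <= value <= 100:
                raise ValueError(f"{value!r} is not a percentage")
            self.value = value

    def parse_percentage(raw):
        # Raw keystrokes are a superset again; both steps can fail, and
        # neither failure is a bug in this program.
        return Percentage(int(raw))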
Bucking the type hierarchy
I believe that REST and a number of other approaches that reject planned type hierarchies are founded in this area, where handling of data from outside the program is difficult or impossible to isolate to a small part of the program's code. When this is the case the use of formal typing may not add any value. Just apply the same paradigm everywhere: deal with a very basic, fundamental type and allow the data to be rejected if required. Handle the rejection in a systematic way. To me, the use of exceptions is a hint that this approach was brewing in headline Object-Oriented languages long before the pythons and RESTs of this world found mass appeal. Exceptions are thrown when input to a function is not within the function's domain, but when that condition is not necessarily an indication of error within the program. If the error occurred because of faulty input outside the program's control, it makes sense to throw and catch exceptions so that the original input can be rejected at its source. The alternative would be to deal with the error wherever it occurred, or to specifically structure your program to include error-handling modules or classes. Within a classical strongly-typed O-O system with no concern for external input, exceptions are evil, but when used to handle faulty external input consistently they can be a force for good.
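A toy python sketch of the pattern I have in mind (the names are mine): the deep code raises when input falls outside its domain, and the code that read the input catches the exception and rejects the offending data at its source, in one systematic place.

    def parse_age(raw):
        # Raises rather than returning a sentinel: bad input is not a bug here.
        age = int(raw)                      # ValueError if not a number
        if not 0 <= age <= 150:
            raise ValueError(f"age {age} is out of range")
        return age

    def read_ages(lines):
        # Rejection is handled where the input entered the program, so it can
        # be reported back to its source while the program carries on.
        ages = []
        for line in lines:
            try:
                ages.append(parse_age(line))
            except ValueError as err:
                print(f"rejected {line!r}: {err}")
        return ages

    print(read_ages(["42", "potato", "200"]))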
The SOAP Intervention
Like CORBA, SOAP attempts to extend the Object-Oriented programming model across programs. This allows servers to treat input data not as streams of bytes but as fully formed structures and object references with their place in the type hierarchy already assigned. Various attempts have come and gone to actually serialise and deserialise objects between processes. These approaches should make it possible to eliminate significant network-facing handling of very basic types. Now, if only we could make the user start entering serialised objects as well... ;)
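As a toy illustration in python, using the standard pickle module (which is neither SOAP nor CORBA, but gives the flavour of the promise): the receiving side gets a fully formed object back rather than a stream of bytes it has to interpret for itself.

    import pickle

    class Order:
        def __init__(self, item, quantity):
            self.item = item
            self.quantity = quantity

    # "Send" the object: it travels as bytes...
    wire_bytes = pickle.dumps(Order("widget", 3))

    # ...and arrives with its type and structure already assigned.
    received = pickle.loads(wire_bytes)
    print(received.item, received.quantity)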
These approaches are interesting. They fail to some extent because of the assumptions they tend to make about network latency being small. They fail somewhat because they don't adequately support concepts of caching and of redirecting requests. On the other hand, I think there is some evidence these days that they fail simply because they have a formal, system-wide type hierarchy. If a programming language is like any other program, it benefits from being able to do what the user wants without the user having to understand too many concepts. These systems are built around the idea that the large, organised way we have written software in the past is the correct one, and that all we need to do is expose this programming model on a larger scale. I think it was the success of Visual Basic that first saw this world view really shaken. It turns out that when you allow the average bloke off the street to write software, the software he writes is pretty simple. It may have less stringent performance and safety requirements than what we professionals believe we have. It may be smaller and less sophisticated than what we tend to come up with. On the other hand, it works. It does a job that that guy certainly wouldn't have hired us to do for him. It scratches an itch and empowers him. I think there's some evidence that more software will be written in the future at that end of the scale than at the end we're used to working from. In fact, I suspect that one day we'll all be writing software using the same mental models that Joe Bloggs works with, and we'll find that we can still do the good job we think we're doing now using those approaches.
Conclusion
Michael proposes the use of assert isinstance(s, SafeString) to do type checking in python. I'm not sure this is really "good python", which normally focuses on what an object can do rather than what type you've been able to tag it with, but that's an aside. This isn't as useful as the type checking in C++ or Java because it is performed at runtime and only triggers when the input is not what was expected. He points out that C++ does no better on more complex contract provisions such as "i < 3", which must be made a precondition. My original point with respect to this issue is really about how far testing can take you on the path to program correctness. Using Hungarian notation as Joel Spolsky originally suggested may help a code reviewer determine program correctness, but in the python world there is no automatic genie to help you pick up those last few percent of missed errors. In a large and complex program with incomplete test coverage, my experience says that suitable compile-time type checking can find errors that code reviewers miss. On the other hand, if you're dealing with a lot of external input (and let's face it, we all should be these days) type may just not help you out at all. My feeling is that formal type hierarchies will eventually go the way of the cathedral as we just stop writing so much internally-facing code, and that we'll be much more interested in pattern matching on the input we do receive. Given python's use profile it is probably appropriate that input outside of a function's domain throws an exception rather than failing a runtime assertion, but in the domains I'm used to working in python doesn't seem fundamentally different from or more advanced than the languages we've seen in the past. It is not so far advanced as to warrant or make up for such a scaling back of tool support for compile-time checking. On the other hand, I keep talking up REST concepts and the death of type to my own colleagues. I guess you'd say that right now I'm still in the middle, and want to see more return for giving up my comfortable chair.
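For what it's worth, here is a sketch of the "good python" distinction I drew above. SafeString is Michael's name; the escaped() method and the two render functions are hypothetical ones of mine.

    class SafeString(str):
        """Marker subclass: the contents are already escaped."""
        def escaped(self):
            return str(self)

    def render_tagged(s):
        # Type-tag style: checks what the object has been declared to be.
        assert isinstance(s, SafeString)
        return "<p>" + s + "</p>"

    def render_duck(s):
        # "What can it do" style: cares only that the object can produce an
        # escaped form of itself, whatever its declared type.
        return "<p>" + s.escaped() + "</p>"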
Benjamin