In my
last column
I wrote about deliberately architecting your code in such a way that the
compiler makes your code more likely to be correct. I used an example lifted
from Joel Spolksy regarding "safe" and "unsafe" strings. In indicated they
should be made different types so the that the compiler is aware of and can
verify that the rules associated with assining one to the other or using one
in place of the other are followed. I thought I'd bring up a few more examples
of coding conventions I follow to use the compiler to best effect.
Don't initialise variables unless you have a real value for them
Junior programmers are often tempted to initialise every variable that they
declare, and do so immediately. They do so to avoid messy compiler warnings,
or because they think they are making the code cleaner and safer by providing
a value where there would otherwise be no value.
That's a bad idea. Whether it be initialising an integer to zero or a
pointer to null whenever you give a value to a variable you are lying to your
tools. You're giving a value to the variable that no use to your program, but
at the same time you are quelling the concerns of both your compiler and any
memory analyser you might be using (such as
purify).
Don't declare variables until you have a real value for them
This rule answers the question about how you deal with the warnings that
would come from not assigning values to your variables. The simple pseudo-code
you should always follow when declaring a variable is either one of:
MyType myVariable(myValue);
or
MyType myVariable;
if (condition)
{
myVariable = myValue1;
}
else
{
myVariable = myValue2;
}
Any other form is pretty-well flawed. In the first case, the variable is always
assigned to a useful value, and it happens immediately. In the second case,
every branch must make a concious decision about what to set your variable to.
Sometimes variables come in pairs, or other tuples that need to be initialised
together. Sometimes your conditions and branching are much more complicated
than in this case. By not initalising the varaible up front and initialising
in every branch you allow the compiler to find any cases you might have missed.
If you set an initial value for the variable before a conditional that may
assign values to it you stop the compiler complaining about branches where the
variable is not assigned to, and you allow the introduction of bugs through
unconsidered branches.
Always declare your else branch
Every "if" has an implicit "else". You need to consider what happens in that
case. The variable assignment structure I've laid out forces you to declare
your else branch(es) with the content you might otherwise have placed in the
variable's declaration. This is a good thing. Considering all branches is a
critial part of writing good code, and it is better to be in the habit of doing
it than in the habit of not. Even if you don't have content for one side of
the branch or the other, place a comment to that effect in the branch. Explain
why. This point won't help the compiler find bugs in your code, but it is good
to keep in mind.
Only assign to "external" variables once per loop iteration
This is a bit hard to write down as a hard and fast rule, but it is another
one of those variable initialisation issues. Sometimes you'll have a loop
which changes a variable declared outside of the loop. You'll be tempted to
add one or subtract one at various points in the loop, but it's the wrong
approach if you want help from the compiler. Instead, declare a variable which
embodies the effect on the external variable. Use the same technique as I've
outlined above to allow the compiler to check you've considered all cases.
At the end of the conditionals that make up the body of your loop, apply the
effect by assignment, addition, or whatever operation you require.
string::size_type pos;
do
{
string::size_type comma(myString.find(',',pos));
string::size_type newpos;
if (comma == string::npos)
{
// None found
newpos = string::npos;
}
else
{
myString[comma] = ';';
newpos = comma+1;
}
pos = newpos;
}
while (pos != string::npos);
On the subject of loops...
Use a single exit point for loops and functions
When you have a single exit point, your code is more symmetrical and that tends
to support the treatement of variables and cases that I'm suggesting. What
you're looking for is...
- To make sure you clearly enumerate and consider every case your function
has to deal with, and
- To give the compiler every opportunity to spot effects you may not have
specified for each case
It is these principles at the core of my general mistrust of exceptions as a
general unusual case-handling mechanism. I prefer explicit handling of those
cases. I won't go over my position on exceptions again just yet... suffice as
to say they're certianly evil under C++ and I'm still forming an opinion as
to whether the compiler-support exceptions recieve under Java is sufficient
to make them non-evil in that context :). Give me a few more years to make up
my mind.
Under C++, I consider the following loop constructs to be useful:
Pre-tested loops
while (condition) { /* body */ }
for (
iterator it=begin();
it!=end();
++it
) { /* body */ }
for (
iterator it=begin();
it!=end() && other-condition;
++it
) { /* body */ }
Mid-tested loops
for (;;) { /* body */; if (condition) break; /* body*/ }
Post-testd loops
do { /* body */ } while (condition);
for (;;) { /* body */; if (loop-scoped-condition) break; }
Don't use default branches in your switch statements
This goes back to general principle of making all your cases explicit. When
you change the list of conditions (ie. the enumeration behind a switch
statement) the compiler will be able to warn you of cases you haven't covered.
This isn't the case when you have a default. In those cases you'll have to find
a way to search for each switch statement based on the enumeration and assess
whether or not to create a new branch for it.
I'm happy for multiple enumeration value to fold down to a single piece of
code, for example:
switch (myVariable)
{
case A:
DoSomething1();
break;
case B:
DoSomething2();
break;
case C:
case D:
case E:
case F:
// Do nothing!
break;
}
When you add "G" to the condition the compiler will prompt you with a warning
that you need to consider that case for this switch statement.
You have a typing and classing system, so use it!
This is the point of my previous entry. Just because two things have the
same bit representation doesn't mean they're compatible. Declare different
types for the two and state exactly what operations you can do to each. Don't
get caught assigning a 64-bit integer into a 32-bit slot without range checking
and explicit code ok-ing the conversion. Wrap each up in a class that stops
you doing stupid things with them.
Use class inheritence sparingly
By class inheritence I mean inheritence of class B from class A, where A
has some code. In Java this is called extension. It is usually safe to say
that class B inherits interface A, but not class A. When you inherit from a
class instead of an interface you are aliasing two concepts:
- What you can do with it
- How it works
Sharing what you can do with an object between multiple classes is the
cornerstone of Object-Oriented design. Sharing how classes work is something
I don't think we have a syntactically simple solution to, even after all these
years. Class inheritence is one idea that requires little typing, but it leads
to finding out late in the piece that what you want to do with two classes is
the same, but their implementation needs to be more different than you expected.
At that point it takes a lot more typing to fix the problem.
So declare interfaces and inherit from interfaces where possible instead of
from classes. Contain objects, and delegate your functions to them explicitly
ahead of inheriting from ther classes. Sometimes class inheritence will be the
right thing, but in my experience that's more like five cases out of every
hundred.
I've strayed off the compiler-bug-detection path a little, so I'll finish
up with two points.
Use the maximum useful warning settings
They're there for a reason
Treat warnings as errors
There's nothing worse than seeing 100 warnings when you compile a single
piece of code, and have to try and judge whether one of them is due to your
change or not. Warnings become useless if they are not addressed immediately,
and they are your compiler's best tool for telling you you've missed something.
Don't think you're smarter than the machine. Don't write clever code. Dumb your
code down to a point the compiler really does understand what you're writing.
It will help you. It'll will help other coders. Think about it.
Benjamin