I'm sure that Adrian Sutton didn't mean to imply in
this blog entry that
less code duplication implies better design, but it was the trigger for me
to start writing this rant, which has been coming for some time.
Code duplication has obvious problems in software. Code that is duplicated
has to be written twice, and maintained twice. Cut-and-paste errors can lead
to bugs being duplicated several times but only being fixed once. The same
habit tends to make developers read and check less carefully, as they cut and
paste "known working" code that may not fit their own situation as well as
they thought. For all the maintenance issues that duplicated code can cause,
though, your first thought on seeing duplication or starting to add duplication
should not be "I'll put this in a common place". No! Stop! Evil! Bad!
Common code suffers from its own problems. When you share a code snippet
without encapsulating it in a unit of code that has a common meaning, you can
cause shears to occur in your code as that meaning, or parts of it, change.
Common code that changes can have unforeseen ramifications if its actual usage
is not well understood throughout your software. You end up changing the code
for one purpose and then needing exceptional conditions so that it still meets
your other purposes for it.
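A contrived sketch of how this tends to play out, in Python. The function, its
flags and its callers are all made up for illustration; the point is only the
shape that the shared code ends up with.

```python
def format_name(first, last, for_invoice=False, for_sort_key=False):
    """Hypothetical helper, shared only because three call sites once
    happened to contain the same line of string formatting."""
    if for_sort_key:
        # Exception added later: the index wants "last, first" in lower case.
        return f"{last}, {first}".lower()
    name = f"{first} {last}"
    if for_invoice:
        # Exception added later: invoices want the name in upper case.
        return name.upper()
    return name


# Three callers with three different meanings, all leaning on one function
# that no longer has a single purpose anyone could state.
greeting = format_name("Ada", "Lovelace")
invoice_line = format_name("Ada", "Lovelace", for_invoice=True)
sort_key = format_name("Ada", "Lovelace", for_sort_key=True)
```

Each flag is one of those exceptional conditions, and each new caller makes it
harder to say what the function is actually for.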
Pretty soon, your common-code-for-the-sake-of-common-code starts to take on
a life of its own. You can't explain exactly what it is meant to do, because
you never really encapsulated that. You can only try to find out what will
break if its behaviour changes. As you build and build on past versions of the
software, cohesion itself disappears.
To my mind, the bad design that comes about from prematurely reusing
common-looking code is worse than that of duplicated code. When maintaining
code with a lot of duplication you can start to pull together the common
threads and concepts that have been spelt out by previous generations. If that
code is all tied up in a single method, however, the code may be impossible to
save.
When trying to improve your design, don't assume that reducing duplication
is an improvement. Make sure your modules have a well-defined purpose and role
in module relationships first. Work on sharing the concept of the code, rather
than the behaviour of the code. If you do this right, code duplication will
evaporate by itself. Use the eradication of code duplication as a trigger to
reconsider your design, not as a design goal in itself.
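One possible reading of "sharing the concept", sketched in Python. The
PersonName class and the invoice_header function are hypothetical names
invented for this example, and this is only one way the idea might look.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    """Hypothetical shared concept: a person's name, with operations that
    each have a meaning you can state in one sentence."""
    first: str
    last: str

    def display(self) -> str:
        # How the name is shown to a human reader.
        return f"{self.first} {self.last}"

    def sort_key(self) -> str:
        # How the name is ordered in an index.
        return f"{self.last}, {self.first}".lower()


def invoice_header(name: PersonName) -> str:
    # Upper-casing is invoice policy, so it lives with the invoice code
    # rather than being pushed into the shared concept as a special case.
    return name.display().upper()


name = PersonName("Ada", "Lovelace")
header = invoice_header(name)
```

What is shared here is the idea of a name, not a lump of formatting code, so a
change to the invoice rules never touches the other callers.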
In the end, a small amount of code duplication probably means that you've
designed things right. Spend enough design effort to keep the duplication low,
and the design simplicity high. Total elimination of duplication is probably a
sign that you're overdesigning anyway. Code is written to be maintained.
It's a process and not a destination. Refactor, refactor, refactor.
Benjamin