Martin Pool has begun a venture into constructing a new version control
system called bazaar-NG.
At first glance I can't distinguish it from CVS, or the wide variety of CM
tools that have come about recently. It has roughly the same set of commands,
and similar concepts of how working files are updated and maintained.
This is not a criticism; in fact, at first glance it looks like we could be
seeing a nice refinement of the general concepts. Martin himself notes:
"I don't know if there will end up being any truly novel ideas, but perhaps the combination and presentation will appeal."
To the end of hopefully contributing something useful to the mix, I thought
I would describe the CM system I use at work. When we initially started using
the product it was called Continuus Change Management (CCM), and has since been
bought by Telelogic and rebadged as CM Synergy.
Since our earliest use of the product it has been shipped with a capability
called Distributed Change Management (DCM), which has since been rebadged
Distributed CM Synergy.
Before I start, I should note that I have seen no CM Synergy source code
and have only user-level knowledge. On the other hand, my user-level knowledge
is pretty in-depth given that I was build manager for over a year before
working in actual software development for my company's product (it's
considered penance in my business ;). At that time Telelogic's
predecessor-in-interest, Continuus, had not yet entered Australia and we were
being supported by another firm. This firm was not very familiar with the
product, and for many years the CCM expertise in my company exceeded that of
the support firm in many areas. Some of my detailed knowledge may be out of
date, since I've been back in the software domain for a number of years.
CCM is built on an Informix database which contains objects, object
attributes, and relationships. Above this level is the archive, which uses
gzip to store object versions of binary types and a modified GNU RCS to store
object versions of text types. Above this level is the cache, which contains
extracted versions of all working-state objects and static (archived) objects
in use within work areas. Working-state objects only exist within the cache.
The final level is the work area. Each user will have at least one, and
that is where software is built. Under Unix, the controlled files within the
work area are usually symlinks to cache versions. Under Windows, the controlled
files must be copies. Object versions that are no longer in use can be removed
from the cache by using an explicit cache clean command. A work area can be
completely deleted at any time and recreated from cache and database
information with the sync command. Arbitrary objects (including tasks and
projects, which we'll get to shortly) can be transferred between CCM databases
using the DCM object version transfer mechanism.
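As a rough illustration of the cache and work area relationship, here is a small Python sketch of rebuilding a work area from cached object versions. It is not CCM's sync command: the function name, the cache file naming, and the `selection` dictionary (standing in for the database information) are all assumptions made up for this example.

```python
import os
import shutil
import tempfile

def sync_work_area(work_area, selection, cache_dir, use_symlinks=True):
    """Throw away a work area and rebuild it from the cache.

    selection maps a path inside the work area to the name of the cached
    object version that should appear there (both are made-up conventions).
    """
    shutil.rmtree(work_area, ignore_errors=True)
    for rel_path, cached_name in selection.items():
        target = os.path.join(work_area, rel_path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        source = os.path.join(cache_dir, cached_name)
        if use_symlinks:
            os.symlink(source, target)    # Unix style: link to the cache copy
        else:
            shutil.copy2(source, target)  # Windows style: plain copy

# Usage: one cached object version, one work area built from it.
base = tempfile.mkdtemp()
cache_dir = os.path.join(base, "cache")
work_area = os.path.join(base, "wa")
os.makedirs(cache_dir)
with open(os.path.join(cache_dir, "main.c-2"), "w") as f:
    f.write("int main(void) { return 0; }\n")

sync_work_area(work_area, {"src/main.c": "main.c-2"}, cache_dir)
```

Because the work area holds nothing but links (or copies) of cached versions, deleting it loses nothing that cannot be recreated.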
CCM is a task-based CM environment. That means that it distinguishes between
the concept of a work area, and what is currently being worked on. The work
area content is decided by the reconfigure activity, which uses the reconfigure
properties on a project as its source data: a baseline project and a set of
tasks to apply (including working-state and "checked-in" (static) tasks). This
set is usually determined by a set of task folders, which can be configured to
match the content of arbitrary object queries.
Once the baseline project and set of tasks is determined by updating any
folder content, the tasks themselves and the baseline project are examined.
Each one is equivalent to a list of specific object versions. Starting at the
root directory of the project, the most-recently-created version of that
directory object within the task and baseline sets is selected. The directory
itself specifies not object versions, but file-ids. The slots that these ids
identify are filled out in the same way, by finding the most-recently-created
version of the object within the task and baseline sets.
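The selection rule above can be sketched in a few lines of Python. This is only my reading of the behaviour from the user side, not CCM's implementation: the `ObjectVersion` class, its field names, and the integer creation times are hypothetical, invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectVersion:
    file_id: str          # identifies the slot, not a particular version
    version: str
    created: int          # creation time; larger means more recent
    children: tuple = ()  # for directories: file-ids of the entries

def select(file_id, candidates):
    """Pick the most-recently-created candidate version for one slot."""
    matching = [ov for ov in candidates if ov.file_id == file_id]
    return max(matching, key=lambda ov: ov.created) if matching else None

def reconfigure(root_id, baseline, tasks):
    """Walk down from the root directory, filling each slot in turn."""
    candidates = set(baseline).union(*tasks)
    selected = {}
    pending = [root_id]
    while pending:
        file_id = pending.pop()
        if file_id in selected:
            continue
        chosen = select(file_id, candidates)
        if chosen is None:
            continue  # nothing in the baseline or task set fills this slot
        selected[file_id] = chosen
        pending.extend(chosen.children)
    return selected

# A baseline plus two tasks: task A adds util.c, task B changes main.c.
baseline = [
    ObjectVersion("root", "1", 10, children=("main.c",)),
    ObjectVersion("main.c", "1", 11),
]
task_a = [
    ObjectVersion("root", "2", 20, children=("main.c", "util.c")),
    ObjectVersion("util.c", "1", 21),
]
task_b = [ObjectVersion("main.c", "2", 30)]

result = reconfigure("root", baseline, [task_a, task_b])
```

Here the newer root directory from task A is chosen first, which is what brings the `util.c` slot into play; `main.c` then resolves to task B's version because it was created most recently.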
So, this allows you to be working on multiple tasks within the same work
area. It allows you to pick up tasks that have been completed by other
developers but not yet integrated into any baseline and include them in your
work area for further changes. The final and perhaps most important thing
it allows you to do is perform a conflicts check.
The conflicts check is a more rigorous version of the reconfigure process.
Instead of just selecting the most-recently-created object version for a
particular slot, it actively searches the object history graph. This graph
is maintained as "successor" relationships in the Informix database. If the
graph analysis shows that any of the object versions offered by the baseline
or task set is not a predecessor of the selected version then a conflict is
declared. The user typically resolves this conflict by performing a merge
between the two selected but branched versions using a three-way diff tool.
Conflicts are also declared if part of a task is included "accidentally" in a
reconfigure. This can occur if you have a task A and task B where B builds on
A. When B is included but A is not, some of A's objects will be pulled into
the reconfigure by virtue of being
predecessors of B's object versions. This is detected and the resolution is
typically to either pull A in as well, or to remove B from the reconfigure
properties.
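To show the shape of the first kind of conflict, here is a minimal Python sketch of the history check. It assumes the successor relationships have been turned into a plain dictionary from each version to its immediate predecessors; that dictionary, the version naming, and the function names are all hypothetical. The "accidental" partial-task case is not modelled.

```python
def ancestors(version, predecessors):
    """All versions reachable by following predecessor links from `version`."""
    seen = set()
    stack = list(predecessors.get(version, ()))
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(predecessors.get(v, ()))
    return seen

def find_conflicts(selected, candidates, predecessors):
    """Report candidate versions that are not predecessors of the selected one.

    selected:     slot -> chosen version
    candidates:   slot -> all versions offered by the baseline and task set
    predecessors: version -> set of immediate predecessor versions
    """
    conflicts = []
    for slot, chosen in selected.items():
        history = ancestors(chosen, predecessors)
        for other in candidates.get(slot, ()):
            if other != chosen and other not in history:
                conflicts.append((slot, chosen, other))
    return conflicts

# main.c-1 was checked out twice, giving two parallel versions.
predecessors = {
    "main.c-2": {"main.c-1"},
    "main.c-1.1.1": {"main.c-1"},
}
selected = {"main.c": "main.c-1.1.1"}
candidates = {"main.c": ["main.c-1", "main.c-2", "main.c-1.1.1"]}

conflicts = find_conflicts(selected, candidates, predecessors)
```

The baseline version `main.c-1` is in the history of the selected version, so it is fine; `main.c-2` is not, so it is reported and would need a merge.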
The conflicts check is probably the most important feature of CCM from a
user perspective. Not only can you see that someone else has clobbered the file
you're working on, but you can see how it was clobbered and how you should fix
it. On the other side, though, is the build manager perspective. Task-based
CM makes the build manager role somewhat more flexible, if not actually easier.
The standard CCM model assumes you will have user work areas, an integration
work area, and a software quality assurance work area. User work areas feed
into integration on a continuous or daily basis, and every so often a cut of
the integration work area is taken as a release candidate to be formally
assessed in the slower-moving software quality assurance work area. Each
faster-moving work area can use one of the slower-moving baselines as its baseline
project (work area, baseline, and project are roughly interchangeable terms in
CCM). Personally, I only used an SQA build within the last few months or weeks
of a release. The means of delivering software to be tested by QA is usually
a build, and you often don't need an explicit baseline to track what you gave
them in earlier project phases.
One way we're using the CCM task and projects system at my place of
employment is to delay integration of unreviewed changes. Review is probably
the most useful method for validating design and code changes as they occur,
whether it be document review or code review. Anything that hasn't been
reviewed isn't worth its salt, yet. It certainly shouldn't be built on top of
by other team members. So what we do is add an approved_by attribute to each
task. While approved_by is None, it can be explicitly picked up by developers
if they really need to build upon it before the review cycle is done... but
it doesn't get into the integration build (it's excluded from the folder query).
When review is done, the authority
who accepts the change puts their name in the approved_by field, and either
that person or the original developer does a final conflicts check and merge
before the nightly build occurs. That means that work is not included until
it is accepted, and not accepted until it passes the conflicts check (as well
as other checks such as developer testing rigour). In the meantime other
developers can work on it if they are prepared to have their own work depend
on the acceptance of the earlier work. In fact, users can see and compare the
content of all objects, even working state objects that have not yet been
checked in. That's part of the beauty of the cache concept, and the idea of
checking out objects (and having a new version number assigned to the new
version) before working on them.
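The effect of the approved_by attribute on folder content can be sketched like this. It is a Python stand-in rather than a real CCM folder query (I'm not reproducing the query syntax here), and the `Task` class and the two helper functions are hypothetical names for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Task:
    number: int
    synopsis: str
    approved_by: Optional[str] = None  # None until a reviewer accepts it

def integration_folder(tasks):
    """Only reviewed and accepted tasks reach the integration build."""
    return [t for t in tasks if t.approved_by is not None]

def developer_tasks(tasks, picked_up):
    """Approved tasks, plus unreviewed ones a developer explicitly picks up."""
    return [t for t in tasks
            if t.approved_by is not None or t.number in picked_up]

tasks = [
    Task(101, "fix parser crash", approved_by="reviewer"),
    Task(102, "new report format"),        # still under review
    Task(103, "tidy build scripts"),       # still under review
]

nightly = [t.number for t in integration_folder(tasks)]
mine = [t.number for t in developer_tasks(tasks, picked_up={102})]
```

The nightly build sees only task 101, while a developer who chooses to build on task 102 gets both 101 and 102 and accepts the dependency on its eventual approval.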
I should note a few final things before closing out this blog entry. Firstly,
I do have to use a customised GNU make to ensure that changes to a work area
symlink (i.e., selection of a different file version) always cause a rebuild.
It's only a one-line change, though. Also, CCM is both a command-line utility
and a graphical one. The graphical version makes merging and understanding of
object histories much easier. There is also a set of Java GUIs which I've never
gotten around to trying. Telelogic's Change Synergy (a change request tracking
system similar in scope to Bugzilla) is designed to work with CCM, and should
reach a reasonable level of capability in the next few months or years but is
currently a bit snafued. Also, I haven't gone into any detail about the CCM
object typing system or other aspects that there are probably better solutions
to these days anyway. I also haven't covered project hierarchies, or controlled
products which have a few interesting twists of their own.
Benjamin