Tuesday, October 09, 2007

Best Practices

A colleague had this collection of PDF documents on various aspects of building software. One, entitled "High-level Best Practices in Software Configuration Management", talks about how to get the most out of your source control system. It was written at Perforce Software, Inc., and though it tries to be general in nature, it reflects their product, which i've never used. It was written in 1998, and was already obsolete at that time.

The paper talks about branching and code freezes and codelines and workspaces and builds and the processes that an organization must have to cope with a serious problem. Merges.

So, in the old days, one used a source control system, like SCCS or RCS, and of course others. In these systems, the source code cycle steps through these points:

  • check out the code with an exclusive edit lock
  • edit the code, and test it
  • check the code back in.


This works OK, as long as you have only one developer, and one workstation for that developer. But the moment this isn't true, then at some point in development, one developer will have a file locked that another developer wants to edit. So, either the second developer waits for the lock to free - perhaps by doing something else, or the second developer asks the first to add her changes too, and check them in. Or perhaps even more creative solutions are explored.
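The lock problem is easy to see in a toy model. The sketch below is not how SCCS or RCS are implemented; `LockingRepo` and its method names are made up for illustration, and the whole repository is just a dictionary in memory.

```python
class LockingRepo:
    """Toy model of an edit-with-lock source control system (hypothetical)."""

    def __init__(self):
        self.files = {}  # filename -> last checked-in content
        self.locks = {}  # filename -> developer holding the exclusive lock

    def checkout(self, filename, developer):
        """Take the exclusive edit lock and return the current content."""
        holder = self.locks.get(filename)
        if holder is not None and holder != developer:
            # This is the point where the second developer gets stuck.
            raise RuntimeError(f"{filename} is locked by {holder}")
        self.locks[filename] = developer
        return self.files.get(filename, "")

    def checkin(self, filename, developer, content):
        """Store the new content and release the lock."""
        if self.locks.get(filename) != developer:
            raise RuntimeError(f"{developer} does not hold the lock on {filename}")
        self.files[filename] = content
        del self.locks[filename]
```

If one developer checks out a file, a second `checkout` of the same file by anyone else raises an error until the first developer checks in. Everything in the rest of this post follows from that one `raise`.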

But often, the second developer is working on a version of the code that will be released at a different time than the version the first developer is working on. Same application, just code that won't be released for an extra month. Then, each developer needs their own copy of the code. The usual way to do this is for at least one developer to create a branch, and work there. Now, when the first developer finishes his release, the code is checked in.

The second developer has a new problem. The code she started with isn't the code that's now in production. Changes made for the first release aren't in the code set for the second release. These changes need to be merged into the new set. The key point in this Best Practices paper is "Get the right person to do the merge." It's important to do this step right because it is error prone, tedious, and did i mention error prone? I've done this work. I've even been the right person for the job. I volunteered for this work because I wanted it to be done right. It wasn't that my efforts would be lost if it wasn't done right. It was that the team's progress could be lost if it wasn't done right. And no one else on the team seemed to understand the problem. And yet, the right person to do the merge isn't a person.

As early as 1993, i was using CVS. Here, the computer performs the merges. It's fast and reliable. If the merge detects a conflict, it notates this in the code and allows the developer to fix it. But because of this simple change, the whole source code control flow changes. Now it's like this:

  • check out the code
  • edit the code, and test it
  • update the local copy with changes from the repository
  • check the code back in (but leave this new copy out).


Now, the only time that the source code is locked is for a few seconds while the code is being checked back in. Since the code is nearly never locked, any developer can edit whatever they want whenever they want to. The merge happens during the update.
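Here is a rough sketch of that flow, again as a toy model and not CVS's actual implementation. `MergingRepo` and its methods are hypothetical names. The one rule it enforces is that a check-in is refused when the working copy is based on an older revision than the repository's latest, which is what forces the update step.

```python
class MergingRepo:
    """Toy model of a lock-free repository (hypothetical)."""

    def __init__(self, content):
        # Revision history; the list index is the revision number.
        self.revisions = [content]

    def latest(self):
        return len(self.revisions) - 1

    def checkout(self):
        """Return (revision number, content). No lock is taken."""
        return self.latest(), self.revisions[-1]

    def update(self):
        """Fetch the latest revision so the developer can merge it locally."""
        return self.latest(), self.revisions[-1]

    def checkin(self, base_rev, content):
        """Accept the change only if it was made against the latest revision."""
        if base_rev != self.latest():
            raise RuntimeError("working copy is out of date; update first")
        self.revisions.append(content)
        return self.latest()
```

Two developers can both `checkout` at the same time. Whoever checks in second gets refused, runs `update`, merges the other change into the local copy, and checks in against the new revision. Nobody ever waits on anybody.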

A merge conflict happens when two developers change the same line of code. When that happens, the two versions of the code are marked in the updated file. The developer edits the file and decides whether one version, the other, or some new code is needed to resolve the conflict.
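A minimal sketch of the idea follows. It is a simplification and not the algorithm CVS uses: it assumes edits only change lines in place, so all three versions have the same number of lines. `merge_lines` is a made-up name, and the markers only imitate the style of the ones CVS writes into a file.

```python
def merge_lines(base, mine, theirs):
    """Three-way merge of three equal-length lists of lines.

    base   -- the common ancestor both developers started from
    mine   -- my edited copy
    theirs -- the copy now in the repository
    Returns (merged_lines, had_conflict).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged = []
    had_conflict = False
    for b, m, t in zip(base, mine, theirs):
        if m == t:
            merged.append(m)  # nobody changed it, or both made the same change
        elif m == b:
            merged.append(t)  # only the other developer changed this line
        elif t == b:
            merged.append(m)  # only i changed this line
        else:
            # Both changed the same line differently: mark it for a human.
            had_conflict = True
            merged += ["<<<<<<< mine", m, "=======", t, ">>>>>>> theirs"]
    return merged, had_conflict
```

Changes to different lines go through untouched. Only the last branch, where both sides changed the same line in different ways, needs a person.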

One might think that since updates are performed all the time, developers would be constantly fixing merge conflicts. This is not the case. If two developers change the same line of code, it usually means that they are working on the same problem. If they aren't working on the same problem, there are seldom any conflicts.

So, in a multiple release system, branching is still needed. But the merging process can be automated. And it's long past the time when we should need to do this by hand.

So, this is old news, right? Why rehash it now? Because vendors still sell obsolete software. For example, i work for a company now that uses Serena's Dimensions product. This product, based on PVCS, is an old style edit-with-lock and manual-merge system. It has some nifty work flow stuff layered on top, but the hard problems are still hard, and color the way any work is done. To wit, our current release is projected to be at least two weeks late because it was noticed too late that a merge would have to be made. (Sorry, Serena's site seems to be dehanced for Firefox with a pointless entry gate screen. Entry gates went out in about 1995.)

To be fair, i have no idea how Serena thinks that an issue like "Development processes involve time-consuming and unreliable manual hand-offs" can be solved with their product. As near as i can tell, if you want overlapping releases, you are stuck with manual merge.

So who bought into this system? Why aren't we using CVS or SVN, which are free? Did they consult with senior developers? I wasn't consulted. These are Best Practices. It must be some definition of Best that i've not heard.
