Sunday, December 4, 2011

Branching and Merging

Nothing endures but change.

Heraclitus is usually remembered for saying that you can't step into the same stream twice, because the only thing real is change, not rivers or even mountains. Since most code is in about as much flux as the mighty Mississippi, I've often preferred the metaphor of streams to that of branches when thinking about version control. But really there's no perfect metaphor. What best describes a constantly growing and shrinking, branching and merging code base with information going in two directions and which can be frozen at any point?

I've always thought version control was one of the more interesting aspects of software development. Many people don't have a very good grasp of version control. It's another one of those key things they don't teach in school. But, truth be told, there's always something more to learn. Recently I discovered that there are as many ways to organize the branching structure of a repository as there are repositories. I had become accustomed to tying branches and tags (or streams and snapshots) to releases. A great MSDN article describes some of the more popular models.

First, of course, is the branch per release model. It's the model most used by software companies that have to support older versions of code while developing new ones. If you have some customers on 9.3 and some on 2.4, you have to be able to recreate older environments for debugging purposes. You can try to force all your customers to upgrade at the same time, but, given the different testing requirements they all might have, that's pretty unrealistic.

Most interesting of the new models I learned about was the branch per environment model. This makes a lot of sense if you're supporting in-house application code. It allows you to track what's going on in your different environments at any time, and it helps to keep your main (development) stream uncluttered. It can still get a bit messy when you have a lot of different development projects going on at the same time and with uncertain release dates. If your development is not fairly linear, this model may not be easy to implement.
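To make the promotion idea concrete, here is a minimal sketch in Python. It is a toy model, not a real version control tool: branches are plain lists of change labels, and the environment names (development, test, production) and the `promote` helper are my own assumptions for illustration.

```python
# Toy model of branch per environment. Branches are lists of change labels.
# The environment names and promote() are hypothetical, chosen for illustration.
PROMOTION_ORDER = ["development", "test", "production"]


def promote(branches, source):
    """Copy every change on `source` to the next environment branch in line."""
    index = PROMOTION_ORDER.index(source)
    if index + 1 >= len(PROMOTION_ORDER):
        raise ValueError(f"{source} is the last environment")
    target = PROMOTION_ORDER[index + 1]
    for change in branches[source]:
        if change not in branches[target]:
            branches[target].append(change)
    return target


branches = {name: [] for name in PROMOTION_ORDER}

# Work lands on the development branch first.
branches["development"] += ["feature A", "feature B"]

# Promote development to test; test now matches development.
promote(branches, "development")

# Development keeps moving while testing is under way.
branches["development"].append("feature C")

# Promote test to production; production gets only what was tested.
promote(branches, "test")

for name in PROMOTION_ORDER:
    print(name, branches[name])
```

At any point, each branch tells you exactly what is deployed in that environment, which is the main appeal of the model.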

To deal with multiple asynchronous development projects, you could do a branch per task. It's interesting to note that in this model branches have short lifespans, unlike in the previous two models, where they live in perpetuity. If 1.3 happens to go out before 1.2, you just have to merge 1.3 into the trunk and then bring the 1.2 code up to date from the trunk. I could see this being used in a relatively simple in-house shop, but definitely not at a software company.
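The 1.2 and 1.3 scenario can be sketched in a few lines of Python. This is a toy model under stated assumptions: I read 1.2 and 1.3 as two task branches cut from the same trunk, branches are plain lists of change labels, and the `merge` helper is hypothetical rather than any real tool's behavior.

```python
# Toy model of branch per task. A branch is an ordered list of change labels.
# merge() is a hypothetical helper, not how a real tool stores or merges history.


def merge(target, source):
    """Return target plus any changes from source that it does not already have."""
    return target + [change for change in source if change not in target]


trunk = ["base"]

# Two task branches start from the same trunk state.
task_1_2 = trunk + ["1.2 work"]
task_1_3 = trunk + ["1.3 work"]

# 1.3 ships first: merge it into the trunk. Its branch can now be retired.
trunk = merge(trunk, task_1_3)

# 1.2 is still open, so bring it up to date with the trunk.
task_1_2 = merge(task_1_2, trunk)

# Later, 1.2 ships too and is merged into the trunk.
trunk = merge(trunk, task_1_2)

print("trunk:", trunk)
print("1.2 branch:", task_1_2)
```

The order in which tasks ship doesn't matter much, as long as every open branch picks up what has already landed in the trunk before it merges back.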

Another option is having a branch per component. In this case, your core architecture is stable and your components do not bleed together. This might work if you have a waterfall-type SDLC in its initial stages and you want to isolate various web services, for example. If you want to avoid integration hell, however, this might not be the way to go.

Finally, you might have a branch per technology if you support multiple platforms. You would have your core code, which you could merge into all your different phone, gaming, or OS platforms. How well this strategy would work probably depends on the amount of overlap possible in the code of the different platforms.

There is no one best way to organize your repository. These models are just that: models. A combination of two or three might work best for you. For example, you might have a branch per feature in your development stream, but then have fairly simple streams for your other environments. You could have branches for each technology, and then branches for each component within those branches. The possibilities are endless, but an overly complex system is probably more trouble than it's worth. With too few branches, you'll have a difficult time developing. With too many, you'll spend all your time integrating. Find a happy medium between a palm tree and a hedge labyrinth.

-The MSDN article
