Monday, April 15, 2013

Why modern version control stinks

I have a very tech-savvy brother who is not a coder, yet I am absolutely certain he could pick up most of the tools we software developers use. That is, with one exception: version control systems. Why do I believe I can teach my brother about complex IDEs but not version control? Because modern version control stinks!

I often think about what a version control system designed by Steve Jobs would look like. I don't know the answer, but I am 100% certain it would not look like Git, Mercurial, or Subversion. Linus Torvalds is undoubtedly a genius, but he was the wrong person to design the most used version control system in the world. Here are some thoughts on why Git is hard to use.

Git, Mercurial, and Subversion were all developed within the last ten years or so. Even though they were developed in the 21st century, they seem to be designed with 1980s constraints. They are all optimized to be very disk efficient: they minimize disk usage by recording only raw differences in files, at the cost of not recording more useful data about the programming process. Hard disks are tremendously inexpensive these days and are moving toward obsolescence. So, why don't we fill our cheap hard disks with more information about the programming process?

What information should we be recording about how a developer works with a set of files? Well, how about all of it? Because modern version control systems don't record this information, one cannot:
- Write good commit messages
When writing a commit message one cannot review all the changes that were made since the last commit. Without being able to see all the changes one often can't recall all the interesting twists and turns that have happened in the coding session. More importantly, one cannot write down the reasons why some decisions were made. We remember some of these reasons for a certain amount of time but it would be nice to have a record of low level decisions for anyone who has to maintain the code later.

How often have you found yourself paralyzed by trying to come up with a short description of all the changes that took place since your last commit? It can be as hard to describe the changes as it was to write the code itself. I believe this is the cause of most of the extremely short commit messages that provide little or no value. If we could review our work, I think we would write better commit messages.

- Visualize changes easily
The tool 'diff' shows the raw differences between two files. However, it does not show the order in which the changes were made and does not show any context about the changes. The 'context' that I am talking about is the set of reasons why the changes were made. If one were able to see the order in which the changes were made and have a narrative to go along with those changes, one might have a better sense of how the pieces fit together.

- Tell stories about the system
Commit messages could help us understand the changes in our systems if we wrote good ones. Unfortunately, most of us do not. So, if we don't have meaningful commit messages and we can't visualize how the changes were made, it is unlikely that someone will easily understand the evolution of a piece of software.

Imagine that one could visualize the changes as they were made, and during this visualization a developer could pause the playback of the code and write a message about why they made a decision. Now imagine if a series of these messages could be tied together and linked to the code being played back. One could then review one's work at a natural stopping point and tell a story about the latest changes. If these stories could be searched directly from the code, for example by highlighting some code and seeing what stories include that code, then someone responsible for maintaining code could get inside the mind of the original developer and understand their thought processes. This is valuable information that doesn't get written down anywhere else.

- Teach your colleagues
With the ability to record the reasons why we do things, linked to the code but kept out of code comments, developers can more easily become teachers. Working on a computer is such a solitary activity that it is hard for developers to learn from each other. You might be sitting three feet away from a great programmer and never learn anything from them. We need a way to open up the programming process so that we can learn from each other.

Each one of these deficiencies limits the amount of knowledge one can acquire by working with legacy code. Each can be remedied if we rethink how version control systems can help us understand how our code has evolved. The version control system I am developing records every single keystroke, delete, copy and paste, and file operation. This information can be used to replay development sessions. A developer can replay certain parts of how their system has evolved and tell stories about it.
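To make the replay idea a little more concrete, here is a minimal sketch in Python. It is only an illustration of the general technique of replaying a log of edit events, not the actual design of the system described above. The `EditEvent` record, its fields, and the `replay` function are all hypothetical names invented for this example, and it handles only inserts and deletes in a single text buffer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EditEvent:
    """One recorded editing action (hypothetical format, for illustration only)."""
    kind: str                   # "insert" or "delete"
    position: int               # character offset in the buffer
    text: str = ""              # text added, used by "insert"
    length: int = 0             # number of characters removed, used by "delete"
    note: Optional[str] = None  # optional story attached to this point in time


def apply_event(buffer: str, event: EditEvent) -> str:
    """Return the buffer contents after applying a single event."""
    if event.kind == "insert":
        return buffer[:event.position] + event.text + buffer[event.position:]
    if event.kind == "delete":
        return buffer[:event.position] + buffer[event.position + event.length:]
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events: List[EditEvent], upto: Optional[int] = None) -> str:
    """Rebuild the buffer by replaying the first `upto` events (all if None)."""
    buffer = ""
    for event in events[:upto]:
        buffer = apply_event(buffer, event)
    return buffer


# A tiny recorded session: type a line, extend it, then change your mind.
session = [
    EditEvent("insert", 0, text="x = 1"),
    EditEvent("insert", 5, text="0"),
    EditEvent("delete", 4, length=2,
              note="10 was a guess; replaced with the measured value"),
    EditEvent("insert", 4, text="42"),
]

if __name__ == "__main__":
    # Step through the session and show the buffer after each event.
    for step in range(len(session) + 1):
        print(step, repr(replay(session, step)))
```

Pausing playback is then just a matter of choosing where to stop (`replay(session, 2)` gives the buffer as it stood after the second event), and the `note` field is one possible place to hang the kind of story that a diff cannot carry.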
