What is version control? Why is it important for due diligence?

by Stuart Yeates on 1 January 2005

Introduction

A version control system (also known as a Revision Control System) is a repository of files, often the files for the source code of computer programs, with monitored access. Every change made to the source is tracked, along with who made the change, why they made it, and references to problems fixed, or enhancements introduced, by the change.

Version control systems are essential for any form of distributed, collaborative development. Whether it is the history of a wiki page or a large software development project, the ability to track each change as it was made, and to reverse changes when necessary, can make all the difference between a well-managed, controlled process and an uncontrolled ‘first come, first served’ system. Version control can also serve as a mechanism for due diligence for software projects.

Version Tracking

Developers may wish to compare today’s version of some software with yesterday’s version or last year’s version. Since version control systems keep track of every version of the software, this becomes a straightforward task. Knowing the what, who, and when of changes will help with comparing the performance of particular versions, working out when bugs were introduced (or fixed), and so on. Any problems that arose from a change can then be followed up by an examination of who made the change and the reasons they gave for making the change.
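As an illustration of working out when a bug was introduced, the following minimal sketch searches a recorded history for the first version that shows a problem. The `Version` record, its field names and the sample history are hypothetical, and the search assumes that the oldest version is good, the newest is bad, and the bug persists once introduced. Real version control systems offer their own tooling for this kind of search.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Version:
    """One recorded change; every field name here is a hypothetical illustration."""
    revision: int
    author: str
    date: str
    message: str
    has_bug: bool  # stand-in for "the test suite fails on this version"


def first_bad_version(history: Sequence[Version],
                      is_bad: Callable[[Version], bool]) -> Version:
    """Binary search for the earliest version where is_bad is true.

    Assumes history is ordered oldest first, the oldest version is good,
    the newest is bad, and the bug persists once introduced.
    """
    low, high = 0, len(history) - 1  # history[low] is good, history[high] is bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(history[mid]):
            high = mid
        else:
            low = mid
    return history[high]


history = [
    Version(1, "alice", "2005-01-03", "Initial import", False),
    Version(2, "bob", "2005-01-10", "Add report export", False),
    Version(3, "carol", "2005-01-17", "Speed up parser", True),
    Version(4, "alice", "2005-01-24", "Update documentation", True),
    Version(5, "bob", "2005-02-01", "Fix typo in help text", True),
]

culprit = first_bad_version(history, lambda v: v.has_bug)
print(culprit.revision, culprit.author, culprit.message)
```

Because each version carries its author and log message, the result points directly to who made the change and the reason they gave for it.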

Coordinating Teams

Resource development is usually carried out by teams, either co-located or distributed. Version control is central to coordinating teams of contributors. It lets one contributor work on a copy of the resources and then release their changes back to the common core when ready. Other contributors work on their own copies of the same resources at the same time, unaffected by each other’s changes until they choose to merge or commit their changes back to the project. Any conflicts that arise (when two contributors independently change the same part of a resource) are automatically flagged when the changes are merged. Such conflicts can then be managed by the contributors.
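The way conflicts are flagged during a merge can be sketched roughly as follows. This is a deliberately simplified, line-by-line three-way merge that assumes both contributors edited lines in place without adding or removing any. The function name `merge_lines` and the sample file contents are hypothetical; real systems use more capable diff-based algorithms.

```python
from typing import List, Tuple


def merge_lines(base: List[str], mine: List[str], theirs: List[str]
                ) -> Tuple[List[str], List[int]]:
    """Line-by-line three-way merge for files with equal line counts.

    Returns the merged lines and the indices of conflicting lines.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both copies agree (or neither changed the line)
            merged.append(m)
        elif m == b:      # only the other contributor changed the line
            merged.append(t)
        elif t == b:      # only this contributor changed the line
            merged.append(m)
        else:             # both changed it differently: flag a conflict
            merged.append(b)  # keep the common version until it is resolved
            conflicts.append(i)
    return merged, conflicts


base = ["title = Report", "author = unknown", "status = draft"]
mine = ["title = Annual Report", "author = unknown", "status = review"]
theirs = ["title = Report", "author = Team A", "status = final"]

merged, conflicts = merge_lines(base, mine, theirs)
print(merged)     # non-conflicting edits from both contributors are combined
print(conflicts)  # indices of lines the contributors must resolve themselves
```

Changes to different lines are combined without intervention; only the line that both contributors altered is left for them to resolve.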

Typically in open source projects, version control systems allow anyone to read and copy the project resources, but only authenticated users, known as committers, are allowed to update source code in the repository.

Due Diligence

Many activities in business are accompanied by a responsibility to perform ‘due diligence’ checks. Precisely what these checks entail will depend on the business activity in question, but with regard to intellectual property one important ‘due diligence’ activity is tracking the ownership of the constituent parts of a work. For example, if someone creates a piece of software and wishes her organisation to release it, her organisation will almost certainly want to check the provenance of all the code within the software. This process is facilitated by the ability to track who made which changes to the code, and when they were made. A version control system enables a list of contributors to be compiled and the dates of their contributions to be ascertained. Such a list can be easily cross-checked with a list of IP contracts.
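The cross-check described above might look something like the following sketch. It assumes the change log has already been exported as (author, date) pairs and that the organisation keeps a set of names with IP agreements on file. The data, the function names and the log format are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical log entries: (author, ISO date of the change).
LogEntry = Tuple[str, str]


def contribution_periods(log: Iterable[LogEntry]) -> Dict[str, Tuple[str, str]]:
    """Map each author to the dates of their first and last recorded change."""
    periods: Dict[str, Tuple[str, str]] = {}
    for author, date in log:
        first, last = periods.get(author, (date, date))
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        periods[author] = (min(first, date), max(last, date))
    return periods


def missing_agreements(periods: Dict[str, Tuple[str, str]],
                       agreements: Set[str]) -> List[str]:
    """Return, in sorted order, contributors with no IP agreement on file."""
    return sorted(author for author in periods if author not in agreements)


log = [
    ("alice", "2005-01-03"),
    ("bob", "2005-01-10"),
    ("alice", "2005-02-14"),
    ("dave", "2005-03-01"),
]
agreements = {"alice", "bob"}  # hypothetical list of signed IP contracts

periods = contribution_periods(log)
unchecked = missing_agreements(periods, agreements)
print(periods["alice"])
print(unchecked)
```

Any name in `unchecked` is a contributor whose code would need its provenance confirmed before release.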

Open development involves contributors making small regular changes to resources. A version control system provides a means for monitoring those changes as they occur. Automated systems will notify those responsible for managing the IP in project outputs. These notifications, coupled with the logs provided for each individual modification, allow project managers to monitor and trace all contributions.

Open development demands care concerning the provenance of the contributions. Open development projects need to follow best practice in this area. If an IP infringement is found to have occurred, the version control system can be used to determine the extent of the contamination (which files were affected by the problematic change), who performed the change and when they performed it. A version control system can even be used to recover the last uncontaminated version of the software.
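As a rough sketch of such an investigation, the code below looks up a problematic change in a recorded history and reports the files it touched, who made it and when, and the last revision before it. The history format, the file names and the `assess` function are hypothetical. The sketch reports only files touched directly by the change; later work built on the affected code would need separate review.

```python
from typing import Any, Dict, List

# Hypothetical history, oldest first: each change records its revision number,
# its author, its date and the files it touched.
history: List[Dict[str, Any]] = [
    {"rev": 1, "author": "alice", "date": "2005-01-03", "files": {"core.c", "util.c"}},
    {"rev": 2, "author": "bob", "date": "2005-01-10", "files": {"util.c"}},
    {"rev": 3, "author": "eve", "date": "2005-01-17", "files": {"codec.c", "util.c"}},
    {"rev": 4, "author": "alice", "date": "2005-01-24", "files": {"core.c"}},
]


def assess(history: List[Dict[str, Any]], bad_rev: int) -> Dict[str, Any]:
    """Summarise a problematic change and identify the revision just before it."""
    for index, change in enumerate(history):
        if change["rev"] == bad_rev:
            return {
                "affected_files": sorted(change["files"]),
                "author": change["author"],
                "date": change["date"],
                # None means the very first revision was the problematic one.
                "last_clean_rev": history[index - 1]["rev"] if index > 0 else None,
            }
    raise KeyError(f"revision {bad_rev} not found in history")


result = assess(history, bad_rev=3)
print(result)
```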

Version control systems can also be used to establish precedence, when there is a dispute regarding the ownership of code or ideas.

Examples

Version control has been closely studied and understood in the software engineering community for a long time. The solutions are stable, robust and well-supported. There are various systems suitable for small local teams and for large distributed teams, making them ideal for coordinating software development, and for mitigating differences in culture and timezone.

Version control is provided at sites such as GitHub, SourceForge and Google Code. These sites typically build a suite of services around version control: archiving, release downloads, mailing lists, bug trackers, web hosting and build farms. This range of functionality makes them particularly attractive for those projects that do not have the resources to maintain their own server for version control.

CVS used to be the most widely used open source version control system, but Subversion and Git have since overtaken it and are now commonly used in open source projects. The basic capabilities of these systems are very similar, but they offer different security, networking and abstraction functionality, and different licences. There are also many proprietary solutions available from a range of suppliers.

As previously discussed, version control is a valuable tool for record-keeping and for analysis carried out for legal purposes. These topics are discussed in Open Source Development - An Introduction to Ownership and Licensing Issues.
