File versioning
Comic: Jorge Cham. "notFinal.doc"
Version control is a method for tracking changes to a file or set of files over time so that you can recall earlier versions later.

  • Version control records changes (additions, deletions, replacements) of individual files, tracks updates, and allows branching of projects that may be later integrated into the parent project.
  • File versioning can be as simple as using file naming conventions with suffixes such as *_v1, *_v2, *_vn, or you could use version control software (VCS).
  • Version control software allows multiple people on a team to work together on the same project at the same time. It manages changes to all types of text-based files, like scripts and web pages, as well as some proprietary digital formats.
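
The suffix convention mentioned above can be automated. The following is a minimal Python sketch, not a prescribed tool: the function name next_version_name and the exact "_vN" pattern are assumptions chosen for illustration. It adds "_v1" to an unversioned file name, or increments the number if a "_vN" suffix is already present.

```python
import re
from pathlib import Path

# Matches a trailing "_v<number>" on a file stem, e.g. "report_v3".
# The "_vN" pattern is an assumed convention for this sketch.
VERSION_SUFFIX = re.compile(r"^(?P<base>.+)_v(?P<num>\d+)$")


def next_version_name(filename: str) -> str:
    """Return the file name with its _vN suffix incremented (or _v1 added)."""
    path = Path(filename)
    match = VERSION_SUFFIX.match(path.stem)
    if match:
        # Already versioned: keep the base name and bump the number.
        base = match.group("base")
        num = int(match.group("num")) + 1
    else:
        # No version suffix yet: start at version 1.
        base = path.stem
        num = 1
    return str(path.with_name(f"{base}_v{num}{path.suffix}"))


if __name__ == "__main__":
    print(next_version_name("thesis.doc"))
    print(next_version_name("thesis_v9.doc"))
```

Manual naming like this works for a single author and a handful of files, but it records neither who made a change nor why, which is where version control software becomes useful.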

Advantages to Version Control Software

There are three advantages to version control software:

  1. Infinite undos: VCS allows users to revert to a previous version of a file or file set to restore or recover a previous state of a program, application, or website. This allows you to restore accidentally deleted or overwritten files in a project, or to restore file sets from a previously saved version.
  2. Branching and experimentation: This is powerful because it allows you to test new features in programming code or branch out on a different path without affecting your collaborators' work in the production code. Later, your branch can be merged into production, saved, or discarded as needed.
  3. Collaboration: Collaborators can work locally on the file while the VCS handles the tasks of merging changes and keeping the files and directory trees in sync. The VCS also promotes accountability, tracking who's made which changes and when, so any questions about the changes can be followed up with the appropriate person.
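
To make the "undo" and accountability ideas above concrete, here is a toy, in-memory Python sketch of a single file's history. It is an illustration under stated assumptions only: the class names Revision and FileHistory and their methods are hypothetical, and this is not how Git or any other VCS is implemented. Each saved revision records its content, author, message, and time, and reverting adds a new revision rather than erasing history.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass
class Revision:
    """One saved state of the file, with who saved it and when."""
    content: str
    author: str
    message: str
    timestamp: datetime


@dataclass
class FileHistory:
    """Toy history of one text file (hypothetical, for illustration only)."""
    revisions: List[Revision] = field(default_factory=list)

    def commit(self, content: str, author: str, message: str) -> int:
        """Record a new revision and return its 1-based version number."""
        self.revisions.append(
            Revision(content, author, message, datetime.now(timezone.utc))
        )
        return len(self.revisions)

    def current(self) -> str:
        """Return the latest content (raises IndexError if nothing is saved)."""
        return self.revisions[-1].content

    def revert_to(self, version: int, author: str) -> int:
        """Restore an older version by saving its content as a new revision."""
        if not 1 <= version <= len(self.revisions):
            raise ValueError(f"no such version: {version}")
        old = self.revisions[version - 1]
        return self.commit(old.content, author, f"Revert to v{version}")

    def log(self) -> List[Tuple[int, str, str]]:
        """List (version, author, message) for every revision, oldest first."""
        return [
            (number, rev.author, rev.message)
            for number, rev in enumerate(self.revisions, start=1)
        ]


if __name__ == "__main__":
    history = FileHistory()
    history.commit("draft", "alice", "first draft")
    history.commit("draft + intro", "bob", "add intro")
    history.revert_to(1, "alice")
    for entry in history.log():
        print(entry)
```

Because a revert is stored as one more revision, the intermediate work by other collaborators stays in the log and can itself be restored later.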
     

Software Tools

Below are some version control software and file sharing tools used at Harvard. It is important to note the following:

  1. Harvard does not necessarily host each of these products. At the close of a project, be sure to retain a copy of the file sets so that they can be saved in the Harvard or sponsored research archive; a copy of the files must be provided to Harvard.
  2. If your files contain any proprietary or confidential information, you must store these in a Harvard Security Level 4 or 5 certified environment.
  • Git

    Git is a distributed version control tool that can manage a development project's source code history. Git is the most common and widely accepted version control software, which you can run locally on your computer. Learn more about Git.

  • GitHub

    GitHub is a web-based service for Git repositories (i.e., groups of tracked files). GitHub is commonly used for managing and sharing different versions of code for programming projects, but it can be used just as effectively for version control of other types of files, such as text documents. GitHub has a huge open-source community. Get started with GitHub.

  • GitLab

    GitLab is open source software that provides a Git repository hosting service and collaborative revision control. GitLab has project management, issue tracking, and free private repository hosting. See more about GitLab.

  • Bitbucket

    Bitbucket is a web-based version control repository hosting service owned by Atlassian, for source code and development projects that use Git. Bitbucket tends to have mostly enterprise and business users. Learn more about Bitbucket.

  • Subversion

    Apache Subversion is a server-client software versioning and revision control system. Software developers use Subversion to maintain current and historical versions of files such as source code, web pages, and documentation. Get started with Apache Subversion.

File Sharing Platforms

  • Dropbox

    Dropbox is a file hosting service that offers cloud storage, file synchronization, personal cloud, and client software. Dropbox can recover lost files and restore older versions of files. Good for up to level 3 data, external/internal file sharing, co-authoring, and version control.

    Available as consumer Dropbox for SPH and Dropbox for Business for HMS.

  • OneDrive

    Microsoft 365 OneDrive provides personal file storage for individual workspace productivity, as well as organizational file storage for managing departmental document libraries and files. Good for up to level 3 data, external/internal file sharing, co-authoring, and version control.

    See the HUIT Service Catalog for more on OneDrive and SharePoint.

  • Google Drive

    Google Drive is a file storage and synchronization service that allows users to store files on Google's servers, synchronize files across devices, and share files. Google Drive tracks each revision to a file with built-in version tracking and lets you return to earlier file versions. Google Apps for Harvard are supported at some schools.

  • Open Science Framework

    Open Science Framework (OSF) provides free and open source project management support for researchers across the entire research lifecycle. As a flexible repository, it can store and archive research data, protocols, and materials. OSF has built-in version control and retains all copies of a file added to OSF, and further provides access to versions of files stored on third-party storage providers. Get started with Open Science Framework.