Revision Control Systems

or How To Avoid That "Oh No!" Moment

or It'll All End In Tears

or How I Learned To Stop Worrying and Love The Server


It happens to everyone sooner or later. An accidental rm -fr *, the Click Of Death, or just a change to your code that means you can’t recreate your results from your paper a month down the line. If it hasn’t happened to you yet, consider yourself lucky. For the rest of you, I feel your pain. I’ve had all three of those happen to me. It sucks, but it doesn’t have to happen again (or ever).

One of the nice parts about the rise of Open Source Software is the rise of pretty good distributed version control systems. These things are great: they let you back up your data, code and papers to a remote machine. They also let you work with more than one person. They even let you “tag” a particular project state so you can get back to where you were when you wrote that paper last year. They’re also handy if you work on more than one machine (say, stat and your laptop) and want to keep things synchronized.

Version Control Software


There are a number of different version control projects out there. The most venerable are sccs and rcs, but we won’t really talk about them because they don’t allow for backup to a remote server. There are more “modern,” truly distributed VCSs like darcs and arch that eliminate the need for a central repository, but they’re not widely available, so it takes some work to set things up. There are also a number of commercial solutions such as Microsoft’s SourceSafe, BitKeeper (until recently used by the Linux kernel), and Perforce, but they obviously cost money. The most commonly used are CVS and Subversion (SVN); the latter is nicer than CVS in many respects, but a little harder to configure.

These last two are the ones I’ll be talking about for the rest of this document. Presently I use both, though I am transitioning entirely to Subversion because it allows me a lot more flexibility of management. I’ll mostly be describing a variation of the setup I use to maintain my research projects on three different computing systems (our group server, stat and my personal laptop), which has also managed pretty much all of my class projects over the years.

Using stat as a Subversion (SVN) Repository


Setting up SVN Repositories on stat is a bit more work, but I think it’ll be worth it in the end since it does allow a lot more flexibility (things like renaming files in CVS can be a real pain). I also find that it’s harder to put SVN repositories into undefined states, so they’re less of a headache to unravel. Alright, let’s get this show on the road!

Step 1. Creating your first Repository

If you read the Subversion documentation or the online book, you’ll notice that they refer to storing multiple projects in a single repository (essentially the “CVS Way”) or one project per repository. I prefer the latter because it keeps the revision streams separate and makes it easier to move things around in the future. Since creating new repositories is easy, the only downside is a small amount of disk space relative to the size of the project.

To begin, we’ll start by creating a root directory for all of our repositories. I like svn, but you can use whatever you want. Then we’ll change into this directory.

ellis@stat ~ $ mkdir svn
ellis@stat ~ $ cd svn
ellis@stat ~/svn $

Now that we’re in our new directory, we’ll make a project for ourselves. In this case I’ll be assuming that this is a brand-new project called stat214, which will hold all of my homework assignments and final project for Stat 214. We do this using the svnadmin command, which is already available on stat. Later we’ll discuss how to move an existing project into SVN for safekeeping, but these initial steps are always the same.

ellis@stat ~/svn $ svnadmin create stat214

That’s, uh, pretty much all there is to it.
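
If you want to convince yourself that something actually happened, a quick sketch like the one below creates a repository and peeks inside it. It uses a throwaway directory from mktemp instead of ~/svn (my assumption, so nothing real gets touched) and assumes svnadmin and svnlook are on your PATH.

```shell
# Scratch location standing in for ~/svn (an assumption for this sketch).
REPO_ROOT=$(mktemp -d)
cd "$REPO_ROOT"

# Same command as above, run from inside the repository root.
svnadmin create stat214

# A fresh repository is just a directory of bookkeeping files.
ls stat214

# svnlook reports the youngest (latest) revision; a new repository is at 0.
svnlook youngest stat214
```

The repository directory is not something you edit by hand; all real work happens in a working directory, which comes next.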

Subversion, like most version control systems in common use, uses the idea of a “working directory.” This is a “checked out” copy of the code (much like a book from the library) that the user (you!) occasionally “checks in.” A check-in compares the old and new versions of the files and records the differences between the two in the repository using, typically, the diff algorithm. Things get a bit more interesting when there are multiple users and someone checks in a new version of a file before you have checked in your own modified version. If the same lines haven’t been modified, things are pretty easy; otherwise Subversion might require what is called “conflict resolution.” As you might expect, this involves cannons at noon and that sort of thing.

Step 2. Creating a Working Directory

Step 2a. Creating A Skeletal Project

Now that we have a repository we need to add some things to it before we can actually create a checked out copy. There are several ways to organize your repository, but in the interests of space and simplicity I’m only going to show you how I arrange things.

This part is actually the same for all projects that I create so, because I’m a lazy lazy man, I’ve actually created a skeleton project that I use to do the initial population of all new projects. It lives in Projects/SVNSkeleton so we’ll go there now.

ellis@stat ~ $ cd Projects/SVNSkeleton
ellis@stat ~/Projects/SVNSkeleton $ ls
branches  tags  trunk

To create your own version of this skeletal project, simply create the branches, tags and trunk directories:

ellis@stat ~ $ cd Projects
ellis@stat ~/Projects $ mkdir SVNSkeleton
ellis@stat ~/Projects $ cd SVNSkeleton/
ellis@stat ~/Projects/SVNSkeleton $ mkdir trunk
ellis@stat ~/Projects/SVNSkeleton $ mkdir branches
ellis@stat ~/Projects/SVNSkeleton $ mkdir tags

Now you’ll never need to do this for a new project ever again! Isn’t that fantastic? I thought so. The clever and the very lazy will now be realizing that they can create several skeletal projects by putting more things in the trunk directory. I have a skeletal R project, a skeletal LaTeX project and others. Yes, I’m that lazy.
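
If even four commands feel like too much typing, the same skeleton can be built in one go. This is only a sketch: it builds the skeleton in a scratch directory from mktemp rather than in ~/Projects, which is my assumption so the example cannot clobber anything.

```shell
# Scratch directory standing in for ~/Projects (an assumption for this sketch).
PROJECTS=$(mktemp -d)

# -p creates missing parent directories, so one command builds the whole skeleton.
mkdir -p "$PROJECTS/SVNSkeleton/trunk" \
         "$PROJECTS/SVNSkeleton/branches" \
         "$PROJECTS/SVNSkeleton/tags"

ls "$PROJECTS/SVNSkeleton"
```

To use it for real, set PROJECTS to your own Projects directory instead of the scratch one.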

Step 2b. Import the Skeletal Project

Now, make sure you are in your skeletal project directory and import those three directories into the stat214 project you created earlier. Remember, if you cut and paste, that you need to change the directory! Trying to write into MY stat214 project won’t work very well!

Importing projects (new and old) is done with the import subcommand of the svn program (which is used for pretty much everything except creating a repository). The next parameter is the URL of the repository itself. Since our repository resides on stat instead of a remote machine we use the file: protocol. Later, we’ll talk about using SVN from a remote machine like your laptop, where we’ll be using different protocols.

Finally we use the -m option to add a Log Message (m is for message!). Most operations that change the repository require some sort of message, so if you leave this out SVN will automatically put you in a text editor (pico/nano on stat unless you’ve changed the EDITOR environment variable) to enter a message. You can put something nonsensical here, but it’s generally a good idea to have good log messages so you can remember why you made the changes you did.

ellis@stat ~/Projects/SVNSkeleton $ svn import file:///home/ellis/svn/stat214 -m "Initialize Project" 
Adding         trunk
Adding         branches
Adding         tags

Committed revision 1.

There we go. Almost done!
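
To double-check that the import landed where you expected, you can ask the repository what it holds. The sketch below replays the steps so far in a scratch directory (my assumption; the paths under SCRATCH stand in for ~/svn and ~/Projects/SVNSkeleton) and then lists the top level of the repository. It assumes svn and svnadmin are on your PATH.

```shell
# Scratch directories standing in for ~/svn and ~/Projects/SVNSkeleton
# (assumptions for this sketch, so nothing real gets touched).
SCRATCH=$(mktemp -d)
mkdir -p "$SCRATCH/svn" \
         "$SCRATCH/SVNSkeleton/trunk" \
         "$SCRATCH/SVNSkeleton/branches" \
         "$SCRATCH/SVNSkeleton/tags"
svnadmin create "$SCRATCH/svn/stat214"

# With no path argument, import takes the current directory (the skeleton).
cd "$SCRATCH/SVNSkeleton"
svn import "file://$SCRATCH/svn/stat214" -m "Initialize Project"

# List what the repository now holds at its top level.
svn list "file://$SCRATCH/svn/stat214"
```

Note that import does not turn the skeleton directory into a working directory; it stays an ordinary directory you can reuse for the next project.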

Step 2c. The working directory, finally!

Okay, now that the repository is good to go, we can finally check out a working directory where we can start doing some work.

ellis@stat ~/Projects/SVNSkeleton $ cd ~/Projects/
ellis@stat ~/Projects $ svn checkout file:///home/ellis/svn/stat214/trunk 
Checked out revision 1.
ellis@stat ~/Projects $ ls
SVNSkeleton  trunk

Okay, what happened? We used the checkout command and gave it the same URL as before, except that we added trunk to the end (one of the directories in our skeleton, remember?). This is because the trunk is the “main line” of code. As you get more advanced and your programming gets more sophisticated, you’ll be employing the tags and branches directories to branch code that will then be merged into the trunk, or tagged so you can find a specific version of your code (or paper, or whatever) later. For now, we want to work on the trunk.

Unfortunately, doing this seems to have created a working directory called “trunk.” How annoying! My project is called “stat214” not “trunk”! You can fix this using mv trunk stat214 or we can avoid this problem altogether.

ellis@stat ~/Projects $ rm -fr trunk 
ellis@stat ~/Projects $ svn checkout file:///home/ellis/svn/stat214/trunk stat214
Checked out revision 1.
ellis@stat ~/Projects $ ls
SVNSkeleton  stat214
ellis@stat ~/Projects $ 

Ah, much better! All done, time to make some files.
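
If you ever forget which repository a working directory came from, svn info will tell you. Here is a sketch of the named checkout in a scratch directory (my assumption; the paths under SCRATCH stand in for ~/svn and ~/Projects, and svn mkdir stands in for the skeleton import to keep the example short).

```shell
# Scratch repository with just a trunk, standing in for the imported skeleton.
SCRATCH=$(mktemp -d)
mkdir -p "$SCRATCH/svn" "$SCRATCH/Projects"
svnadmin create "$SCRATCH/svn/stat214"
svn mkdir "file://$SCRATCH/svn/stat214/trunk" -m "Initialize Project"

# Check out the trunk, naming the working directory after the project.
cd "$SCRATCH/Projects"
svn checkout "file://$SCRATCH/svn/stat214/trunk" stat214

# Show the URL and revision this working directory is tied to.
svn info stat214
```

The second argument to checkout is just a directory name, so you can call the working directory whatever you like.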

Step 3. The Edit-Commit Cycle
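
A minimal sketch of one pass through such a cycle, assuming a scratch repository and working directory like the ones built above. The file name hw1.R and its contents are made up for illustration, and svn mkdir again stands in for the skeleton import.

```shell
# Scratch repository and working directory (assumptions for this sketch).
SCRATCH=$(mktemp -d)
mkdir -p "$SCRATCH/svn" "$SCRATCH/Projects"
svnadmin create "$SCRATCH/svn/stat214"
svn mkdir "file://$SCRATCH/svn/stat214/trunk" -m "Initialize Project"
svn checkout "file://$SCRATCH/svn/stat214/trunk" "$SCRATCH/Projects/stat214"
cd "$SCRATCH/Projects/stat214"

# Edit: create a file (hypothetical name) and tell SVN to track it.
echo 'x <- rnorm(100)' > hw1.R
svn add hw1.R
svn status                      # "A" marks a file scheduled for addition
svn commit -m "Add first homework script"

# Edit again, review what changed, then commit.
echo 'summary(x)' >> hw1.R
svn status                      # "M" marks a locally modified file
svn diff
svn commit -m "Summarize the sample"

# Bring the working directory up to date, then read back the history.
svn update
svn log
```

New files are invisible to SVN until you svn add them, which is the step people most often forget.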

 
scm.txt · Last modified: 2006/02/16 15:43 by ellis
 