Git locks

16 February 2019 · 3 minute read · technical

At work, we use git. I write code on my local machine, push it to my virtual machine for testing, and then when ready push from my virtual machine to our Github enterprise instance, from which it is deployed to production. I frequently use branches for development, though generally only push to master (aside from when I make pull requests for code review). We mostly use a single repository, but I there are a half-dozen or so less-frequently-used repositories which I like to keep up-to-date on my VM.

We try to work with an entirely linear history, which means I’ll generally git pull --rebase before pushing. This avoids merge commits.

Because I can sometimes be working on a branch for a few weeks before pushing the code to master, I like to regularly rebase to ensure I’m not generating merge conflicts. I do this by keeping origin/master as the remote for branches on my VM, and doing a pull --rebase regularly. In fact, I have a cron that pulls every 20 minutes. If something goes wrong (e.g. there’s a merge conflict), it hits anybar on my laptop to let me know I need to resolve it. This doesn’t happen that often, for which I’m grateful. I can then manually pull and rebase on my laptop, which gets the changes from my VM. I can use my VM to rewrite history as well, which I frequently do (e.g. squashing commits).

This arrangement has worked well for me for a couple of years, though every now and then I run into trouble. Most frequently this happens when I end up trying to perform two git actions at once – a classic example is performing a rebase to rewrite history while my cron is trying to pull and rebase. This results in one or both operation failing, and sometimes annoying losses of commit messages. It’s nothing disasterous, but I figured there must be a better way.

Enter flock. This acquires a lock on a file before running a given command, either failing if the lock can’t be acquired or waiting until the lock can be acquired before continuing. In this way I can avoid performing two git operations at once, instead waiting or failing as appropriate. I initially had a specific file that I would lock in my home directory, but I quickly realized it’d be better to acquire a lock on the .git directory itself. This is always present where operations can be in conflict, and it is scoped specifically to the repository in question. To do this I created ~/.locking_git file, as follows, which I source in my .bash_profile and also in any script where I want git to lock.

#!/bin/bash

set -euo pipefail

GIT=`which git`

function git_dir {
    # `git rev-parse --show-toplevel` prints the directory which contains the .git directory
    _DIR=`$GIT rev-parse --show-toplevel 2>/dev/null`
    if [ $? -eq 0 ]; then
        echo "${_DIR}/.git"
    else
        echo "/dev/null"
    fi
}

function git {
    GITDIR=`git_dir`
    if [ -d $GITDIR ]; then
        flock $GITDIR $GIT "$@"
    else
        # don't try to acquire a lock if this isn't a repository (e.g. with `git init`, or a command that will fail anyway)
        # this is important because `flock` will create the file if it doesn't already exist, which can confuse git.
        $GIT "$@"
    fi
}

Now whenever I run git in my terminal, I’m actually running the git function defined above, which acquires a lock if we’re in an existing repository. I can also use git_dir in other scripts to acquire the lock in other situations, for example when calling the script that creates a new pull request.