At one client we have a git repository that is almost 2 GB in size. (Don't ask why, it is a long story.) Among many other things, one problem this causes is that in our CI environment (using Jenkins) every build takes ages. (Cloning this repo takes anywhere from 5 to 30 minutes, depending on the weather and the alignment of the stars.)

If we had fixed Jenkins agents, it would be fast, because each agent would keep a clone of the repository and would only need to fetch the most recent changes.

However, in our situation we used Google Compute Engine instances that were started on demand.

That meant every time we started a new GCP instance it had to clone the whole repository, wasting time, bandwidth, and money.

To improve the situation, we did the following.

We created an image in GCP that included a full clone of the repository in /opt/code-maven. Then we used the reference option of git (--reference on the command line) to point every new clone at that local copy, so only the objects missing from it had to come over the network.
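
The reference mechanism described above can be tried out without Jenkins or GCP. The following is only a minimal sketch: the temporary directories and the empty demo commit are assumptions made for illustration, with "remote" standing in for the Bitbucket repository and "reference" standing in for the clone in /opt/code-maven.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing else on the machine is touched.
work=$(mktemp -d)

# Stand-in for the remote repository (Bitbucket in the article).
git init -q "$work/remote"
git -C "$work/remote" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"

# Stand-in for the full clone that is baked into the image.
git clone -q "file://$work/remote" "$work/reference"

# The clone a build would make: objects already present in the reference
# are borrowed from it instead of being transferred again.
git clone -q --reference "$work/reference" "file://$work/remote" "$work/workspace"

# Git records the borrowed object store in this file.
cat "$work/workspace/.git/objects/info/alternates"
```

The last command prints the path of the object directory inside the reference clone. The new clone depends on that directory staying in place, which is fine for short-lived build instances.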

Then, when we wanted to clone the repository we had code like this:

examples/jenkins/clone_using_reference.jenkinsfile



def checkout_from_reference(commit) {
    def reponame = 'code-maven'
    def repo_url = "git@bitbucket.org:USER/${reponame}.git"

    echo "Checkout SHA from reference $commit"
    checkout([
        $class: 'GitSCM',
        branches: [[name: commit]],
        doGenerateSubmoduleConfigurations: false,
        extensions: [
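            // Check the code out into a subdirectory named after the repository.
            [$class: 'RelativeTargetDirectory', relativeTargetDir: reponame],
            // Borrow objects from the clone that is already stored in the image.
            [$class: 'CloneOption', reference: "/opt/${reponame}"]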
        ],
        submoduleCfg: [],
        userRemoteConfigs: [
            [credentialsId: 'jenkins-git-credentials', url: repo_url]
        ]
    ])
    // just to show we are in the right commit:
    dir(reponame) {
        sh(script: "git rev-parse HEAD")
    }
}



This function accepts either a SHA-1 or the name of a branch to check out.

Internally it has two variables: reponame is the short name of the repository, and repo_url is the URL of the original repository. In this case it was on Bitbucket.

There is also jenkins-git-credentials, which is the ID of the credentials we added to Jenkins to be able to access the git repository in Bitbucket. If your git server needs authentication, you'll need to configure this manually in Jenkins.

Every now and then we had to update the image with the recent changes in the repository, so that the gap between what was already in the image and what was in the remote repository would not grow too much. We had to keep the image up to date anyway, so this was not extra work.
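
Refreshing the clone inside the image can be as simple as a fetch. The sketch below is an assumption about how such a refresh step might look, not the exact procedure we used; the temporary directories and the demo_commit helper exist only for this demonstration.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Helper for this demo only: create an empty commit with a fixed identity.
demo_commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

# Stand-ins for the remote repository and for the clone stored in the image.
git init -q "$work/remote"
demo_commit "$work/remote" "first commit"
git clone -q "file://$work/remote" "$work/reference"

# New work lands on the remote after the image was built.
demo_commit "$work/remote" "second commit"
new=$(git -C "$work/remote" rev-parse HEAD)

# The refresh step: bring the reference clone up to date with the remote.
git -C "$work/reference" fetch -q --prune origin

# The new commit is now available locally for later --reference clones.
git -C "$work/reference" cat-file -t "$new"
```

When rebuilding a real image, the same fetch would be run against /opt/code-maven before the image is saved.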