---
title: "Reducing Acquia Git Repo Size"
date: "2022-11-08T20:37:58+00:00"
summary:
image:
type: "article"
url: "/acquia-cloud-platform/help/90671-reducing-acquia-git-repo-size"
id: "ea72c71c-b399-4ff4-ab85-05c54afaf57a"
---

On long-running projects, it is common to see the Acquia Git repository grow in size. Even without pushing large files, the commits accumulated over the years can grow the repository to a point where it reduces the efficiency of the team or its processes. Various operations on local, CI, or Acquia environments require switching branches or checking out the repository. With a large repository, these operations become slower and reduce the team's responsiveness.

Most of the steps apply whatever CI/CD is used on the project. However, some steps may only apply when the development team works on an external Git repository (such as GitHub or GitLab), uses BLT to generate the deployment artifact, and pushes it to Acquia Git. More information about this workflow is available in [the BLT documentation](/node/56468).

Let's walk through some steps that can be taken to reduce the repository size and potentially prevent it from growing again.
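
Before starting, it can help to record the current size of the repository so the effect of each step can be compared afterwards. The following is a minimal sketch, assuming a local clone is available; the function name `repo_object_size` is illustrative and not part of any Acquia or BLT tooling. It relies on `git count-objects`, which reports the size of loose and packed objects.

```shell
#!/bin/sh

# Print the object-store statistics of the Git repository located at "$1"
# (defaults to the current directory).
# "repo_object_size" is a hypothetical helper name used for this example.
repo_object_size() {
  dir="${1:-.}"
  # -v prints detailed key/value lines, -H prints sizes in human-readable units.
  git -C "$dir" count-objects -vH | grep -E '^(count|size|in-pack|size-pack):'
}
```

Running `repo_object_size /path/to/clone` before and after the cleanup gives a rough comparison. Note that it measures a local clone, which may differ from what is stored on the Acquia side.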

Many projects use tags to manage production releases. This is good practice. However, without attention, the number of tags becomes huge and can increase the repository size. For a team using BLT to generate the release tag from an external repository, keeping an old tag history on Acquia Git is even less important, given that these tags can be regenerated.

Here is an example bash script that fetches the tags from the Acquia Git repository, keeps the latest 5, and deletes all the others. It must be adapted to your specific needs, your tag naming convention (here we assume the 1.1.1-build format), and your repository name.

    #!/bin/bash
    
    # The number of tags to preserve.
    nb_to_keep=5
    
    # The Acquia Git repository url.
    repo="org.app@svn-1111.enterprise-g1.hosting.acquia.com"
    
    i=0
    to_keep=""
    to_delete=""
    
    # Get the release tags and keep the last ones (using nb_to_keep).
    refs=$(git ls-remote -t --refs "$repo" | grep -o -E "refs/tags/.*$" | grep -o -E "refs/tags/[0-9]+\.[0-9]+\.[0-9]+-build$" | sort -r -t '/' -k 3 -V)
    for ref in $refs ; do
      tag=$(echo "$ref" | cut -d'/' -f3)
    
      if (( i < nb_to_keep )) ; then
        echo "Keeping $tag"
        if [ "$to_keep" = "" ] ; then
          to_keep="$tag"
        else
          to_keep+=" $tag"
        fi
        ((i++))
      fi
    done
    
    # Get all the tags and compare against the to_keep list.
    refs=$(git ls-remote -t --refs "$repo" | grep -o -E "refs/tags/.*$" | sort -r -t '/' -k 3 -V)
    for ref in $refs ; do
      tag=$(echo "$ref" | cut -d'/' -f3)
    
      # Stop on the first match so that keep_tag equals tag when it must be kept.
      for keep_tag in $to_keep ; do
        if [ "$tag" = "$keep_tag" ] ; then
          break
        fi
      done
    
      if [ "$tag" != "$keep_tag" ] ; then
        echo "Marking $tag as to be deleted"
        if [ "$to_delete" = "" ] ; then
          to_delete="$tag"
        else
          to_delete+=" $tag"
        fi
      fi
    done
    
    # Delete the identified tags after confirmation.
    if [ ! "$to_delete" = "" ] ; then
      read -p "Do you confirm the deletion of '$to_delete' tags for $repo? " yn
      echo
      if [ "$yn" = y ] ; then
        for tag in $to_delete ; do
          echo "Deleting $tag"
          git push --delete "$repo" "$tag"
        done
    
        git remote prune "$repo"
      else
        echo "Nothing deleted"
      fi
    fi
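
The selection logic of the script above can also be isolated from the Git calls, which makes it easier to try against a sample list before touching the remote. The following is a minimal sketch under the same assumptions as the script (the X.Y.Z-build naming convention, and a `sort` that supports `-V`, as GNU coreutils does). The function name `tags_to_delete` is illustrative.

```shell
#!/bin/sh

# Read tag names on stdin (one per line) and print the tags to delete:
# every tag except the newest "$1" tags matching the X.Y.Z-build format.
# "tags_to_delete" is a hypothetical helper name used for this example.
tags_to_delete() (
  keep="$1"
  all=$(cat)
  # Newest release tags first, limited to the number of tags to preserve.
  release_keep=$(printf '%s\n' "$all" \
    | grep -E '^[0-9]+\.[0-9]+\.[0-9]+-build$' \
    | sort -r -V \
    | head -n "$keep")
  # -F fixed strings, -x whole-line match, -v invert: drop the kept tags.
  printf '%s\n' "$all" | grep -F -x -v -e "$release_keep"
)
```

For example, `git ls-remote -t --refs "$repo" | sed 's#.*refs/tags/##' | tags_to_delete 5` would list the deletion candidates without deleting anything. As in the script above, tags that do not follow the release naming convention are treated as candidates too.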

Removing old tags will not drastically reduce the repository size on its own, but it is a mandatory step for the next ones to be effective.

It is common to use a lower environment to deploy a feature branch for an early demo, or to create a hotfix branch and deploy it to validate the fix before deploying to production. Many situations may lead to the creation and deployment of a temporary branch. It is also very common to simply forget about these branches once they are merged into the mainline. These branches are very unlikely to be used again and, for teams building the artifact from the external repository, restoring them would be an easy operation with BLT deploy.

To avoid spending time checking the branch list on a regular basis, it is best to script the deletion so that the cleanup can be automated. Here is an example bash script that assumes an existing script, get-deployed-branches.sh, which returns the list of deployed branches. It compares that list against the branches in the repository and deletes the branches that are not deployed. The creation of get-deployed-branches.sh is not detailed here as it may vary depending on the context ([Acquia Cloud](/node/55865), Acquia Cloud Site Factory, ...). Some default branches are also listed so that they are never deleted by mistake.

    #!/bin/bash
    
    repo="org.app@svn-111.enterprise-g1.hosting.acquia.com"
    
    # Get the branches currently deployed.
    deployed_branches=$(./get-deployed-branches.sh)
    
    # Hardcode some branches for security. These branches are supposed to always be deployed at least on one env/stack.
    deployed_branches+=" develop-build qa-build uat-build"
    
    to_delete=""
    
    refs=$(git ls-remote -h "$repo" | grep -o -E "refs/heads/.*-build$")
    
    # Build the list of branches to be deleted.
    for ref in $refs ; do
      branch=$(echo "$ref" | cut -d'/' -f3)
    
      # Stop on the first match so that deployed_branch equals branch when it is deployed.
      for deployed_branch in $deployed_branches ; do
        if [ "$branch" = "$deployed_branch" ] ; then
          break
        fi
      done
    
      if [ "$branch" = "$deployed_branch" ] ; then
        echo "Keeping $branch"
      else
        echo "Marking $branch as to be deleted"
        if [ "$to_delete" = "" ] ; then
          to_delete="$branch"
        else
          to_delete+=" $branch"
        fi
      fi
    done
    
    # Delete the identified branches after confirmation.
    if [ ! "$to_delete" = "" ] ; then
      read -p "Do you confirm the deletion of '$to_delete' branches on $repo? " yn
      echo
      if [ "$yn" = y ] ; then
        for branch in $to_delete ; do
          echo "Deleting $branch"
          git push "$repo" ":refs/heads/$branch"
        done
    
        git remote prune "$repo"
      else
        echo "Nothing deleted"
      fi
    fi
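
One detail worth noting in the script above: `cut -d'/' -f3` keeps only the third path segment, so a branch named `feature/demo-build` would be reduced to `feature`. If your branch names may contain slashes, the two helpers below are one possible alternative. This is a sketch only; the names `ref_to_branch` and `is_deployed` are illustrative and do not come from the original script.

```shell
#!/bin/sh

# Strip the "refs/heads/" prefix from a ref, keeping any slashes in the name.
# "ref_to_branch" is a hypothetical helper name used for this example.
ref_to_branch() {
  printf '%s\n' "${1#refs/heads/}"
}

# Succeed when branch "$1" appears in the whitespace-separated list "$2".
# "is_deployed" is a hypothetical helper name used for this example.
is_deployed() {
  # Normalise newlines and tabs to spaces so the list can come from a script.
  list=$(printf '%s' "$2" | tr '\n\t' '  ')
  case " $list " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Inside the loop, `branch=$(ref_to_branch "$ref")` followed by `if is_deployed "$branch" "$deployed_branches"` would replace the `cut` call and the inner `for` loop.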

For projects using an external repository, the commit history on the Acquia Git repository is not really important, yet it has a huge impact on the size. The principle is to keep only the last commit on each branch.

Here are the git commands used to reset a branch's history:

    git checkout <branch-name>
    git pull --all
    git checkout --orphan <branch-name>-tmp
    git add .
    git commit -m "Starting a fresh orphan branch for <branch-name>"
    git branch -D <branch-name>
    git branch -m <branch-name>
    git push -f origin <branch-name>
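
To get a feel for the effect of these commands before force-pushing anything, the local part of the sequence can be tried on a throwaway clone. The following is a minimal sketch that wraps the same commands in a function and deliberately leaves out the push. The function name `reset_branch_history` is illustrative, and the sketch assumes `user.name` and `user.email` are configured for the repository.

```shell
#!/bin/sh

# Replace the history of local branch "$2" in the repository at "$1" with a
# single commit holding the current files. Local only: nothing is pushed.
# "reset_branch_history" is a hypothetical helper name used for this example.
reset_branch_history() (
  cd "$1" || exit 1
  branch="$2"
  git checkout --quiet "$branch" || exit 1
  # The orphan branch starts without parents but keeps the files staged.
  git checkout --quiet --orphan "$branch-tmp" || exit 1
  git add .
  git commit --quiet -m "Starting a fresh orphan branch for $branch" || exit 1
  # Drop the old branch and give its name to the orphan branch.
  git branch --quiet -D "$branch"
  git branch -m "$branch"
)
```

After `reset_branch_history /path/to/clone develop-build`, running `git rev-list --count develop-build` in that clone should report a single commit while the working tree is unchanged.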

Ideally, this step should be done when BLT pushes the artifact during blt deploy. An [enhancement ticket](https://github.com/acquia/blt/issues/4531) has been created on the BLT repository to track this, and it may become a simple option in the blt.yml file in the future. However, the git push -f operation takes longer than a simple git push and can have an impact in some specific cases (for example, pushing to multiple repositories when there are multiple stacks on ACSF). For this reason, it may be worthwhile to script the history reset of all the branches so that it can be used in an automated process.

Here is an example bash script that resets the history of all the branches of the Acquia Git repository.

    #!/bin/bash
    
    # The Acquia Git repository url.
    repo="org.app@svn-1111.enterprise-g1.hosting.acquia.com"
    
    # Clone the repository in a temporary directory.
    rm -Rf /tmp/tmp_clone
    git clone "$repo" /tmp/tmp_clone
    cd /tmp/tmp_clone || exit
    
    # Configure Git.
    git config user.name "Github-Actions-CI"
    git config user.email "noreply@github.com"
    git config checkout.defaultRemote origin
    git config advice.detachedHead false
    
    refs=$(git ls-remote -h "$repo" | grep -o -E "refs/heads/.*-build$")
    
    # Reset the history of each branch and push the result.
    for ref in $refs ; do
      ref_name=$(echo "$ref" | cut -d '/' -f3)
    
      git checkout "$ref_name"
      git checkout --orphan "$ref_name-tmp"
      git add .
      git commit -m "Starting a fresh orphan branch for $ref_name" --quiet
      git branch -D "$ref_name"
      git branch -m "$ref_name"
    
      # Name the branch explicitly: the renamed branch has no upstream configured.
      git push -f origin "$ref_name"
    done
    
    git remote prune origin