Date Published: February 5, 2025

Removing large items from your site's Git history

The following article describes a process you can use to remove large files selectively from the history of code changes stored in your Git repository. This process is most frequently needed if large binary files (movies, images, etc.) have been committed to your code repository, and need to be removed.

Get ready

To begin, run the following commands to clone a "mirror" of your site's code repository to your local machine. This gives you a "bare" copy of the repository (one with no working tree) that contains the entire history of all refs, branches and tags since its inception. That is everything we'll need to create a backup of your repository, and to clean and replace the existing repository:

git clone --mirror [email protected]:my-repo-name.git
git --git-dir=my-repo-name.git remote update
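
As an optional sanity check before continuing, a small shell helper along these lines can confirm that the clone really is bare and report how many refs it copied. This is a minimal sketch: check_mirror is a hypothetical function name, not a Git command, and it assumes git is available on your PATH.

```shell
# Sketch: confirm that a directory is a bare repository and count its refs.
# check_mirror is a hypothetical helper name used only for illustration.
check_mirror() {
    repo_dir="$1"
    # rev-parse prints "true" when the directory is a bare repository.
    if [ "$(git --git-dir="$repo_dir" rev-parse --is-bare-repository 2>/dev/null)" != "true" ]; then
        echo "not a bare repository: $repo_dir" >&2
        return 1
    fi
    # for-each-ref lists every ref the mirror copied (branches, tags, and others).
    ref_count=$(git --git-dir="$repo_dir" for-each-ref | wc -l | tr -d ' ')
    echo "bare repository with $ref_count refs"
}

# Example call, using the directory name from the commands above:
#   check_mirror my-repo-name.git
```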

Back up your repository

Next we'll create an archive of your repo's current state using git-bundle (included with git). In the event that you need to revert the cleanup steps you're about to take, you can use this archive to do just that.

To create the archive run the following command:

git --git-dir=my-repo-name.git bundle create my-repo-name.bdl --all

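To verify the archive, ensuring that it contains a complete history of your repository, run the following command. Some versions of Git need a repository to verify a bundle against, so the command points at the mirror with --git-dir:

git --git-dir=my-repo-name.git bundle verify my-repo-name.bdl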

To restore the archive to a directory called my-repo-name in the event of an emergency, run the following command:

git clone my-repo-name.bdl my-repo-name

Now that you've got your backup bundle, store that in a safe place, and let's move on to cleaning up your repo.
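
If you'd like extra reassurance that the bundle captured every branch and tag, one possible check is to compare the refs recorded in the bundle with the refs in the mirror. The sketch below assumes the bundle was created with --all as shown above; bundle_covers_all_refs is a hypothetical helper name.

```shell
# Sketch: succeed only if the bundle lists exactly the same refs, pointing
# at the same objects, as the mirror. bundle_covers_all_refs is a
# hypothetical helper name used only for illustration.
bundle_covers_all_refs() {
    repo_dir="$1"
    bundle_file="$2"
    # "<hash> <refname>" for every ref in the mirror.
    repo_refs=$(git --git-dir="$repo_dir" for-each-ref \
        --format='%(objectname) %(refname)' | sort)
    # The bundle also records HEAD, which for-each-ref does not list.
    bundle_refs=$(git --git-dir="$repo_dir" bundle list-heads "$bundle_file" |
        grep -v ' HEAD$' | sort)
    [ "$repo_refs" = "$bundle_refs" ]
}

# Example call:
#   bundle_covers_all_refs my-repo-name.git my-repo-name.bdl && echo "bundle is complete"
```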

Shrink your repository with BFG repo cleaner

The BFG repo cleaner is a Java app designed to clean large files out of your repository's history. It's a relatively fast and powerful tool, and it is documented very well on its project website. You can follow a process like the one described below to selectively remove large files from your Git history using this tool:

  1. Download the cleaner app using the "Download" button on the BFG repo cleaner website, or download it directly from its project page on GitHub.
  2. Run the BFG repo cleaner tool on your repository. This command can be run with many options, which prune files based on various criteria. For example, the command below removes files larger than 50 MB from your repository's history:
    java -jar /path/to/bfg.jar --strip-blobs-bigger-than 50M my-repo-name.git
  3. Once the BFG repo cleaner tool has finished running, run the following command to expire old reflog entries and repackage the trimmed-down version of your repository:
    git --git-dir=my-repo-name.git reflog expire --expire=now --all && git --git-dir=my-repo-name.git gc --prune=now --aggressive
  4. You can repeat steps 2 and 3 multiple times, running the BFG cleaner with different options to whittle down the size of this repo. Use the following command to check its adjusted size as you continue working:
    du -d 1 -h my-repo-name.git
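
To help choose a size threshold for the cleaner, it can be useful to see which files in the history are actually the largest. The sketch below uses only standard Git plumbing commands (rev-list and cat-file); largest_blobs is a hypothetical helper name, and the sizes it prints are the uncompressed sizes in bytes that git cat-file reports.

```shell
# Sketch: print the largest blobs anywhere in the repository's history as
# "<size-in-bytes> <path>", biggest first. largest_blobs is a hypothetical
# helper name; the second argument (how many to show) defaults to 10.
largest_blobs() {
    repo_dir="$1"
    count="${2:-10}"
    # rev-list emits "<hash> <path>" for every object reachable from any ref;
    # cat-file looks up each object's type and size and echoes the path back.
    git --git-dir="$repo_dir" rev-list --objects --all |
        git --git-dir="$repo_dir" cat-file \
            --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob"' |
        sort -k2,2nr |
        head -n "$count" |
        cut -d' ' -f2-
}

# Example call:
#   largest_blobs my-repo-name.git 20
```

Running it again after each cleaning pass shows whether the files you expected to disappear are really gone.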

Push and propagate the cleaned repository

Finally, when you're satisfied with the size of the repository, push it back up to the remote server with the following command:

git --git-dir=my-repo-name.git push
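
Because this push overwrites branches and tags on the remote server, you may want to preview it first. The sketch below wraps git push with its standard --dry-run and --porcelain options, which report what would be updated without sending anything; preview_push is a hypothetical helper name.

```shell
# Sketch: show what the push would change on the remote without changing it.
# preview_push is a hypothetical helper name used only for illustration.
preview_push() {
    repo_dir="$1"
    # --dry-run performs every step except sending the updates;
    # --porcelain prints one machine-readable line per ref.
    git --git-dir="$repo_dir" push --dry-run --porcelain
}

# Example call:
#   preview_push my-repo-name.git
```

In the porcelain output, a line beginning with "+" marks a ref whose history would be force-updated, which is what you should expect to see for any branch the cleaner rewrote.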

Once this is done, everyone working on your project should delete their local repositories and clone the updated one. Otherwise, someone could accidentally push the removed "dirty" history back into your repository.
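
For collaborators who would rather not delete their old copy straight away, one cautious variant is to move it aside and then clone fresh. This is only a sketch: reclone_fresh and the ".pre-cleanup" suffix are hypothetical names chosen for illustration, and the set-aside copy should still be deleted (and never pushed from) once any unsaved work has been recovered.

```shell
# Sketch: move an existing local clone out of the way, then clone the
# cleaned repository into its place. reclone_fresh and the ".pre-cleanup"
# suffix are hypothetical names used only for illustration.
reclone_fresh() {
    remote_url="$1"
    work_dir="$2"
    old_copy="$work_dir.pre-cleanup"
    # Never clobber a copy that was set aside earlier.
    if [ -e "$old_copy" ]; then
        echo "refusing to overwrite $old_copy" >&2
        return 1
    fi
    if [ -e "$work_dir" ]; then
        mv "$work_dir" "$old_copy" || return 1
    fi
    git clone "$remote_url" "$work_dir"
}

# Example call, with a placeholder for your repository's URL:
#   reclone_fresh <your-repo-url> my-repo-name
```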
