Every file you commit to a Git repository stays there. Even if you delete it again, it remains in the Git history, which is a good thing in most cases.
But sometimes I have accidentally added a large dummy media file to the source code and committed it without considering the impact on the size of the repository. This is not a problem in itself, as Git handles large files just fine, but do it a couple of times and you are left with a seriously bloated repository.
I have also on occasion added a file containing passwords, which should never be part of the source code (especially when working with public repositories), and I don't want the passwords to show up in the history, so simply deleting them again won't cut it.
There is a way to permanently delete these files from Git and the Git history and lighten up your repository (and your conscience) again.
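If you are not sure which files are bloating the history, the sketch below is one way to list the largest blobs that Git still keeps, including files that were deleted long ago. It runs in a throwaway repository so it is safe to try; the file names (README, big.bin) and the demo identity are made up for the illustration, and only the pipeline in the middle is what you would run in a real repository.

```shell
#!/bin/sh
set -e

# Throwaway repository in a temporary directory (hypothetical demo data).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit a small file and a 200 KiB dummy file, then delete the dummy again.
echo "hello" > README
dd if=/dev/urandom of=big.bin bs=1024 count=200 2>/dev/null
git add README big.bin
git commit -q -m "Add files"
git rm -q big.bin
git commit -q -m "Delete the large file again"

# List blobs reachable from any ref as "<size in bytes> <path>", largest first.
# Note: paths containing spaces are cut at the first space in this simple form.
largest_blobs=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk '$1 == "blob" { print $2, $3 }' |
  sort -rn | head -n 5)
echo "$largest_blobs"
```

Even though big.bin was deleted in the second commit, it still shows up at the top of the list, because the blob is still part of the history.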
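Use the filter-branch command to remove the file from every commit in the history and from the index (the cache). This rewrites the affected commits, so their hashes change.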
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch __FILE_TO_DELETE__' --prune-empty --tag-name-filter cat -- --all
For example, if I want to delete the src/config/connect_db.php file (which contains a database password) from a repository:
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch src/config/connect_db.php' --prune-empty --tag-name-filter cat -- --all
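To see the effect before touching a real repository, here is a minimal sketch that runs the same command in a scratch repository and counts the commits that touch the file before and after. The file name secret.txt, the fake password, and the demo identity are assumptions for the illustration. The check uses --branches --tags rather than --all, because filter-branch keeps backup references under refs/original/ that --all would still include. FILTER_BRANCH_SQUELCH_WARNING only silences the notice that newer Git versions print before running filter-branch.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory (hypothetical demo data).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit a file with a fake secret, then delete it in a later commit.
echo "hello" > README
echo "password=hunter2" > secret.txt
git add README secret.txt
git commit -q -m "Add files"
git rm -q secret.txt
git commit -q -m "Remove secret"

# The file is gone from the working tree but still present in history.
before=$(git log --branches --tags --oneline -- secret.txt | wc -l | tr -d ' ')

# Rewrite history so that no commit contains secret.txt any more.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
  --index-filter 'git rm --cached --ignore-unmatch secret.txt' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

after=$(git log --branches --tags --oneline -- secret.txt | wc -l | tr -d ' ')
echo "commits touching secret.txt before: $before, after: $after"
```

In a real repository the same check is `git log --branches --tags --oneline -- src/config/connect_db.php`, which should print nothing once the rewrite has worked.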
The file is now removed from the history of your local repository, and you can force-push the rewritten master branch to your remote repository.
git push origin master --force
In addition you might want to help Git get rid of the old references.
Push changes made to tags:
git push --tags --force
Delete the backup references:
rm -rf .git/refs/original/
Expire references in log:
git reflog expire --expire=now --all
Run Git garbage collector:
git gc --aggressive --prune=now
That's it. The size of your repository should now reflect that the cleanup was successful.
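If you want to check this for yourself, the following sketch walks through the removal and cleanup steps in a scratch repository and compares the size of the .git directory before and after. The 1 MiB random file big.bin and the demo identity are assumptions for the illustration; in a real repository you would only run the `du` or `git count-objects -vH` lines around your own cleanup.

```shell
#!/bin/sh
set -e

# Throwaway repository; all names here are made up for the demo.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit roughly 1 MiB of incompressible data next to a small file.
echo "hello" > README
dd if=/dev/urandom of=big.bin bs=1024 count=1024 2>/dev/null
git add README big.bin
git commit -q -m "Add README and a large dummy file"

size_before=$(du -sk .git | awk '{ print $1 }')

# Remove the file from history.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
  --index-filter 'git rm --cached --ignore-unmatch big.bin' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# Drop the backup refs and reflog entries, then let Git prune the blob.
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --quiet --aggressive --prune=now

size_after=$(du -sk .git | awk '{ print $1 }')
echo ".git size in KiB before: $size_before, after: $size_after"

# Git's own summary of loose and packed object sizes.
git count-objects -vH
```

Skipping any of the three cleanup steps usually leaves the old blob on disk, since the backup references or the reflog still point at the original commits.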