How to Remove Sensitive Data from Git Safely
To remove sensitive data from Git, use tools like git filter-repo or BFG Repo-Cleaner to rewrite history and delete the unwanted data from all commits. After cleaning, force push the changes to update the remote repository and inform collaborators to re-clone or reset their copies.
Syntax
Use git filter-repo to remove files or data from all commits in your repository history.
Basic syntax:
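git filter-repo --path <file> --invert-paths: Removes the specified file from history. Without --invert-paths, --path keeps only that file and drops everything else.
git filter-repo --replace-text <expressions-file>: Replaces sensitive strings in history. The expressions file lists one string or pattern per line.
After filtering, re-add the remote if needed (git filter-repo removes the origin remote as a safety measure), then use git push --force to update the remote repository.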
```bash
git filter-repo --path <file> --invert-paths
git filter-repo --replace-text <expressions-file>
git remote add origin <url>
git push --force origin <branch>
```
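The file passed to --replace-text holds one rule per line. The sketch below writes such a file, assuming a hypothetical path under /tmp and invented secrets; a line without `==>` is replaced with the default `***REMOVED***`, while `literal:` and `regex:` prefixes select how the left-hand side is matched.

```bash
# Hypothetical path, kept outside the repository so the rules are never committed.
expr_file=/tmp/filter-repo-expressions.txt

# One rule per line. The secrets below are invented placeholders.
cat > "$expr_file" <<'EOF'
hunter2
literal:old-token-123==>REDACTED
regex:api_key=[A-Za-z0-9]+==>api_key=REDACTED
EOF

# Show the rules that were written.
cat "$expr_file"

# Then, from a fresh clone of the repository:
#   git filter-repo --replace-text /tmp/filter-repo-expressions.txt
```

Delete the expressions file once the rewrite is done, since it contains the secrets in plain text.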
Example
This example removes a file named secret.txt from the entire Git history.
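```bash
git clone https://github.com/user/repo.git
cd repo
git filter-repo --path secret.txt --invert-paths
# filter-repo removes the origin remote, so add it back before pushing
git remote add origin https://github.com/user/repo.git
git push origin main --force
```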
Output
Enumerating objects: 100, done.
Counting objects: 100% (100/100), done.
Filtering repository history...
Rewrite complete.
To https://github.com/user/repo.git
+ abc1234...def5678 main -> main (forced update)
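One way to confirm whether a file or string still appears in history is to ask git log for any commit that touches it. The sketch below is a minimal check using plain Git; the helper names `path_in_history` and `string_in_history` are hypothetical, and the throwaway repository and its contents exist only for the demonstration.

```bash
# Exit status 0 if any commit on any ref has touched the given path.
path_in_history() {
  [ -n "$(git log --all --oneline -- "$1")" ]
}

# Exit status 0 if any commit added or removed the given string in file contents.
string_in_history() {
  [ -n "$(git log --all --oneline -S"$1")" ]
}

# Throwaway repository with a made-up secret, standing in for a real project.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
echo 'password=hunter2' > secret.txt
git add secret.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add config'

# Before cleaning, the file is reported; after a successful rewrite it should not be.
if path_in_history secret.txt; then
  echo "secret.txt is still in history"
else
  echo "secret.txt is gone from history"
fi
```

Running the same check in the rewritten clone should take the second branch. The check covers local refs only, so it says nothing about copies held by the remote or by collaborators.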
Common Pitfalls
- Not backing up your repository before rewriting history can cause data loss.
- Forgetting to force push after rewriting history means the remote still has the sensitive data.
- Collaborators who do not re-clone or reset their local copies can push the old commits back and reintroduce the sensitive data.
- Using git filter-branch, which is slower and more error-prone than git filter-repo.
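```bash
# Wrong: removing the file but forgetting to force push
git filter-repo --path secret.txt --invert-paths
git push origin main   # fails or is rejected, so the remote keeps the old history

# Right: force push to update the remote
# (re-add the origin remote first if git filter-repo removed it)
git push origin main --force
```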
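The backup mentioned in the first pitfall can be taken with a mirror clone, which copies every branch and tag into a bare repository. The following is a minimal sketch against a throwaway repository; the directory names and the commit are placeholders, and for a real project the source would be the repository URL or path.

```bash
# Throwaway repository standing in for the one you are about to rewrite.
src=$(mktemp -d)
cd "$src" || exit 1
git init -q
echo 'hello' > README.txt
git add README.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial commit'

# A mirror clone copies all refs into a bare repository.
backup="$(mktemp -d)/backup.git"
git clone -q --mirror "$src" "$backup"

# Both commands should print the same commit hash.
git -C "$src" rev-parse HEAD
git -C "$backup" rev-parse HEAD
```

The backup still contains the sensitive data, so store it somewhere private and delete it once the rewritten history has been verified.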
Quick Reference
Summary tips for removing sensitive data from Git:
- Always back up your repo before rewriting history.
- Use git filter-repo or BFG Repo-Cleaner for efficient cleaning.
- Force push changes to the remote with git push --force.
- Inform all collaborators to re-clone or reset their local repos.
- Check for sensitive data before and after cleaning with git log --all -S'password' (file contents) and git log --all --grep='password' (commit messages), or similar.
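For collaborators who reset rather than re-clone, the sequence is a fetch followed by a hard reset to the rewritten branch. The sketch below plays both roles in a throwaway sandbox using only plain Git; the directory names, the `g` helper, and the file contents are hypothetical, and `git commit --amend` stands in for the actual git filter-repo rewrite.

```bash
# Throwaway sandbox: a bare "remote" plus working copies for two people.
sandbox=$(mktemp -d)
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }
git init -q --bare "$sandbox/remote.git"

# The maintainer publishes a commit that contains a made-up secret.
git init -q "$sandbox/maintainer"
cd "$sandbox/maintainer" || exit 1
git symbolic-ref HEAD refs/heads/main
git remote add origin "$sandbox/remote.git"
echo 'password=hunter2' > config.txt
git add config.txt
g commit -q -m 'add config'
git push -q origin main

# The collaborator clones while the secret is still in history.
git clone -q -b main "$sandbox/remote.git" "$sandbox/collaborator"

# The maintainer rewrites history and force pushes.
# (--amend is a stand-in for git filter-repo so the sketch needs only plain Git.)
echo 'password=CHANGEME' > config.txt
g commit -q -a --amend -m 'add config'
git push -q --force origin main

# The collaborator fetches the rewritten branch and discards the old local history.
cd "$sandbox/collaborator" || exit 1
git fetch -q origin
git reset -q --hard origin/main
cat config.txt   # prints password=CHANGEME
```

A hard reset throws away local commits and uncommitted changes on that branch, so any work worth keeping should be saved elsewhere first. The old commits also linger in the collaborator's local object store until they expire, which is why a fresh clone is the cleaner option.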
Key Takeaways
Use git filter-repo or BFG Repo-Cleaner to safely remove sensitive data from all commits.
Always force push rewritten history to update the remote repository.
Back up your repository before rewriting history to avoid data loss.
Notify collaborators to re-clone or reset their local copies after history rewrite.
Avoid using git filter-branch as it is slower and less reliable than modern tools.