Every year GitGuardian scans every public commit pushed to GitHub and counts how many API keys, tokens, certificates, and credentials slipped through. The 2024 edition of their State of Secrets Sprawl report covers calendar year 2023 — and the numbers keep going the wrong way.
Why the curve keeps bending up
GitGuardian flags three structural reasons behind the growth: more developers shipping code more often, the rise of generative AI assistants that paste credentials into example code, and the long tail of legacy secrets that never get rotated. Roughly 1 in every 10 authors who pushed code to GitHub in 2023 leaked at least one secret.
The leakiest file types
Plain source files are not the worst offenders. The report ranks file extensions by how likely a commit touching that file is to contain a secret. The top of that list is consistently:
- .env files
- .log files
- .sql dumps
- CI/CD configuration (e.g. .yml, .yaml)
- Notebook formats like .ipynb
Zombie leaks: the secret you 'deleted' is still public
A finding GitGuardian has been hammering for years: rewriting history does not delete a secret. Force-pushing over a leaked commit, deleting the repo, or making it private does not invalidate the credential. GitHub keeps forks and cached views; threat-intel scanners archive everything. The only safe assumption is that any secret that reached a public branch is permanently compromised and must be revoked.
What this means for your team
Two takeaways are worth internalizing. First, prevention alone is not enough — even mature pipelines with pre-commit hooks leak secrets, because contractors, forks, and personal accounts sit outside the control plane. Second, detection latency matters far more than detection coverage. If a leaked AWS key is still valid five days later, you are racing an automated harvester that scans GitHub's firehose in near real time.
Continuous external exposure monitoring closes that loop: assume secrets will leak, and shrink the window between leak and revocation to minutes.
“More than 90% of valid secrets exposed remained active for at least five days after the author was notified.”