Fresh Secrets From The Docks: What 15 Million Docker Images Taught Us About Cloud Security
Secrets—API keys, passwords, private tokens—are the skeleton key to any cloud or DevOps kingdom. Yet, despite years of security education, they’re still regularly baked into source code, deployment pipelines, and, as our latest research shows, Docker images. The findings? Alarming. Our deep dive into 15 million public DockerHub images yielded over 100,000 valid, exploitable secrets—many belonging to major enterprises.
Why Secret Leaks Matter More Than You Think
Unlike classic vulnerabilities, a leaked secret can give an attacker instant, legitimate access, bypassing most detection tools and blending in with normal activity. In 2024, we saw high-profile breaches (think Disney, Capgemini, SolarWinds) originating from hard-coded or accidentally leaked credentials. The sources are everywhere: GitHub, binary releases, cloud storage buckets and, crucially, public container registries like DockerHub.
Unpacking the DockerHub Leak
The Hunt: Methodology at Scale
Container registries are treasure troves for attackers. DockerHub, as the world’s largest, was our focus. To scope the landscape:
- We enumerated 9.3 million unique repositories, belonging to over 3 million users.
- To avoid petabytes of unmanageable data, we sampled at most 5 tags per repository, prioritizing the tags most likely to be in active use (e.g., latest).
- We prioritized layers built from COPY and ADD Dockerfile instructions—where secrets are most likely to be stashed—while skipping noisy, oversized ML model layers.
Our pipeline scanned over 16 million layers, totalling 30TB of data, using GitGuardian’s open-source tool, ggshield.
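The sampling rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline we ran: the Layer record, the 500 MB size cutoff, and the digest-based deduplication are all hypothetical choices made for the example. Only the 5-tag cap and the focus on COPY and ADD layers come from the methodology described here.

```python
import re
from dataclasses import dataclass

MAX_TAGS_PER_REPO = 5              # sampling cap stated in the text
MAX_LAYER_BYTES = 500 * 1024 ** 2  # assumed cutoff for "oversized" layers

# Matches history entries produced by COPY/ADD, with or without the
# "/bin/sh -c #(nop)" prefix that the classic builder records.
_COPY_OR_ADD = re.compile(r"^(?:/bin/sh -c #\(nop\)\s+)?(?:COPY|ADD)\s")


@dataclass
class Layer:
    """Hypothetical record for one layer, as read from the image history."""
    digest: str
    created_by: str  # instruction that produced the layer
    size: int        # size in bytes


def pick_tags(tags: list[str], limit: int = MAX_TAGS_PER_REPO) -> list[str]:
    """Keep at most `limit` tags, moving `latest` to the front when present."""
    ordered = sorted(tags, key=lambda tag: tag != "latest")  # stable sort
    return ordered[:limit]


def pick_layers(layers: list[Layer], seen: set[str]) -> list[Layer]:
    """Keep COPY/ADD layers under the size cutoff that were not scanned yet."""
    chosen = []
    for layer in layers:
        if not _COPY_OR_ADD.match(layer.created_by.strip()):
            continue  # RUN, ENV, CMD, ... are skipped in this sketch
        if layer.size > MAX_LAYER_BYTES or layer.digest in seen:
            continue  # too large, or shared with an image already processed
        seen.add(layer.digest)  # assumption: deduplicate by layer digest
        chosen.append(layer)
    return chosen
```

Matching on the start of the instruction, rather than searching for "ADD" anywhere in it, keeps a step such as RUN useradd app from being mistaken for an ADD layer.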
The Alarming Results
- 1,179,475 unique secrets detected (API keys, credentials, private tokens).
- Over 100,000 were still valid—offering real, active access to cloud providers, databases and code repositories.
- 3.1 million images contained at least one secret.
- AWS, GCP, and GitHub tokens for Fortune 500 companies were among the findings—sometimes granting direct access to production resources.
Why This Happens: Docker’s Dirty Secrets
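Many developers assume they are safe if they pass secrets through build arguments or environment variables, or if they delete .env files after the build. In reality, Docker “remembers everything”: a value set with ENV is stored in the image config, a value passed with ARG and consumed by a RUN step shows up in the image history, and a file written by COPY, ADD, or RUN echo … > file stays in the layer that created it even if a later instruction deletes it. Anyone who pulls the image can retrieve all of them.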
A classic anti-pattern:
COPY .npmrc .
RUN npm ci && rm .npmrc
.npmrc is gone from the filesystem of the running container, but the layer created by the COPY instruction still contains it. Anyone who pulls the image can extract that layer and read the file.
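This persistence can be reproduced without Docker, because a layer is just a tar archive and a deletion in a later layer is recorded as a whiteout marker (a file named .wh.<name>) rather than by rewriting the earlier layer. The sketch below builds two in-memory archives standing in for the two instructions above. The file paths and the token value are made-up placeholders, and make_layer and find_in_layers are hypothetical helper names.

```python
import io
import tarfile


def make_layer(files: dict[str, bytes]) -> bytes:
    """Build an in-memory tar archive standing in for one image layer."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def find_in_layers(layers: list[bytes], filename: str) -> list[tuple[int, bytes]]:
    """Return (layer index, content) for every layer that still holds `filename`."""
    hits = []
    for index, blob in enumerate(layers):
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
            for member in tar.getmembers():
                if member.isfile() and member.name == filename:
                    hits.append((index, tar.extractfile(member).read()))
    return hits


# Layer 0: the result of `COPY .npmrc .` (placeholder token, not a real one).
copy_layer = make_layer({"app/.npmrc": b"//registry.npmjs.org/:_authToken=npm_EXAMPLE"})
# Layer 1: the result of `RUN npm ci && rm .npmrc`; the deletion is only a
# whiteout entry, so layer 0 is left untouched.
run_layer = make_layer({"app/node_modules/.keep": b"", "app/.wh..npmrc": b""})

hits = find_in_layers([copy_layer, run_layer], "app/.npmrc")
for index, content in hits:
    print(f"layer {index} still contains .npmrc: {content.decode()}")
```

The file is found in layer 0 even though the merged filesystem no longer shows it, which is exactly what an attacker relies on when unpacking a pulled image layer by layer.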
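Even worse, build arguments thought to be ephemeral can be embedded in image configs. The right way? Use Docker’s secret mount functionality (RUN --mount=type=secret, available with BuildKit), which exposes the secret to a single RUN step without writing it to any layer. It is less widely adopted and a bit harder to use.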
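To check whether an image you already published carries such values, one option is to inspect its config JSON. The sketch below assumes the standard layout (config.Env for environment variables, history[].created_by for the recorded build steps). The name heuristic, the exposed_variables helper, and the sample config are illustrative assumptions, not output from a real image.

```python
import re

# Assumed heuristic: variable names that usually hold credentials.
SENSITIVE_NAME = re.compile(r"(TOKEN|SECRET|PASSWORD|PASSWD|API_?KEY)", re.IGNORECASE)
# NAME=value pairs as they appear in a recorded build step.
ASSIGNMENT = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)=(\S+)")


def exposed_variables(image_config: dict) -> list[tuple[str, str, str]]:
    """Return (source, name, value) for credential-like variables in a config."""
    findings = []
    # ENV values are stored as "NAME=value" strings under config.Env.
    for entry in image_config.get("config", {}).get("Env") or []:
        name, _, value = entry.partition("=")
        if SENSITIVE_NAME.search(name):
            findings.append(("env", name, value))
    # Build args consumed by a RUN step are recorded in the history entries.
    for step in image_config.get("history", []):
        for name, value in ASSIGNMENT.findall(step.get("created_by", "")):
            if SENSITIVE_NAME.search(name):
                findings.append(("history", name, value))
    return findings


# Illustrative config with placeholder values.
sample_config = {
    "config": {"Env": ["PATH=/usr/local/bin:/usr/bin", "DB_PASSWORD=example-password"]},
    "history": [
        {"created_by": "RUN |1 NPM_TOKEN=example-token /bin/sh -c npm ci # buildkit"},
        {"created_by": "CMD [\"node\", \"server.js\"]"},
    ],
}

for source, name, value in exposed_variables(sample_config):
    print(f"{source}: {name} is readable by anyone who pulls the image")
```

A name-based heuristic like this misses secrets stored under innocuous names, so it complements a content-based scanner rather than replacing one.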
Attackers Are Watching: And Automation Is Easy
Attackers search for credentials everywhere: public GitHub, package managers, and now, registry logs and container layers. With tools and APIs readily available, scanning DockerHub for secrets is as trivial as running a script.
A valid leaked secret can be used for:
- Spinning up expensive cloud resources (crypto mining, anyone?).
- Dumping databases.
- Pivoting into private infrastructure or supply chain attacks.
Responsible Disclosure: The Human Factor
Identifying a secret is only half the battle. Figuring out who owns it and alerting them before attackers act is complex. Our research leveraged DockerHub metadata, secret content (like embedded emails), and even OSINT techniques to notify as many affected parties as possible.
GitGuardian’s Good Samaritan initiative on GitHub already sends out 5,000 warning emails a day. Scaling this responsible disclosure to DockerHub is the next frontier, though manual vetting remains essential for high-impact cases.
What Should Cloud-Native Teams Do?
- Scan your images—before pushing to any registry. Tools like ggshield make this easy and free.
- Educate developers—on Docker’s layer persistence, and why deleting a secret during build isn’t enough.
- Embrace modern secret management—use secret vaults and ephemeral credentials, not hard-coded secrets.
- Simulate a breach—practice “what if our AWS root key leaked publicly?” Drill your revocation and incident response.
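As a rough illustration of the first recommendation, the sketch below gates a push on a scan of every layer. It is a simplified stand-in, not how ggshield works: it only knows three widely documented formats (AWS access key IDs, classic GitHub personal access tokens, PEM private key headers), and scan_layer and safe_to_push are hypothetical names. A real scanner covers far more detectors and also checks whether a match is still valid.

```python
import io
import re
import tarfile

# A deliberately small set of well-known secret formats.
PATTERNS = {
    "aws_access_key_id": re.compile(rb"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(rb"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key": re.compile(rb"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def scan_layer(blob: bytes) -> list[tuple[str, str]]:
    """Return (file name, pattern label) for each match inside one layer tar."""
    findings = []
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and device nodes
            data = tar.extractfile(member).read()
            for label, pattern in PATTERNS.items():
                if pattern.search(data):
                    findings.append((member.name, label))
    return findings


def safe_to_push(layers: list[bytes]) -> bool:
    """True only when no layer, including intermediate ones, has a match."""
    return not any(scan_layer(blob) for blob in layers)
```

Scanning each layer separately, instead of the final merged filesystem, is the point: a file deleted in a later layer would otherwise pass the check and still ship with the image.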
Final Thoughts
Leaking secrets is rarely intentional—it’s a byproduct of complex, fast-moving DevOps. But as our research shows, the risk is real, persistent, and growing. If your team uses Docker (and who doesn’t?), make 2025 the year you audit your images—not just your code—for secrets.