Securing Container Images Across the CI/CD Pipeline

The Log4j vulnerability was a good reminder that securing cloud-native applications requires ensuring container images are free of critical vulnerabilities. When Log4j went public, security teams struggled to quickly understand which of their images were vulnerable and where they were running within their environments.

An effective container security approach not only identifies vulnerabilities within running images but also helps companies respond quickly when they are detected. How can companies ensure the security of their containers throughout development and delivery pipelines? This article explores best practices that have proven effective across leading organizations.

Defining Effective Policy is Job One

Organizations must start by deciding what risk they are willing to accept in terms of vulnerabilities: Should all fixable vulnerabilities be addressed, or just high and critical ones? How will they mitigate vulnerabilities that don’t have an available fix? How will they assess the potential impact and blast radius of these vulnerabilities and track and manage them until they can be remediated? And what is an acceptable service level agreement (SLA) for fixing vulnerabilities in active containers?
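One way to make such a policy actionable is to express it as data that every stage of the pipeline evaluates the same way. A minimal sketch of that idea in Python (all field names and thresholds here are illustrative, not taken from any particular tool):

```python
# Illustrative policy: block fixable HIGH/CRITICAL findings, and track
# unfixable ones against an SLA. All field names are hypothetical examples.
POLICY = {
    "block_severities": {"HIGH", "CRITICAL"},  # fail the stage on these
    "require_fix_available": True,             # only block when a fix exists
    "sla_days": {"CRITICAL": 7, "HIGH": 30},   # remediation deadlines
}

def evaluate(findings, policy=POLICY):
    """Split scanner findings into blocking violations and tracked exceptions."""
    blocking, tracked = [], []
    for f in findings:
        severe = f["severity"] in policy["block_severities"]
        fixable = f.get("fix_available", False)
        if severe and (fixable or not policy["require_fix_available"]):
            blocking.append(f)   # a fix exists: the stage should fail
        elif severe:
            tracked.append(f)    # no fix yet: manage until remediation
    return blocking, tracked

findings = [
    {"id": "CVE-2021-44228", "severity": "CRITICAL", "fix_available": True},
    {"id": "CVE-2023-0001", "severity": "HIGH", "fix_available": False},
    {"id": "CVE-2023-0002", "severity": "LOW", "fix_available": True},
]
blocking, tracked = evaluate(findings)
```

Because every stage reads the same policy object, developers, CI and the admission controller all reach the same verdict for the same image.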

The policy must be well-defined and understood by everyone who is involved in developing, building, running and fixing images. This includes developers, operators, administrators and security teams. Security and compliance policy must be validated at every stage of the cloud-native application life cycle. This DevSecOps approach enables companies to dramatically reduce their cloud attack surface while also significantly reducing the time and cost associated with fixing issues that have been deployed to production.

Effective policy must go beyond checking for vulnerabilities to include best practices for cloud security hygiene and for creating reproducible golden images and immutable containers, such as:

  • Not allowing package lists to be automatically updated (apt-get update or yum update) and instead using updated base images hosted in your registry
  • Not allowing USER to be defined as root
  • Not allowing secrets to be stored as environment variables

A comprehensive set of checks and best practices can be found in the CIS Benchmark for Docker (section 4) and NIST SP 800-190.
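These rules translate directly into how a Dockerfile is written. A hypothetical example that follows the three practices above (registry name, image tags and package versions are illustrative):

```dockerfile
# Use a pinned, pre-updated base image hosted in your own registry rather
# than refreshing package lists (apt-get/yum update) at build time.
FROM registry.example.com/base/debian:12.5-hardened

# Install only pinned packages; no package-list refresh in the image build.
RUN apt-get install -y --no-install-recommends curl=7.88.1-10+deb12u5 \
    && rm -rf /var/lib/apt/lists/*

# Run as a dedicated non-root user, never as root.
RUN useradd --create-home --uid 10001 appuser
USER appuser

# Secrets are injected at runtime (e.g., mounted files), never stored as ENV.
COPY --chown=appuser . /app
WORKDIR /app
CMD ["./run.sh"]
```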

Validate Images at Each Stage of the Life Cycle

Organizations should adopt security tools that use a centralized policy framework to ensure security checks are consistently applied at every stage of the build pipeline:

  • Development of images: Empower developers to test images locally and fix most of the vulnerabilities before they commit their code.
  • Code repository: Automate testing of Dockerfiles and other artifacts to ensure all code checked in is safe. Require peer review to approve exceptions for vulnerabilities that cannot be fixed.
  • Build time/Continuous integration (CI): Automatically scan containers for vulnerabilities as they are built. Block unsafe images or require an exception process.
  • Publication to a registry: Ultimately, all images are stored in a container registry. Automatically scan registries to ensure every published image is checked for vulnerabilities.
  • Deployment to Kubernetes: Validate the security of container images just before they are deployed with a Kubernetes admission controller.
  • Runtime: Create an up-to-date inventory of all images running. Continuously monitor and identify new vulnerabilities in running images.

[Figure: An example of a pipeline that enforces a policy at each stage of the build process, from image creation to runtime.]

With this pipeline, images are validated at every point of their life cycle. They should not be able to progress to the next step if vulnerabilities are found or if exceptions have not been approved. While validating images at each stage may appear to be redundant or overkill, it avoids relying entirely on one team or a single point of failure.

Enable Developers to Self-Service Vulnerabilities

Developers create Dockerfiles or similar artifacts that define the final container image. If a vulnerability is found downstream in the pipeline, it will likely come back to them to be fixed. Rather than relying on another team or an out-of-band process to validate their images, developers need to have the right tools to test their images locally as they create them:

  • Automatically assess images against policy as they are built without slowing down developers.
  • Provide developers with automated remediation guidance to quickly self-service violations, including where the vulnerability was introduced (base image, package installation, etc.), the new version to upgrade to, and more.
  • Enable developers to detect and remediate vulnerabilities without context-switching and disrupting their current experience.
  • Gain comprehensive visibility into vulnerability details, and share them with security teams to request an exception for any vulnerability that cannot be fixed.

Making it easy for your developers to find and fix vulnerabilities without the need for security teams to intervene will make the overall pipeline far more efficient. Finding vulnerabilities further down the pipeline or in production forces the developer to return to a project they have already closed out in order to generate a new image, which is much more time-consuming and costly.

To ensure that developers don’t accidentally skip validating their code against security policies, it is also a best practice to automatically test for risks each time code is checked into the code repo.

Check Within Continuous Integration (CI)

Many organizations have implemented some form of CI that continuously builds and tests new code, including container images, as they’re added to a code repository and packaged for deployment. Images should be reassessed at this stage to catch any images that may have skipped unit testing or any dependencies that have been merged into the main branch unchecked. Critical or high-severity vulnerabilities should break the build and prevent the vulnerable image from being published to a container registry.
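As a sketch, a CI job that gates the build this way might look like the following (GitHub Actions syntax with the open source Trivy scanner; the image name and severity thresholds are placeholders):

```yaml
# Hypothetical CI stage: build the image, scan it, and fail the build on
# fixable HIGH/CRITICAL vulnerabilities so it never reaches the registry.
jobs:
  build-and-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Scan image
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          severity: HIGH,CRITICAL
          ignore-unfixed: true
          exit-code: '1'   # non-zero exit breaks the build
```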

Continuously Monitor Image Registries

Images have to be pulled from a container registry to be used in Kubernetes or other container orchestration tools. To ensure that all images are appropriately checked for vulnerabilities even if they somehow skipped your CI pipeline, teams should automatically scan new images added to their registries using registry notification or auto-polling. An alert should be raised if an image was not scanned previously or if new policy violations are identified.
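The registry side reduces to a simple reconciliation: compare what the registry currently holds (e.g., from the Docker Registry v2 `/v2/_catalog` API or a push notification) against what has already been scanned, and alert on anything new. A minimal sketch of that diff logic (the data shapes are illustrative):

```python
def find_unscanned(registry_images, scan_records):
    """Return images present in the registry that have no scan on record.

    registry_images: iterable of "repo@digest" strings from the registry
    scan_records: dict mapping "repo@digest" -> scan result metadata
    """
    return sorted(img for img in registry_images if img not in scan_records)

# Example: two images in the registry, one already scanned.
catalog = [
    "team/app@sha256:aaa",
    "team/app@sha256:bbb",
]
scans = {"team/app@sha256:aaa": {"status": "passed", "policy": "default"}}

alerts = find_unscanned(catalog, scans)
```

Keying on the image digest rather than the tag matters here: a tag can be silently re-pushed, while a digest uniquely identifies the image content that was scanned.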

Perform a Final Check in Kubernetes

There is one more chance to check the image for vulnerabilities before it gets deployed to Kubernetes. An admission controller is a Kubernetes mechanism that is called when a workload is created or updated. It can check whether the image was previously scanned, whether a particular policy was assessed, and whether new vulnerabilities have been found since the last scan. The admission controller can then block the image from being deployed or raise an alert for later investigation.
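The hook itself is standard Kubernetes configuration. A hedged sketch of a validating webhook that routes Pod creations to a (hypothetical) service that looks up scan results; the service name, namespace and path are placeholders, and a real deployment would also need a `caBundle`:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: image-scan-check
webhooks:
  - name: scan.policy.example.com
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
    clientConfig:
      service:
        namespace: security
        name: scan-webhook   # hypothetical service that checks scan records
        path: /validate
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail      # block deployment if the check cannot run
```

Note the trade-off in `failurePolicy`: `Fail` guarantees no unchecked image is admitted but can block deployments if the webhook service is down, while `Ignore` favors availability over coverage.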

Capture a Complete Inventory and SBOM

It is important to keep track of all the scans being performed in a pipeline. If a policy validation fails late in the pipeline, an image likely did not follow the approved CI/CD process or was allowed to progress with known issues. Tracking the different checks throughout the pipeline is critical when investigating gaps in coverage.

It’s also important to keep a complete inventory of images, whether they are in a registry or running in Kubernetes. If a new vulnerability is discovered, it is important to be able to quickly understand where it is in a running container or in a registry waiting to be deployed. The inventory should not only contain a report of all vulnerabilities found but the complete list of packages, libraries and important attributes (layers, user, entry point, etc.) that enable a quick and complete reevaluation of the image without requiring a full scan of the actual image. This list of content is called a software bill of materials (SBOM), which is critical to investigate software supply chain security incidents.

Scanning images for vulnerabilities should not stop once they are deployed. As seen with Log4j, new vulnerabilities in existing software are found every day. An SBOM should constantly be checked against the latest list of vulnerabilities and the appropriate actions taken if new vulnerabilities are found or if the severity of existing vulnerabilities changes. Images may also sit in a registry for days before deployment, and it’s important to be alerted as soon as possible when new vulnerabilities are detected.
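Because the SBOM records exactly which packages each image contains, rechecking against a fresh vulnerability feed becomes a lookup rather than a rescan. A minimal sketch of that reevaluation (the data shapes are illustrative):

```python
def recheck_sboms(sboms, vuln_feed):
    """Match stored SBOMs against a fresh vulnerability feed, no rescan needed.

    sboms: dict of image name -> set of "package==version" strings
    vuln_feed: dict of "package==version" -> list of CVE ids
    Returns: dict of image name -> sorted list of matching CVE ids
    """
    affected = {}
    for image, packages in sboms.items():
        cves = sorted({cve for pkg in packages for cve in vuln_feed.get(pkg, [])})
        if cves:
            affected[image] = cves
    return affected

# Example: a Log4j-style scenario -- a new CVE lands for a package that
# is already inside a shipped image.
sboms = {
    "shop/api:1.4": {"log4j-core==2.14.1", "guava==31.0"},
    "shop/web:2.0": {"react==17.0.2"},
}
vuln_feed = {"log4j-core==2.14.1": ["CVE-2021-44228"]}

affected = recheck_sboms(sboms, vuln_feed)
```

Run against every stored SBOM each time the feed updates, this answers the Log4j question from the introduction ("which of our images are vulnerable, and where?") in seconds instead of requiring a fleet-wide rescan.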

Ensure OSS Images are Treated Like Your Code

Many companies have built a secure pipeline process for the images they produce. But they often forget about open source images (Prometheus, Istio, etc.) that get pulled directly from third-party container registries (Docker Hub, Quay, etc.). These open source software (OSS) images often skip the entire pipeline and are only checked by the Kubernetes admission controller as they are pulled.

The best practice is to treat third-party images the same as your own:

  • Pull the image into your CI tool for the initial scan.
  • Push the image to your own registry only if it passes the validation tests.
  • Prevent Kubernetes from pulling images from outside your own registry.
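The last point can be enforced in the cluster itself. With a policy engine such as Kyverno, for instance, a registry allowlist might look like the following sketch (the registry name and policy details are illustrative):

```yaml
# Hypothetical Kyverno policy: reject any Pod whose containers pull images
# from outside the internal, scanned registry.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce   # block, rather than just audit
  rules:
    - name: only-internal-registry
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must come from the scanned internal registry."
        pattern:
          spec:
            containers:
              - image: "registry.example.com/*"
```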

This not only helps you respond to new vulnerabilities like Log4j in OSS images but also prevents rogue maintainers (as seen recently) or hijacked accounts from pushing malicious images into your organization.

Another option would be to mirror the OSS image repository into your own registry and perform periodic vulnerability scans as described earlier. If none of these approaches are feasible for your infrastructure, consider performing ad-hoc vulnerability scans within the build process without checking images into registries. The Lacework inline scanner, for example, enables teams to integrate vulnerability checks directly into their software supply chain workflows.

Conclusion

It’s important to build a pipeline with layered, redundant checks to avoid security gaps and to avoid relying on a single checkpoint that could fail or be bypassed. The security of a pipeline will mature over time. Enable developers to produce more secure images when coding, continuously scan registries to increase coverage and check images just before they are deployed in Kubernetes to ensure 100% of images are scanned for vulnerabilities. This will not only dramatically reduce exposure to vulnerability risk, but also significantly reduce the time and cost associated with remediating security flaws in deployed containers.

Julien Sobrier

Julien Sobrier has spent 15+ years in the Security industry as a Security Researcher at Netscreen/Juniper and Zscaler, then Product Manager at Zscaler, Salesforce, Octarine (Kubernetes Security), and now Lacework. He has co-authored Power Security Tools (O’Reilly) and released many browser security add-ons (BlackSheep, Zscaler Safe Shopping, Blackhat SEO prevention), including HTTPS Everywhere for Internet Explorer. Julien spoke at KubeCon, OWASP, SOURCE, Les Assises de la Sécurité, and other conferences.
