Red Hat Partners With DOE Labs to Advance HPC Containers

June 1, 2022, by Mike Vizard
Tags: containers, high performance computing, HPC, Kubernetes, Red Hat

Red Hat this week announced it has allied with multiple U.S. Department of Energy laboratories to advance the adoption of containers in high-performance computing (HPC) environments.

Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory and Sandia National Laboratories will collaborate with Red Hat to make sure that the same types of containers deployed in enterprise IT environments are used to drive HPC applications.

Red Hat and the National Energy Research Scientific Computing Center (NERSC) at Berkeley Lab are also collaborating on enhancements to Red Hat’s Podman, an open source daemonless container engine for developing, managing and running container images on Linux systems. Podman will replace Shifter, the custom container runtime NERSC had developed.

Sandia National Laboratories, meanwhile, will extend an existing SuperContainers project to enable containers optimized for HPC applications to run on Kubernetes. That capability will enable those applications to scale up and down more easily.

Finally, Red Hat and Lawrence Livermore National Laboratory are collaborating to bring HPC job schedulers like Flux to Kubernetes environments via a standardized programmatic interface for highly parallel file systems that are widely used within HPC platforms.

Andrew Younge, Ph.D., research and development manager and computer scientist for Sandia National Laboratories, says the goal is to reduce dependency on the Singularity containers used within HPC environments today in favor of containers that are more widely used in other IT environments. That shift will ultimately make it easier to recruit IT talent to work in HPC application environments, he notes.

Shane Canon, a senior engineer for Lawrence Berkeley National Laboratory, adds that this effort will also provide additional ways to scale applications using architectures based on HPC platforms.

Canon also notes that one of the major benefits of Podman is that it enables IT teams to assign different levels of privilege, which helps strengthen cybersecurity.

Ultimately, this effort should make it easier to manage applications across container runtime platforms running in both on-premises and cloud computing environments. HPC applications still largely run in on-premises IT environments. However, running HPC in the cloud can eliminate on-premises infrastructure constraints and makes it easier to pay only for the capacity used, no matter how much is required. Cloud-based HPC can also enable organizations to innovate without those constraints, improve flexibility and possibly deliver faster results using the latest generation of processors made available as a cloud service.

It’s not clear to what degree shifting HPC applications to containers might spur additional innovation. HPC platforms infused with graphics processing units (GPUs) are also driving the development of artificial intelligence (AI) models that require access to massive amounts of computing horsepower.

One way or another, advances in how container-based HPC applications are built and run will eventually benefit a wide range of applications as new concepts are pioneered and then shared across platforms. In the meantime, the number of viable options for running containers at scale in HPC environments continues to increase.