Persistent Data Storage Integral for Containers

Persistent storage is a critical element of next-generation IT environments utilizing containers and microservices

The growing popularity of containers coincides with enterprises moving toward application development that favors microservices over traditional, monolithic legacy applications. Thanks to Linux containers, microservices and Kubernetes, applications have become easier to build, manage and transport.

But as great as these technologies are, their lack of stateful, persistent storage remains their Achilles’ heel. Because containers have no inherent storage of their own, developers face a challenge whenever they need data to persist after their containers have disappeared.

Initially, this wasn’t a big deal. Containers were purposely ephemeral. Speed and agility were their primary objectives, not storage. Thus, storage and data were considered almost an afterthought.

But as containers grew into a default application development solution, the need for persistent data storage became more apparent. Organizations need agility, but they also need their applications to be able to access data long after the life cycle of a container winds down. This requires a move away from traditional storage processes and technologies toward a more software-defined approach that puts data and storage where it belongs: inside the container.

The Traditional Storage Challenge

How did we get to this point? Like many things, through trial and error.

Developers’ immediate reaction to the container storage challenge was to use traditional storage approaches and platforms to provision their required storage volumes. Unfortunately, they found out pretty quickly that operating under the old rules doesn’t work with microservices and containers, for a few reasons.

First, these platforms were never designed to meet the scalability requirements of today’s application development practices. An organization could potentially have tens of thousands of microservices running everywhere, from local data centers to locations around the world. Traditional storage arrays cannot scale to meet these highly distributed workloads.

There’s also the question of how to provision storage across such a vast landscape. Typically, a developer would provision storage by putting in a request to a storage administrator, but that could take weeks, thereby undermining containers’ core benefits of speed and agility. Plus, no single developer or administrator can feasibly manage storage across 10,000 microservices.

Persistent Storage: The Software-Defined Storage Evolution

And so, just as traditional application development methodologies led to DevOps and microservices, we’re now seeing an evolution toward container-native software-defined storage (SDS).

SDS gives developers a means to maintain persistent data storage for their containers. Data volumes can be mounted directly into containerized applications, providing them with the data they require to run. When the container disappears, the data associated with the application remains; nothing is lost despite the transient nature of the container.
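In Kubernetes terms, this decoupling shows up as a PersistentVolumeClaim whose lifetime is independent of any single pod. A minimal sketch (the names are illustrative, and the claim assumes a cluster with a default storage class configured):

```yaml
# PersistentVolumeClaim: requests storage that outlives any pod using it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data            # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Pod mounting the claim; deleting this pod leaves app-data intact.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: app
      image: nginx          # stand-in application image
      volumeMounts:
        - mountPath: /data
          name: data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data
```

If the pod is deleted and recreated with the same claim, whatever was written under /data is still there.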

But persistent storage is only the beginning. SDS offers other benefits for developers and operations personnel working with highly distributed environments.

Automating Storage Provisioning

Through SDS, developers can automate their storage management needs through a framework like Kubernetes. They can automatically scale storage up or down as necessary, in many cases eliminating overprovisioning and cutting through the layers of IT management normally associated with storage requests. A process that would normally take days or weeks can be accomplished in seconds. SDS can also automate the high availability and disaster recovery of the underlying storage infrastructure.
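In Kubernetes, that automation typically takes the form of dynamic provisioning: a StorageClass describes how volumes are created, and every claim that references it triggers provisioning with no administrator in the loop. A hedged sketch using the Ceph RBD CSI driver as one example (the provisioner name and parameters vary by cluster):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-rbd               # illustrative name
provisioner: rbd.csi.ceph.com  # example CSI driver; cluster-specific
allowVolumeExpansion: true     # lets claims be resized upward later
reclaimPolicy: Delete
---
# Any claim naming the class is provisioned on demand, in seconds.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: auto-data
spec:
  storageClassName: fast-rbd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

With allowVolumeExpansion enabled, growing the volume later is just an edit to the claim's requested size rather than a ticket to the storage team.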

Meanwhile, storage administrators can maintain visibility into how their storage is being utilized. They can see how much storage developers are consuming, know if they’re achieving the right level of performance and more. They can maintain some level of control without having to be directly involved in the provisioning—an ideal scenario for DevOps environments.

Maintaining Data Proximity and High Availability

SDS can provide an abstraction layer that enables developers to choose where their data lives and how they manage that data. For example, a developer may choose to house most of their datasets in an on-premises data center, yet move a small amount of data to the cloud if necessary. They don’t have to migrate massive amounts of data between cloud providers and their own local data centers.
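One common way this abstraction surfaces to developers is as a choice of storage class per claim: the same application manifest can target on-premises or cloud-backed storage simply by naming a different class. A sketch with hypothetical class names:

```yaml
# Bulk dataset kept on an on-premises pool (class name is hypothetical).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: primary-dataset
spec:
  storageClassName: onprem-ceph   # hypothetical on-prem class
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 500Gi
---
# A smaller claim placed on cloud storage, chosen the same way.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cloud-subset
spec:
  storageClassName: cloud-block   # hypothetical cloud-backed class
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 20Gi
```

The application itself is unchanged either way; only the class name in the claim decides where the data lands.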

Open SDS allows for this hybrid cloud portability and flexibility because it is infrastructure-agnostic. Data is always available to the developer and can easily be managed regardless of the organization’s infrastructure. When tightly integrated with Kubernetes, SDS can eliminate the need to build redundant systems that match containers with the infrastructure they’re running on; the orchestration engine does that invisibly at the application level.

Balancing the Old and the New

We’ve become accustomed to hearing the phrase “data is the lifeblood of business” applied to how companies use information at the macro level, but data is also vitally important at the microservices level. As containers and microservices become increasingly integrated into the fabric of application development, developers must be able to attach persistent data to those applications. They also must be able to manage storage in new ways that support their agile processes.

SDS is ushering in a new era that addresses all of these needs. It offers developers the opportunity to balance traditional persistent storage and data needs in a world that is becoming increasingly stateless.

Pete Brey

Pete Brey is marketing manager of hybrid cloud object storage at Red Hat, including Red Hat Ceph Storage and the Red Hat data analytics infrastructure solution.
