A Glimpse of Kubernetes’ Stateful Future

Stateful workloads on Kubernetes are a bad idea. So why do we keep talking about running databases and other stateful apps on Kubernetes? Probably because it’s unavoidable.

Most apps have to deal with state at some point. That means if Kubernetes isn’t managing state, it’s only partially addressing the challenges we face on the cloud.

Don’t get me wrong—stateful workloads on Kubernetes are still a challenge. Containers are meant to be ephemeral and storage is not. A database recovering from failure is still a much more complicated reconciliation loop than that of a crashed web server.
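To make that difference concrete, here is a toy sketch of the two reconciliation loops. It is not how any real Operator works: the classes, fields, and recovery steps are all hypothetical, chosen only to show that a stateless server has a single condition to repair, while a database has several that must be repaired in order.

```python
from dataclasses import dataclass


@dataclass
class WebServer:
    # Hypothetical observed state of a stateless workload.
    running: bool = False


@dataclass
class Database:
    # Hypothetical observed state of a stateful workload.
    volume_attached: bool = False
    running: bool = False
    data_consistent: bool = False
    rejoined_cluster: bool = False


def reconcile_web(server: WebServer) -> list[str]:
    """Drive a crashed web server back to the desired state."""
    steps = []
    if not server.running:
        server.running = True
        steps.append("restart container")
    return steps


def reconcile_db(db: Database) -> list[str]:
    """Drive a crashed database back to the desired state, in order."""
    steps = []
    if not db.volume_attached:
        db.volume_attached = True
        steps.append("reattach persistent volume")
    if not db.running:
        db.running = True
        steps.append("restart container")
    if not db.data_consistent:
        db.data_consistent = True
        steps.append("replay write-ahead log")
    if not db.rejoined_cluster:
        db.rejoined_cluster = True
        steps.append("rejoin replica set")
    return steps


web_steps = reconcile_web(WebServer())
db_steps = reconcile_db(Database())
print(f"web server: {web_steps}")
print(f"database:   {db_steps}")
```

Both functions are idempotent: running them against an already healthy workload returns an empty list of steps, which is the property a reconciliation loop depends on.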

What we’ve seen at recent KubeCons is that effective patterns are emerging. Both Operators and CSI do a lot of work to reduce the complexity of stateful workloads on Kubernetes. They reduce risk, simplify recovery from failure, and decouple components in a smart way.

What’s interesting is that even though Operators and CSI are still astoundingly important developments in the world of Kubernetes, none of the upcoming KubeCon Storage talks are about them. Why?

Opening Pandora’s Blocks

I recently had the good fortune to act as KubeCon + CloudNativeCon North America 2020’s Storage track co-chair. As part of the talk selection process, Saad Ali (my co-chair) and I certainly reviewed plenty of talks focused on Operators and CSI. Many of those talks were quite good, too.

Ultimately, the selection process favored talks of a different sort. Those that prevailed are viewable on the KubeCon schedule, today. They’re talks about upcoming storage drivers such as COSI and PMEM-CSI. Others are about the delicate job of tuning distributed storage, such as Vitess and Ceph. Regardless of the topic, all of these talks share a common thread: Operators and CSI are present, but they’re not the focus.

Operators? Yeah, they’re nice, but they’re an implementation detail. CSI? It’s cool and all, but it’s also just an API. Isn’t it more interesting to see what you can do with that API?
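As a rough illustration of what “just an API” buys you, here is a heavily simplified sketch. The `VolumeDriver` interface and `InMemoryDriver` class are hypothetical; the two methods are loosely modeled on CSI’s `CreateVolume` and `NodePublishVolume` RPCs, but the real specification is a gRPC API with many more calls and fields. The point is only that the caller does not need to know which backend sits behind the interface.

```python
from abc import ABC, abstractmethod


class VolumeDriver(ABC):
    """Hypothetical, simplified stand-in for a CSI-style storage driver."""

    @abstractmethod
    def create_volume(self, name: str, size_gib: int) -> str:
        """Provision a volume and return its ID."""

    @abstractmethod
    def publish_volume(self, volume_id: str, target_path: str) -> None:
        """Make the volume available at a path on the node."""


class InMemoryDriver(VolumeDriver):
    """Toy backend that only records what it was asked to do."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.volumes: dict[str, int] = {}  # volume ID -> size in GiB
        self.mounts: dict[str, str] = {}   # target path -> volume ID

    def create_volume(self, name: str, size_gib: int) -> str:
        volume_id = f"{self.prefix}-{name}"
        # Repeating a request for an existing volume keeps the original.
        self.volumes.setdefault(volume_id, size_gib)
        return volume_id

    def publish_volume(self, volume_id: str, target_path: str) -> None:
        if volume_id not in self.volumes:
            raise KeyError(f"unknown volume: {volume_id}")
        self.mounts[target_path] = volume_id


def provision(driver: VolumeDriver, name: str, size_gib: int, path: str) -> str:
    """Caller-side logic that works with any driver behind the interface."""
    volume_id = driver.create_volume(name, size_gib)
    driver.publish_volume(volume_id, path)
    return volume_id


# Two imaginary backends, one unchanged caller.
block_driver = InMemoryDriver("block")
pmem_driver = InMemoryDriver("pmem")
block_id = provision(block_driver, "db-data", 10, "/var/lib/db")
pmem_id = provision(pmem_driver, "db-cache", 2, "/var/lib/cache")
print(block_id, pmem_id)
```

Swapping one driver for another changes nothing in `provision`, which is why new kinds of storage can be offered to workloads without the workloads changing.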

I think the reason none of the talks specifically focus on Operators or CSI might be that they’re in the process of becoming a “boring” part of the stack. Both Operators and CSI have matured to the point that they’re just another building block in our collective foundation. Considering CSI only went GA with Kubernetes 1.13 at the end of 2018, that’s an impressive and fast journey to maturity.

Beautiful Chaos

Via https://twitter.com/andrew_randall/status/1304054737634045955

The speed of innovation can certainly be exhausting, but it’s often also exciting. With Operators and CSI maturing, space has been made for new possibilities to be explored. Exploration is essential because top-down prescriptive strategies rarely work in the cloud. Polyglot “the team owns the workload” approaches are the norm. A smorgasbord of choice is what fuels innovation in this world.

At MayaData, we talk about this mindset as container-attached storage, or CAS for short. The idea is that every team is building with their own set of tools and accordingly will need different knobs and levers for tuning and managing the workload. Many of the storage talks at KubeCon are about providing exactly those kinds of knobs and levers.

Grab a Blanket and Get COSI

There’s room for both generalized and specialized solutions to flourish. For instance, MinIO is giving a talk arguing that object storage ought to be treated as a first-class storage option for Kubernetes. They’ll explore the Container Object Storage Interface (COSI), which is a proposal for adding native support for object storage. This is a nice addition to the file-and-block options that are already supported by CSI.

In a similar vein, Intel is exploring how Persistent Memory (PMEM) might speed up storage workloads. It’s like memory, it’s like disk storage, and it might just be a great complement for IO-bound workloads. Intel will share its work on the PMEM-CSI driver that makes all of that possible on Kubernetes.

In addition to these storage track talks, I’m excited to see what folks are doing with emerging technology such as the Data Plane Development Kit (DPDK) and the Storage Performance Development Kit (SPDK). Both are toolkits for connecting apps in userspace directly to hardware, bypassing the kernel. I’ve already seen huge performance gains with MayaData’s SPDK implementation, called Mayastor. I’m hoping I’ll get to see what others are doing with the same open source tech in the hallway track.

Stateful’s Future

Stateful workloads on Kubernetes are a bad idea. That’s the phrase we started this article with, and for some organizations that might still be true. That said, a lot has changed!

Innovations such as Operators and CSI have removed a lot of the pain early adopters once experienced. That opens up the power of Kubernetes to the whole application stack, not just the stateless parts.

On Nov. 17, we’ll get to dive into the details and see just how much has changed. Talks about automating and tuning Ceph to work well with Kubernetes promise great practical advice. In contrast, PingCAP’s presentation on how they achieved a 10x speedup in TiKV offers us a brief glimpse of the future.

This article is part of a series of articles from sponsors of KubeCon + CloudNativeCon 2020 North America

Paul Burt

Paul Burt is the Director of Community and Marketing at MayaData. He previously worked at CoreOS and Red Hat, and currently moderates /r/kubernetes. Paul has a knack for demystifying infrastructure and making gnarly, complex topics approachable. He enjoys home brewing beer, reading independent comics, and yelling at his computer when it doesn’t do what he wants.