Six Keys for Platform Teams to Operate Kubernetes at Scale

For organizations looking to modernize and build cloud-native apps, Kubernetes (K8s) has become the orchestration platform of choice. K8s can help you serve customers better, sharpen the competitive edge of your products and services, and speed up digital transformation initiatives through self-service. But building out your development infrastructure with enterprise-level Kubernetes can be complex and challenging, and it requires a significant investment of time and resources. In this article, we will detail some of the critical requirements that make Kubernetes more manageable and help your organization put K8s to work through a platform approach.

Taking the Complexity Out of Enterprise-Grade Kubernetes

While K8s has quickly become the de facto choice for many organizations managing applications in both multi-cloud and hybrid cloud environments, organizations are reporting setbacks when implementing it. K8s can be very complex, and while do-it-yourself (DIY) Kubernetes solutions are a great way for DevOps teams to get started with K8s and adopt the cloud, they can also create significant scalability challenges in terms of time commitments, costs and resources.

That’s why when some companies begin to scale their K8s projects, the projects frequently stall. It turns out there is a significant difference between using Kubernetes in a lab environment and actually being production-ready at enterprise scale.

The good news is that with the right approach, Kubernetes is not only good at scaling—it is excellent at scale. Which brings us to our [drumroll please] six keys for platform teams to enable enterprise-grade K8s operations. These six capabilities can help your organization put Kubernetes to use without fear of growing pains, spiraling complexity or unsustainable resource needs. If you bake these six tips into your central architecture and platform, you’ll be set up for long-term success:

Key Number One: Security and Governance

Security and governance are critical items for every IT organization, Kubernetes or not, so it’s no surprise that this makes the top of our list. But how can you protect your most important infrastructure while still making it easy for your automated systems—not to mention your cloud ops/SREs—to access the clusters that make K8s shine?

A few important security and compliance items to consider:

  • First, start with K8s controller cloaking that will help prevent unchecked user access. Also, make sure your audit trails are ironclad.
  • Second, tie your K8s access to an enterprise-grade single sign-on (SSO) system.
  • Third, enable role-based access control so that no one has more (or fewer) permissions than they need to accomplish their job.
  • Finally, ensure you have immutable audit logs of all user activity (even via kubectl) streamed to your SIEM platform of choice.
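
The "no more (or fewer) permissions than they need" rule from the list above can be checked mechanically. What follows is only a minimal sketch in Python, not a real Kubernetes API call: the `permission_gaps` helper, the "ci-deployer" role and every verb/resource pair are hypothetical examples chosen for illustration.

```python
# Hypothetical illustration: the role, verbs and resources below are
# made-up examples, not read from any real cluster.

def permission_gaps(granted, required):
    """Compare two collections of (verb, resource) pairs.

    Returns (excess, missing): permissions granted but not needed,
    and permissions needed but not granted.
    """
    granted, required = set(granted), set(required)
    return granted - required, required - granted


# What a hypothetical "ci-deployer" role currently allows.
granted = {
    ("get", "deployments"),
    ("update", "deployments"),
    ("delete", "secrets"),  # broader than a deploy pipeline needs
}

# What the deploy pipeline actually has to do.
required = {
    ("get", "deployments"),
    ("update", "deployments"),
    ("list", "pods"),
}

excess, missing = permission_gaps(granted, required)
print("excess:", sorted(excess))
print("missing:", sorted(missing))
```

In practice the granted set would come from your cluster's role definitions and the required set from what each team or pipeline actually does; a non-empty result on either side is a prompt to adjust the role.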

Key Number Two: Fleet Scalability

Managing just a few clusters is a relatively straightforward task—but many organizations fail to properly forecast what will come as their fleet begins to scale. How can your organization plan for the operational needs and risk management associated with a fleet of K8s clusters?

As you scale K8s, it becomes critical to enforce policies and strengthen governance to reduce operational complexity. Platform and operations teams are tasked with building a library of policies, monitoring and validating rules, preventing configuration drift and proving security and compliance. You should also consider the life cycle maintenance of your clusters—basically, what is the time and resource investment required for upgrades, integration, etc. as you scale?

If you set up consistent, repeatable processes for cluster and application operations, it can help make scaling more straightforward and ensure you are not hit with surprises in terms of the time or resources it takes to grow.
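
To make "preventing configuration drift" concrete, here is a small Python sketch that compares each cluster's reported settings against one shared baseline. It is an illustration under stated assumptions, not a description of any particular product: the `find_drift` helper, the cluster names and the setting keys are all hypothetical.

```python
def find_drift(desired, actual):
    """Return {key: (desired_value, actual_value)} for every setting that differs.

    A key present on only one side is reported with None for the absent side.
    """
    drift = {}
    for key in sorted(set(desired) | set(actual)):
        want, have = desired.get(key), actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical baseline that every cluster in the fleet should match.
baseline = {
    "kubernetes_version": "1.29",
    "pod_security": "restricted",
    "audit_logging": True,
}

# Hypothetical settings reported by each cluster.
fleet = {
    "prod-us": {"kubernetes_version": "1.29", "pod_security": "restricted", "audit_logging": True},
    "prod-eu": {"kubernetes_version": "1.27", "pod_security": "restricted", "audit_logging": False},
}

report = {name: find_drift(baseline, settings) for name, settings in fleet.items()}
for name, drift in report.items():
    print(name, "OK" if not drift else drift)
```

Running the same comparison on a schedule for every cluster is one way to turn a policy library into a repeatable process rather than a one-off review.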

Key Number Three: Multi-Cloud Support

It is rare for an enterprise today to be limited to a single cloud provider. In fact, according to Gartner, 81% of enterprises use two or more. So how can organizations facing this level of complexity support workloads across several different cloud providers?

Fortunately, all major cloud providers offer managed K8s services that remove some of the challenges of control plane reliability, upgrades and patching. Tooling that integrates deeply with many of these popular Kubernetes services gives enterprises the freedom to choose whichever cloud service meets their specific needs. Also, working with an enterprise-grade Kubernetes operations solution for Day 2 operations can provide a central source of truth for visibility, no matter where a specific cluster resides, making management, scaling and operations simpler.

Key Number Four: Supporting Integrations

For Kubernetes to scale at an enterprise level, it needs to support the many integrations that modern IT teams rely on, such as security tools (e.g., HashiCorp Vault) and monitoring and analytics tools (e.g., Prometheus, Datadog). Every integration also has to be supported continuously by operations engineers and administrators, which can mean a ton of manual work if not done properly.

To support existing IT pipelines, enterprise-grade Kubernetes needs to offer simple, built-in integrations that give developers access to a wide variety of necessary tools and workflows. You should also allow custom software add-ons, typically for cluster-wide services such as service meshes, ingress controllers and security products.

Key Number Five: Reducing Total Cost of Ownership

Estimating the cost of K8s can be difficult. Kubernetes and many ecosystem tools are open source, which comes with the assumption that they have a zero-dollar price tag. But it is important to look down the line when considering the true total cost of ownership for your Kubernetes management and operations. What will your costs look like in one year to develop, support and upgrade this new orchestration layer? How about in three years? 

Our recommendation is to look three years down the line to better understand what your K8s infrastructure will cost. It is also important to weigh the time and resources you will invest in building your K8s capabilities against what the same investment could achieve if spent on what differentiates your business. How much time will you spend on maintenance, and how much time will it take to scale? Forecasting these hidden cost variables can save you a lot of time and headaches in the future.
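
A three-year forecast can start as simple arithmetic. The Python sketch below is a rough illustration only: the `three_year_tco` helper and every figure in it (headcount, salary, infrastructure spend, growth rate) are hypothetical placeholders, so substitute your own estimates.

```python
def three_year_tco(engineers, cost_per_engineer, infra_per_year,
                   support_per_year, annual_growth=0.0, years=3):
    """Return (yearly_costs, total).

    Year one is the sum of staffing, infrastructure and support costs;
    each later year grows the previous year's total by `annual_growth`.
    """
    base = engineers * cost_per_engineer + infra_per_year + support_per_year
    yearly = [base * (1 + annual_growth) ** year for year in range(years)]
    return yearly, sum(yearly)


# All figures are hypothetical placeholders, not benchmarks.
yearly, total = three_year_tco(
    engineers=2,                # people maintaining the platform
    cost_per_engineer=150_000,  # fully loaded annual cost per person
    infra_per_year=100_000,     # clusters, tooling, cloud spend
    support_per_year=0,         # "free" open source, no paid support
    annual_growth=0.25,         # cost growth as the fleet scales
)

for year, cost in enumerate(yearly, start=1):
    print(f"Year {year}: {cost:,.0f}")
print(f"Three-year total: {total:,.0f}")
```

Even with a zero-dollar software price tag, staffing and growth dominate the total in this sketch, which is the point of looking beyond year one.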

Key Number Six: Support at Enterprise Levels

Community support is a wonderful thing and is one of the essential items that has helped Kubernetes thrive. But hinging your business’s success only on community-based support for Kubernetes can be a massive risk. What if something catastrophic happens to a mission-critical application and you need support right now? When you are talking about business-critical infrastructure, make sure you have support around the clock so you are never in a situation where you need third-party support and can’t get it.

Bringing it All Together

The beauty of Kubernetes is that with many open source resources available, it’s easy for any organization to get started reaping the benefits of K8s. But when it comes to scaling K8s at an enterprise level, the operational requirements and complexity can grow exponentially. That is why many platform teams think about K8s not as a point solution, but as a shared service that developers, QA and operations/SRE teams can leverage to speed deployments and accelerate their modernization journey. By keeping the key platform considerations above in mind, platform teams can be intentional about their company’s growth plans and how those plans will affect its K8s infrastructure. Take the time to forecast how your needs will change (and what investment will be necessary to meet these needs), so Kubernetes can help your team thrive.

Kyle Hunter

Kyle Hunter is Head of Product Marketing at Rafay Systems, a platform provider for Kubernetes Operations. Kyle is a creative product leader with a demonstrated record in messaging and positioning, competitive differentiation, go-to-market strategy, and thought leadership. He has innovative experience leveraging exceptional business acumen and technical expertise to conceptualize and execute strategies driving company and market growth.
