Identifying and Trimming Hidden Container Costs

The majority of Kubernetes workloads underutilize CPU and memory, according to the Datadog report “11 Facts About Real-World Container Use.” While it’s important to leave headroom for fluctuation, many Kubernetes deployments may carry more waste than necessary. Such waste could be a contributing factor in the $17 billion lost annually on unused or idle cloud resources.

The shift to microservices and containers brings great efficiency benefits and improved deployment fluidity. However, it can also bring a cost analysis nightmare. Tracking performance and utilization, and optimizing expenses, across thousands of unique computing environments can be tricky.


With a desire to trim cloud budgets and meet new environmental pledges, many teams are attempting to optimize their cloud footprint. I recently spoke with the CEO of YotaScale, Asim Razzaq, to learn about trimming costs for cloud-native infrastructure. Below, we’ll discover common cost concerns for containers and Kubernetes and consider some cultural pivots organizations should make to embrace a lean infrastructure.

Cloud-Native Cost Concerns

Razzaq, previously head of infrastructure at PayPal, admits he has a “love-hate relationship with cloud computing.” On one hand, cloud computing facilitates unparalleled agility. However, it introduces a new headache for engineering leadership: tracking hidden costs. With shared cloud architecture, it’s difficult for financial teams to track where spending originates.

Within containerized ecosystems, there’s often a big gap between reserved computing and actual utilization. “Developers tend to overestimate the required resources (number of instances, memory amount or CPU power) to avoid not having enough resources during runtime,” according to the YotaScale blog. As evidence, a recent Datadog container report found nearly half of production containers use less than 30% of the requested CPU and memory capacity.
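To make the gap between reservation and usage concrete, here is a minimal Python sketch that flags containers using less than 30% of their requested CPU. The `ContainerSample` type, the service names and the millicore figures are all hypothetical; in practice the numbers would come from whatever metrics source a team already has.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContainerSample:
    """One observation of a container (all values here are made up)."""
    name: str
    cpu_requested_millicores: float
    cpu_used_millicores: float


def utilization(sample: ContainerSample) -> Optional[float]:
    """Return used/requested CPU, or None when no request is set."""
    if sample.cpu_requested_millicores <= 0:
        return None
    return sample.cpu_used_millicores / sample.cpu_requested_millicores


def underutilized(samples: List[ContainerSample], threshold: float = 0.30) -> List[str]:
    """Names of containers whose CPU utilization is below the threshold."""
    flagged = []
    for sample in samples:
        ratio = utilization(sample)
        if ratio is not None and ratio < threshold:
            flagged.append(sample.name)
    return flagged


# Hypothetical sample data for illustration only.
samples = [
    ContainerSample("checkout", 1000, 150),  # 15% of request
    ContainerSample("login", 500, 400),      # 80% of request
    ContainerSample("logging", 250, 50),     # 20% of request
]

flagged = underutilized(samples)
print(flagged)
```

The 30% threshold mirrors the figure cited above; the same ratio could be computed for memory, and a real check would likely look at sustained usage over time rather than a single sample.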

Razzaq describes the waste that comes with the shift from macro to micro as “micro-waste.” In isolation, these costs are nominal, but when they accumulate, they become significant. Furthermore, tracking CPU utilization and memory capacity for thousands of compute nodes is far more complicated than managing a single monolithic environment.
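A rough back-of-the-envelope calculation shows how micro-waste adds up. The Python sketch below is illustrative only: the price per core-hour, the 730-hour month and the quarter-core of unused reservation are assumptions chosen for the example, not figures from Razzaq or any cloud provider.

```python
HOURS_PER_MONTH = 730  # assumed average month length in hours


def monthly_waste(unused_cores_per_container: float,
                  container_count: int,
                  price_per_core_hour: float,
                  hours: int = HOURS_PER_MONTH) -> float:
    """Estimated monthly spend on reserved-but-unused CPU."""
    return unused_cores_per_container * container_count * price_per_core_hour * hours


# Hypothetical inputs: each container reserves 0.25 cores it never uses,
# and a reserved core is assumed to cost $0.04 per hour.
one_container = monthly_waste(0.25, 1, 0.04)
fleet = monthly_waste(0.25, 2000, 0.04)

print(f"one container: ${one_container:,.2f} per month")
print(f"2,000 containers: ${fleet:,.2f} per month")
```

Under these assumed numbers, a single container's waste is only a few dollars a month, while the same pattern repeated across a couple thousand containers reaches five figures.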

Which pod belongs to which service? Poor namespace labeling can create confusion, as well. IT often juggles many containerized services for analytics, login, payments, logging, encryption and other functions. Lousy hygiene in tagging these services could reduce transparency, making it challenging to correlate use and context, potentially obfuscating showback and chargeback accounting models.
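The effect of tagging hygiene on showback can be sketched in a few lines of Python. In this example, the `team` label key, the workload names and the monthly cost figures are hypothetical; the point is only that any workload missing an ownership label ends up in an "unattributed" bucket that nobody is accountable for.

```python
from collections import defaultdict
from typing import Dict, List

UNATTRIBUTED = "unattributed"


def showback(workloads: List[dict], label_key: str = "team") -> Dict[str, float]:
    """Sum monthly cost per owner, based on an ownership label."""
    totals: Dict[str, float] = defaultdict(float)
    for workload in workloads:
        # A missing or empty label falls into the unattributed bucket.
        owner = workload.get("labels", {}).get(label_key) or UNATTRIBUTED
        totals[owner] += workload["monthly_cost"]
    return dict(totals)


# Hypothetical workloads and costs for illustration only.
workloads = [
    {"name": "payments-api", "labels": {"team": "payments"}, "monthly_cost": 1200.0},
    {"name": "login-svc", "labels": {"team": "identity"}, "monthly_cost": 800.0},
    {"name": "log-shipper", "labels": {}, "monthly_cost": 450.0},
    {"name": "payments-worker", "labels": {"team": "payments"}, "monthly_cost": 300.0},
]

report = showback(workloads)
for owner, cost in sorted(report.items()):
    print(f"{owner}: ${cost:,.2f}")
```

Tracking the size of the unattributed bucket over time is one simple way to measure whether labeling practices are improving.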

Responding to Hidden Container Costs

Containers are modular by nature, and “it’s easy to feel that you are packing things in the most robust way,” Razzaq says. Yet, there’s still a lot of micro-waste. This could be in the form of underutilized capacity or a forgotten cluster with zero resource utilization left up and running.

Teams are also encountering hidden ancillary costs associated with Kubernetes management. For example, Ops teams have uncovered nasty hidden costs associated with SaaS used to monitor K8s.

To combat rising container costs, Razzaq encourages holding developers accountable for reserving resources and overseeing utilization. He advocates empowering developers themselves with self-service capabilities to observe per-product and per-service consumption. “It puts control back in the hands of developers on what to reserve or what not to reserve,” says Razzaq.

Others agree that increased observability is a crucial means of reducing the costs associated with Kubernetes.

Cultural Container Factors

There are also cultural elements to consider when implementing cost management practices for container ecosystems. For one, Razzaq discourages a top-down approach to cutting costs. “A command-and-control ‘central group using a stick approach’ doesn’t fit very well with modern engineering culture,” he says. Instead of relying on external financial stakeholders, empower developers themselves with financial insights.

“There is a myth [that] engineering teams don’t care about cost,” Razzaq says. The reality may be they simply don’t see cost. As the DevOps approach naturally empowers developers with new capabilities, such as shifting security left with automated container auditing, it follows that cost analysis will move left as well.

IT departments that have taken unlimited CPU for granted will especially require a cultural shift. “The perception of infinite capacity can lead to the death of a business,” Razzaq cautions. He recalls the days of programming software to ship with tight constraints, like 64MB RAM. When resources are constrained, engineers get more creative in designing the most efficient processes. There is also a business incentive: “If you can develop systems that are economically efficient, you have a leg up on other talent,” he notes.

Another cultural reason to reduce waste is the environmental impact. It’s estimated that data centers are responsible for about one percent of worldwide energy consumption. Thus, sustainable computing has an ethical cause — to trim carbon emissions. “If you can improve utilization and efficiency, it’s better for the planet,” says Razzaq.

Optimizing Containers and Kubernetes

Containers and container orchestration have brought phenomenal benefits to software deployment. Yet, as load fluctuates, weighing reservation and utilization is a constant balancing act.

“Mastering Kubernetes is in no way, shape or form easy. … Failing to manage it correctly impacts both the cost of running the app and its performance and reliability,” writes Steven J. Vaughan-Nichols for Cloud Watch.

For container-based deployments, improper utilization and attribution could quickly increase overall platform cost, and a lack of transparency could obfuscate expenses, Razzaq says. To make finances more transparent, he encourages having engineers on the ground identify and trim idle resources. “Optimize for speed even when you don’t have to,” he recommends.

For large organizations, contextualizing Kubernetes and nodes based on team ownership is essential to realize a showback model. Repositioning cloud spend from idle instances to suffocating services could help balance the total infrastructure portfolio.

As more and more organizations master Kubernetes, optimizing Kubernetes cost is the next logical step. To do this, be strategic, think about engineering empowerment and consider sustainability. “If IT can instill a culture of accountability, self-service and autonomy, you’ll have a win-win situation,” says Razzaq.

Bill Doerrfeld

Bill Doerrfeld is a tech journalist and analyst. His beat is cloud technologies, specifically the web API economy. He began researching APIs as an Associate Editor at ProgrammableWeb, and since 2015 has been the Editor at Nordic APIs, a high-impact blog on API strategy for providers. He loves discovering new trends, interviewing key contributors, and researching new technology. He also gets out into the world to speak occasionally.
