Survey: Few IT Teams Can Continuously Optimize Kubernetes Clusters
A survey of 321 Kubernetes practitioners at organizations with more than 1,000 employees, published today, finds that while 89% recognize automation is crucial, only 17% are able to continuously optimize the infrastructure on which their clusters are deployed.
Conducted by CloudBolt Software, a provider of a platform for optimizing the management of IT environments, the survey finds that 59% of IT professionals working in larger enterprise environments are able to deploy to production automatically. However, 71% require human review before applying any type of resource optimization.
CloudBolt COO Yasmin Rajabi said the survey suggests that many IT teams are still hesitant to fully optimize the allocation of processor and memory resources, mainly because of past experiences with manual processes. However, as the size and number of Kubernetes clusters that need to be managed continue to increase, more IT teams will need to rely on automation, she added. In fact, more than two-thirds of respondents (69%) report that manual optimization of cluster infrastructure breaks down before reaching roughly 250 changes per day.
The challenge, of course, is that many IT teams are concerned about their ability to fix any issues that might arise if they rely more heavily on automation platforms. In fact, nearly half of respondents (48%) said greater visibility and transparency would do the most to increase their trust in automation. A quarter (25%) cited proven guardrails, followed closely by instant rollback capabilities (23%).
In total, 54% of respondents work for organizations that have more than 100 Kubernetes clusters. However, the number of clusters may not reflect the true scope of the complexity challenge, said Rajabi. Many larger clusters could be running thousands of workloads that are all contending for the same CPU and memory resources, she added.
Ultimately, most IT teams are going to rely more on automation if for no other reason than the cost of doing nothing is too high, said Rajabi. Underutilized processor and memory resources have become a significant total cost of ownership (TCO) problem. Resolving that issue will ultimately require a level of trust in automation that extends beyond the initial deployment of a workload, she noted.
It’s not clear to what degree generative artificial intelligence (AI) might one day be applied to the management of Kubernetes clusters. CloudBolt, for example, uses generative AI to provide a chat interface, but the underlying automation is based on predictive machine learning algorithms that generate more consistent, reliable outputs for automating IT operations, said Rajabi.
Regardless of the approach, the types of workloads being deployed on Kubernetes clusters are becoming more diverse. There is a clear need to rely more on automation to manage those application environments, but cultural transitions take time. Many IT professionals are hesitant to make changes to complex platforms such as Kubernetes because, once a cluster has been successfully deployed, they don’t want to touch something that, however suboptimal, at least has the benefit of still running. Alas, when it comes to expectations today, simply making sure a Kubernetes cluster is available is no longer good enough.


