Komodor Launches Extensible Multi-Agent Architecture for AI-Driven Site Reliability Engineering
Komodor announced a new extensibility framework that transforms its Klaudia AI technology into a universal multi-agent platform for troubleshooting and optimizing complex cloud-native infrastructures and applications.
The new multi-agent orchestration capabilities enable teams to automate investigation and remediation of operational issues across infrastructure layers including Kubernetes, GPUs, networking and storage. Organizations can extend Klaudia AI with their own tools, services and agents via the Model Context Protocol (MCP) or an OpenAPI specification, combining them with more than 50 specialized agents already provided by Komodor.
The platform uses workflow agents that coordinate key reliability engineering processes including detection, investigation and remediation. These workflow agents dynamically invoke specialized Subject Matter Expert Agents for deep expertise in specific technologies or domains, retrieving precise context to avoid hallucinations and data overload.
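The division of labor described above, in which a workflow agent decides which specialists to consult and then gathers their findings, can be illustrated with a minimal Python sketch. It is not Komodor's implementation or API. Every name in it (Finding, InvestigationWorkflow, kubernetes_expert, storage_expert) is hypothetical, and the keyword-matching selection rule is a stand-in assumption for whatever logic the real platform uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    """One specialist's contribution to an investigation (hypothetical type)."""
    domain: str
    summary: str


# A specialist is modeled here as a plain function from an incident
# description to a Finding. Real agents would query live systems.
Specialist = Callable[[str], Finding]


def kubernetes_expert(incident: str) -> Finding:
    # Placeholder logic: a real agent would inspect cluster state.
    return Finding("kubernetes", f"checked pod status for: {incident}")


def storage_expert(incident: str) -> Finding:
    # Placeholder logic: a real agent would inspect volumes and claims.
    return Finding("storage", f"checked volume health for: {incident}")


class InvestigationWorkflow:
    """Hypothetical workflow agent that dispatches to registered specialists."""

    def __init__(self, specialists: Dict[str, Specialist]) -> None:
        self.specialists = specialists

    def select(self, incident: str) -> List[str]:
        # Assumption: pick specialists whose domain name appears in the
        # incident text. This keeps each specialist's context narrow.
        text = incident.lower()
        return [name for name in self.specialists if name in text]

    def run(self, incident: str) -> List[Finding]:
        # Invoke only the selected specialists and collect their findings.
        return [self.specialists[name](incident) for name in self.select(incident)]


if __name__ == "__main__":
    workflow = InvestigationWorkflow(
        {"kubernetes": kubernetes_expert, "storage": storage_expert}
    )
    for finding in workflow.run("Kubernetes pod stuck pending on storage claim"):
        print(finding.domain, "->", finding.summary)
```

Registering specialists in a dictionary keeps the workflow open to extension: adding a networking or GPU specialist in this sketch means adding one more entry, without changing the workflow class.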
“Most AI tools for operations focus on summarizing telemetry rather than resolving incidents, but complex outages require specialists from multiple domains working together to understand what’s happening across the stack,” said Itiel Shwartz, co-founder and CTO of Komodor. “The Komodor platform’s new extensible architecture replicates this collaborative process using specialized agents that encode operational knowledge and work together to diagnose and resolve issues.”
The multi-agent framework for Klaudia AI is available immediately from Komodor and its business partners worldwide.


