KubeCon + CloudNativeCon North America 2025
Devtron Adds AI Agents to SRE Platform for Kubernetes Environments
Devtron today revealed it has added artificial intelligence (AI) agents to its open source platform for automating site reliability engineering (SRE) workflows across Kubernetes environments. Announced at the KubeCon + CloudNativeCon North ...
GPU Resource Management for Kubernetes Workloads: From Monolithic Allocation to Intelligent Sharing
AI and ML workloads in Kubernetes are evolving fast, but traditional GPU allocation leads to massive waste and inefficiency. Learn how intelligent GPU allocation, leveraging technologies like MIG, MPS, and time-slicing, enables smarter, ...
Ashfaq Munshi | | AI infrastructure optimization, AI workload orchestration, AI/ML GPU efficiency, GPU cost efficiency, GPU efficiency in AI workloads, GPU overprovisioning, GPU partitioning technologies, GPU resource allocation strategies, GPU resource management, GPU sharing in Kubernetes, GPU time-slicing, GPU utilization optimization, GPU workload rightsizing, intelligent GPU allocation, Kubernetes AI workloads, Kubernetes GPU performance, Kubernetes GPU scheduling, multi-instance GPU, multi-process service, NVIDIA MIG, NVIDIA MPS
Why Agentic SREs Require Active Telemetry in Kubernetes
Discover how Active Telemetry enables Agentic SREs to move from reactive firefighting to autonomous diagnosis and proactive reliability in Kubernetes ...
Tucker Callaway | | Active Telemetry, Active Telemetry pipeline, Agentic SRE, AI infrastructure, AI observability, AI-driven SRE, autonomous diagnosis, autonomous operations, cloud native operations, context engineering, data context, intelligent observability, KubeCon 2025, Kubernetes reliability, MTTR reduction, operational autonomy, proactive remediation, root cause analysis, site reliability engineering, telemetry architecture
How Distroless Containers Defend Against npm Malware Attacks
The npm breach shows why distroless containers matter. Learn how minimal, continuously rebuilt images strengthen cloud-native supply-chain security ...
Dhanush V M | | CleanStart, cloud native security, container hardening, container security, DevSecOps, distroless best practices, distroless containers, KubeCon 2025, Kubernetes security, malware prevention, minimal container images, npm attack, open source security, phishing attack, SBOM, secure build pipelines, secure software delivery, SLSA compliance, software supply chain security, vulnerability remediation
Why Traditional Kubernetes Security Falls Short for AI Workloads
AI workloads on Kubernetes bring new security risks. Learn five principles, including zero trust, observability, and policy-as-code, to protect distributed AI pipelines ...
Ratan Tipirneni | | AI infrastructure, AI security, AI Workloads, cloud native AI, cloud native security, container security, data protection, DevSecOps, edge AI, GPU workloads, KubeCon 2025, kubernetes, Kubernetes observability, Kubernetes security, microsegmentation, multi-cluster security, policy as code, runtime protection, Spectro Cloud report, zero-trust
When “Healthy” Isn’t Healthy: Rethinking Kubernetes Health Checks for Real-World Systems
Kubernetes health checks often miss real issues. Learn how to design smarter, context-aware probes that reflect true application health and prevent downtime ...
Nick Taylor | | application state, cloud-native reliability, cluster health, context-aware health, devops best practices, distributed systems, KubeCon 2025, kubernetes, Kubernetes health checks, Kubernetes monitoring, Kubernetes troubleshooting, liveness probes, readiness probes, self-healing systems, startup probes
It Worked Last Tuesday: What Operators Teach Us About Platform Reality
Infrastructure as code defined the cloud era, but Kubernetes operators are redefining how DevOps keeps systems reliable. Instead of “apply and hope,” operators continuously reconcile reality with intent: automating change, reducing ...
Avery Pennarun | | Atlanta, automation, CI/CD, cloud infrastructure, cloud native, cloud operations, CloudNativeCon 2025, cluster management, configuration management, continuous delivery, control loops, declarative infrastructure, DevOps automation, DevOps culture, GitOps, IaC, infrastructure as code, intent-based automation, KubeCon 2025, kubernetes, kubernetes best practices, Kubernetes controller, Kubernetes operators, Kubernetes reconciliation loop, microservices, observability, operational excellence, operator pattern, platform engineering, platform stability, reconciliation, resilience engineering, self-healing systems, service reliability, SRE

