Deploying Docker AI Agents on OCI and OKE
AI agents are no longer experimental sidecars in the development pipeline — they are becoming first-class production workloads. As enterprises adopt agentic architectures, platform engineers face a concrete challenge: How do you containerize, secure, orchestrate and scale AI agents on enterprise-grade cloud infrastructure?
Oracle Cloud Infrastructure (OCI) and Oracle Kubernetes Engine (OKE) provide a compelling, integrated platform for running Docker-based AI agents at scale. This guide covers the full deployment life cycle, including containerizing an AI agent with Docker, pushing to OCI Container Registry (OCIR), deploying onto OKE and wiring in OCI Generative AI and OCI Vault for secure, production-ready agentic workloads.
The OCI + OKE Architecture for AI Agents
OKE is Oracle’s managed Kubernetes service, providing scalability, security and native integration across the OCI ecosystem, including OCI Generative AI, OCI Vault, OCI Container Registry and OCI IAM. Running Docker AI agents on OKE means you get the portability of containers combined with the managed operational layer of Kubernetes, without needing to provision or maintain the underlying control plane.
The canonical architecture for Docker AI agents on OKE follows this topology:
                User / API Client
                        │
                        ▼
       Kubernetes Service (LoadBalancer)
                        │
                        ▼
        AI Agent Pod (Docker Container)
                │               │
                ▼               ▼
      OCI Generative AI     OCI Vault
      (LLM Inference)       (API Key Storage)
                │
                ▼
   External Tools / MCP Servers / RAG Pipeline
The AI agent container handles orchestration logic — prompt construction, tool invocation, memory management and response formatting. OCI Generative AI provides the LLM inference endpoint via an OpenAI-compatible API, while OCI Vault manages secrets, so credentials never appear in container images or Kubernetes manifests.
Why OKE for AI Agent Workloads
OKE provides several capabilities that are specifically advantageous for agentic workloads:
- Virtual Nodes (Serverless Kubernetes): Removes the need to manage worker node pools. AI agent pods spin up on demand on OCI’s serverless compute layer, eliminating idle infrastructure costs for bursty workloads.
- OCI Generative AI Integration: Access to Cohere Command R+, Meta Llama 3 and other foundation models via a zero-data-retention, OpenAI-compatible endpoint — no GPU provisioning required.
- kagent Framework: OKE natively supports kagent, a Kubernetes-native AI agent framework that defines agents as Kubernetes custom resources, enabling GitOps-driven agent deployment and life cycle management.
- OCI Vault for Secrets: Native secret injection into pods eliminates hardcoded API keys.
- MCP Server Support: OKE supports deploying model context protocol (MCP) servers as standalone Kubernetes workloads, enabling standardized tool discovery for AI agents.
Step 1: Build Your Docker AI Agent Image
Start with a minimal, hardened base image. For Python-based AI agents using LangChain, LlamaIndex or a custom agentic loop:
# Dockerfile
FROM python:3.11-slim

# Security: non-root user
RUN groupadd -r agentuser && useradd -r -g agentuser agentuser

WORKDIR /app

# Install dependencies first (layer caching optimization)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY --chown=agentuser:agentuser . .

# Security: read-only filesystem, drop capabilities at runtime
USER agentuser

# Health check for local Docker runs (Kubernetes ignores HEALTHCHECK and
# instead runs the probes defined in the pod spec against the same endpoint)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD python -c "import httpx; httpx.get('http://localhost:8000/health').raise_for_status()"

EXPOSE 8000
CMD ["uvicorn", "agent.main:app", "--host", "0.0.0.0", "--port", "8000"]
Key security hardening rules applicable to all AI agent containers:
- Run as a non-root user — never run as UID 0 in production
- Use --no-cache-dir for pip to minimize image size and reduce CVE surface
- Expose a /health endpoint so Kubernetes' liveness and readiness probes have a target; note that Kubernetes ignores the Docker HEALTHCHECK instruction itself, which is useful only for local docker runs
- Avoid embedding API keys or model endpoints — inject via OCI Vault at runtime
- Use minimal base images (python:3.11-slim over python:3.11)
A sample requirements.txt for an OCI Generative AI agent:
fastapi==0.111.0
uvicorn==0.30.1
langchain==0.2.0
langchain-community==0.2.0
openai==1.30.0 # OCI GenAI uses OpenAI-compatible endpoints
httpx==0.27.0
pydantic==2.7.0
oci==2.126.0 # OCI Python SDK for Vault integration
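The Dockerfile's CMD expects a FastAPI app at agent/main.py. A minimal sketch of that entry point, with the /health and /ready endpoints the Kubernetes probes in Step 4 will hit (the /chat handler body is purely illustrative; the real agent loop plugs in there):

# agent/main.py — minimal sketch of the agent entry point
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ChatRequest(BaseModel):
    message: str


@app.get("/health")
def health() -> dict:
    # Liveness: the process is up
    return {"status": "ok"}


@app.get("/ready")
def ready() -> dict:
    # Readiness: dependency checks (Vault, GenAI endpoint) omitted in this sketch
    return {"status": "ready"}


@app.post("/chat")
def chat(req: ChatRequest) -> dict:
    # Illustrative handler: prompt construction, tool invocation and memory
    # management from the agent loop would replace this echo
    return {"reply": f"echo: {req.message}"}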
Step 2: Push to OCI Container Registry
OCIR is OCI’s managed container registry, integrated with OKE for pull-through authentication via IAM instance principals.
# Authenticate to OCIR
OCI_NAMESPACE=$(oci os ns get --query 'data' --raw-output)
REGION="us-chicago-1"  # Update to your OCI region
docker login ${REGION}.ocir.io \
  -u "${OCI_NAMESPACE}/<your-oci-username>"

# Build for linux/amd64 (required for OKE virtual nodes)
docker build --platform linux/amd64 \
  -t ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:latest .

# Push to OCIR
docker push ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:latest
Platform Note: Always build with --platform linux/amd64 when targeting OKE virtual nodes, even if your local machine is ARM-based (Apple Silicon). OKE virtual nodes run on OCI's x86-64 compute fleet.
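The commands above use the :latest tag for brevity. For production, an immutable tag is safer; a sketch, assuming you are building from a git checkout:

# Tag with the git SHA so deployments are reproducible and rollbacks are trivial
GIT_SHA=$(git rev-parse --short HEAD)
docker tag ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:latest \
  ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:${GIT_SHA}
docker push ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:${GIT_SHA}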
Step 3: Provision OKE and Configure Secrets
Create an OKE Cluster (Terraform)
# main.tf — minimal OKE cluster for AI agent workloads
resource "oci_containerengine_cluster" "ai_agent_cluster" {
  compartment_id     = var.compartment_id
  kubernetes_version = "v1.29.1"
  name               = "ai-agent-cluster"
  vcn_id             = oci_core_vcn.agent_vcn.id

  cluster_pod_network_options {
    cni_type = "OCI_VCN_IP_NATIVE"
  }

  endpoint_config {
    is_public_ip_enabled = false # Private endpoint — route via bastion
  }
}

resource "oci_containerengine_virtual_node_pool" "agent_pool" {
  cluster_id         = oci_containerengine_cluster.ai_agent_cluster.id
  compartment_id     = var.compartment_id
  display_name       = "agent-virtual-nodes"
  kubernetes_version = "v1.29.1"
  size               = 3

  virtual_node_tags {
    defined_tags = {
      "Operations.CostCenter" = "ai-platform"
    }
  }
}
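The VCN resource (oci_core_vcn.agent_vcn) and the OCI provider configuration are assumed to be defined elsewhere in the module. Applying the configuration is standard Terraform:

terraform init
terraform plan -var "compartment_id=<compartment-ocid>"
terraform apply -var "compartment_id=<compartment-ocid>"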
Store OCI Generative AI API Key in OCI Vault
# Create a secret in OCI Vault
oci vault secret create-base64 \
  --compartment-id <compartment-ocid> \
  --vault-id <vault-ocid> \
  --key-id <key-ocid> \
  --secret-name "oci-genai-api-key" \
  --secret-content-content $(echo -n "<your-api-key>" | base64)
Create the OCIR Pull Secret in Kubernetes
# Configure kubectl for OKE
oci ce cluster create-kubeconfig \
  --cluster-id <cluster-ocid> \
  --file ~/.kube/config \
  --region us-chicago-1 \
  --token-version 2.0.0

# Create OCIR pull secret
kubectl create namespace ai-agents
OCI_NAMESPACE=$(oci os ns get --query 'data' --raw-output)
kubectl create secret docker-registry ocir-secret \
  --docker-server=us-chicago-1.ocir.io \
  --docker-username="${OCI_NAMESPACE}/<your-oci-username>" \
  --docker-password='<ocir-auth-token>' \
  --docker-email='<your-email>' \
  --namespace ai-agents
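The deployment in Step 4 also reads the Vault secret's OCID from a Kubernetes secret named agent-secrets. A sketch of creating it, assuming the secret name used above and that the OCID lookup goes through oci vault secret list:

# Look up the OCID of the Vault secret created earlier
SECRET_OCID=$(oci vault secret list \
  --compartment-id <compartment-ocid> \
  --name "oci-genai-api-key" \
  --query 'data[0].id' --raw-output)

# Store it where the deployment's secretKeyRef expects it
kubectl create secret generic agent-secrets \
  --from-literal=vault-secret-id="${SECRET_OCID}" \
  --namespace ai-agents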
Step 4: Deploy the AI Agent to OKE
# ai-agent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-agent
  namespace: ai-agents
  labels:
    app: ai-agent
    version: "1.0"
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ai-agent
  template:
    metadata:
      labels:
        app: ai-agent
    spec:
      serviceAccountName: ai-agent-sa  # Bound to OCI IAM for Vault access
      imagePullSecrets:
        - name: ocir-secret
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: ai-agent
          image: us-chicago-1.ocir.io/<namespace>/ai-agent:latest
          ports:
            - containerPort: 8000
          env:
            - name: OCI_GENAI_ENDPOINT
              value: "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"
            - name: OCI_GENAI_MODEL_ID
              value: "cohere.command-r-plus"
            - name: OCI_VAULT_SECRET_ID
              valueFrom:
                secretKeyRef:
                  name: agent-secrets
                  key: vault-secret-id
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 15
            periodSeconds: 20
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: tmp-volume
              mountPath: /tmp  # Writable /tmp, since the root filesystem is read-only
      volumes:
        - name: tmp-volume
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: ai-agent-service
  namespace: ai-agents
spec:
  selector:
    app: ai-agent
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: LoadBalancer  # OCI provisions a public load balancer automatically
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-agent-hpa
  namespace: ai-agents
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-agent
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Apply the Manifests
kubectl apply -f ai-agent-deployment.yaml
# Verify deployment
kubectl get pods -n ai-agents
kubectl get svc -n ai-agents
kubectl logs -f deployment/ai-agent -n ai-agents
Step 5: Wire in OCI Generative AI
OCI Generative AI exposes an OpenAI-compatible API endpoint, which means standard LangChain or OpenAI SDK integrations work without modification — simply point the base URL at the OCI endpoint.
# agent/oci_genai_client.py
import base64
import os

import oci
from openai import OpenAI


def get_oci_genai_client() -> OpenAI:
    """
    Returns an OpenAI-compatible client pointed at OCI Generative AI.
    The API key is retrieved from OCI Vault at runtime via the OCI Python SDK.
    """
    api_key = _retrieve_vault_secret(
        secret_id=os.environ["OCI_VAULT_SECRET_ID"]
    )
    return OpenAI(
        api_key=api_key,
        base_url=os.environ["OCI_GENAI_ENDPOINT"] + "/20231130",
    )


def _retrieve_vault_secret(secret_id: str) -> str:
    # Note: oci.config.from_file() expects a mounted OCI config file; inside an
    # OKE pod you would typically build an instance-principal or workload-identity
    # signer instead.
    config = oci.config.from_file()
    client = oci.secrets.SecretsClient(config)
    response = client.get_secret_bundle(secret_id=secret_id)
    return base64.b64decode(
        response.data.secret_bundle_content.content
    ).decode("utf-8")


# Usage in the agent loop
client = get_oci_genai_client()
response = client.chat.completions.create(
    model=os.environ["OCI_GENAI_MODEL_ID"],
    messages=[
        {"role": "system", "content": "You are an infrastructure operations assistant."},
        {"role": "user", "content": user_message},
    ],
    temperature=0.2,
    max_tokens=2048,
)
Step 6: Deploy kagent for Kubernetes-Native Agent Orchestration
kagent is a Kubernetes-native AI agent framework. It defines agents as Kubernetes custom resources (via CRDs), enabling GitOps-driven agent life cycle management — agents become deployable, versioned, auditable Kubernetes objects.
# Install kagent on OKE
helm repo add kagent https://kagent-dev.github.io/kagent/helm
helm repo update
helm install kagent kagent/kagent \
  --namespace kagent \
  --create-namespace \
  --set oci.genai.endpoint="${OCI_GENAI_ENDPOINT}" \
  --set oci.vault.secretId="${VAULT_SECRET_ID}"
Define an Agent as a Kubernetes Custom Resource
# infrastructure-agent.yaml
apiVersion: kagent.dev/v1alpha1
kind: Agent
metadata:
  name: infrastructure-remediation-agent
  namespace: ai-agents
spec:
  modelConfig:
    provider: oci-genai
    model: cohere.command-r-plus
  systemPrompt: |
    You are an SRE operations agent with read-only access to Kubernetes cluster state.
    You diagnose incidents and propose remediation steps as Kubernetes manifests.
    You never apply changes directly — always submit a pull request.
  tools:
    - name: kubectl-readonly
      type: kubernetes
      permissions: read-only
    - name: prometheus-query
      type: http
      url: "http://prometheus.monitoring.svc.cluster.local:9090"
  rbacPolicy:
    allowedNamespaces:
      - frontend
      - api-gateway
    deniedNamespaces:
      - payments
      - secrets-management
kubectl apply -f infrastructure-agent.yaml
kubectl get agents -n ai-agents
Security Architecture: Zero-Trust for AI Agents on OKE
AI agent pods require a specific zero-trust posture. Apply these controls at all layers:
| Layer | Control | Implementation |
|-------|---------|----------------|
| Container | Non-root execution, read-only filesystem | securityContext in pod spec |
| Kubernetes RBAC | Least-privilege service accounts | Namespace-scoped roles only |
| Network | Deny all ingress/egress by default | Kubernetes NetworkPolicy + OCI Security Lists |
| Secrets | No credentials in manifests or images | OCI Vault + Kubernetes External Secrets Operator |
| Image Supply Chain | Signed images only | OCI Container Registry + Cosign verification |
| Policy Enforcement | Admission control | Kyverno policies on all agent namespaces |
| Sandboxing | Syscall-level isolation for untrusted agent code | Kata Containers on OKE for multitenant scenarios |
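The network layer's deny-by-default posture is plain Kubernetes NetworkPolicy. A minimal sketch, assuming the agent pods only need DNS resolution plus outbound HTTPS to the OCI Generative AI and Vault endpoints:

# network-policy.yaml — deny everything, then allow only what agents need
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: ai-agents
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-agent-egress
  namespace: ai-agents
spec:
  podSelector:
    matchLabels:
      app: ai-agent
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: UDP
          port: 53   # DNS
    - ports:
        - protocol: TCP
          port: 443  # HTTPS to OCI Generative AI and Vault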
Kyverno Policy: Block Privileged AI Agent Containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-ai-agent-privileges
spec:
  validationFailureAction: Enforce
  rules:
    - name: no-privileged-containers
      match:
        resources:
          kinds:
            - Pod
          namespaces:
            - ai-agents
      validate:
        message: "AI agent containers must not run as privileged or root."
        pattern:
          spec:
            containers:
              - securityContext:
                  runAsNonRoot: true
                  allowPrivilegeEscalation: false
KEDA-Based Event-Driven Autoscaling for AI Agents
Standard HPA scales on CPU, which is insufficient for AI agent workloads that are queue-depth-driven. Kubernetes Event-Driven Autoscaling (KEDA) scales agent replicas based on actual work queue depth. Note that KEDA creates and manages its own HPA for the scale target, so remove the CPU-based HorizontalPodAutoscaler from Step 4 before pointing a ScaledObject at the same deployment.
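KEDA itself is installed first, via its official Helm chart:

# Install KEDA
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace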
# keda-scaledobject.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ai-agent-scaler
  namespace: ai-agents
spec:
  scaleTargetRef:
    name: ai-agent
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
        metricName: agent_queue_depth
        threshold: "5"
        query: sum(agent_pending_requests{namespace="ai-agents"})
With this configuration, agent pods scale down to a single warm replica during idle periods and burst to handle peak inference demand within seconds; set minReplicaCount to 0 if you want true scale-to-zero and no idle compute cost at all.
MCP Server Deployment on OKE
MCP standardizes how AI agents discover and invoke external tools. Deploy an MCP server as a standalone OKE workload to give your agents structured access to APIs, databases and internal services.
# Build and push MCP server image
docker build --platform linux/amd64 -t ${REGION}.ocir.io/${OCI_NAMESPACE}/mcp-server:latest ./mcp-server
docker push ${REGION}.ocir.io/${OCI_NAMESPACE}/mcp-server:latest

# Deploy MCP server
kubectl apply -f mcp-server/k8s/manifest.yaml -n ai-agents

# Configure AI agent to use MCP endpoint
kubectl set env deployment/ai-agent \
  MCP_SERVER_URL="http://mcp-server.ai-agents.svc.cluster.local:8080" \
  -n ai-agents
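The referenced manifest isn't shown in this guide; a minimal sketch of what mcp-server/k8s/manifest.yaml might contain, assuming an HTTP-transport MCP server listening on 8080 (matching the service DNS name the agent is configured with above):

# mcp-server/k8s/manifest.yaml — hypothetical minimal version
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
  namespace: ai-agents
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      imagePullSecrets:
        - name: ocir-secret
      containers:
        - name: mcp-server
          image: us-chicago-1.ocir.io/<namespace>/mcp-server:latest
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-server
  namespace: ai-agents
spec:
  selector:
    app: mcp-server
  ports:
    - port: 8080
      targetPort: 8080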
Observability: OpenTelemetry for AI Agent Workloads
Standard Kubernetes metrics are insufficient for AI agent observability. Agent workloads require tracing the full reasoning chain — not just HTTP latency.
# agent/telemetry.py
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor


def configure_telemetry():
    provider = TracerProvider()
    exporter = OTLPSpanExporter(
        endpoint="http://otel-collector.monitoring.svc.cluster.local:4317"
    )
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)


tracer = trace.get_tracer("ai-agent")

# Instrument agent tool calls (tool_name, model_id, token_count,
# target_namespace and tool come from the surrounding agent loop)
with tracer.start_as_current_span("agent.tool_call") as span:
    span.set_attribute("agent.tool", tool_name)
    span.set_attribute("agent.model", model_id)
    span.set_attribute("agent.tokens_used", token_count)
    span.set_attribute("agent.namespace", target_namespace)
    result = tool.execute(payload)
Key Metrics to Instrument for AI Agent Workloads:
- agent.inference_latency_ms — LLM call round-trip time
- agent.tool_call_count — Number of external tool invocations per agent run
- agent.token_usage — Input and output token counts (directly correlates to cost)
- agent.queue_depth — Pending agent requests (drives KEDA scaling)
- agent.error_rate — Failed tool calls or hallucinated commands caught by policy
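A sketch of exporting these as Prometheus metrics with the prometheus_client library (an extra dependency not in the requirements.txt above); the metric names mirror the list, with agent_pending_requests matching the KEDA query, and the scrape port is an assumption:

# agent/metrics.py — hypothetical exporter for the metrics above
from prometheus_client import Counter, Gauge, Histogram, start_http_server

INFERENCE_LATENCY = Histogram(
    "agent_inference_latency_ms", "LLM call round-trip time in ms"
)
TOOL_CALLS = Counter(
    "agent_tool_call_count", "External tool invocations per agent run", ["tool"]
)
TOKEN_USAGE = Counter(
    "agent_token_usage", "Token counts by direction", ["direction"]  # input/output
)
QUEUE_DEPTH = Gauge(
    "agent_pending_requests", "Pending agent requests (drives KEDA scaling)"
)
ERRORS = Counter(
    "agent_errors_total", "Failed tool calls or policy-blocked commands"
)

# Expose /metrics for the Prometheus scraper
start_http_server(9100)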
Production Readiness Checklist
Before promoting any Docker AI agent workload to production on OKE, verify the following:
Container Security
- [ ] Non-root user, read-only root filesystem, all Linux capabilities dropped
- [ ] Image signed with Cosign and verified at admission via Kyverno
- [ ] No secrets in Dockerfile, environment variables sourced from OCI Vault
- [ ] Base image scanned for CVEs (Oracle Container Scanner or Grype)
Kubernetes Configuration
- [ ] Resource requests and limits defined on all agent containers
- [ ] Liveness and readiness probes configured
- [ ] Pod Disruption Budget defined for multi-replica deployments
- [ ] Namespace-scoped RBAC — no cluster-admin service accounts
- [ ] NetworkPolicy denying default ingress/egress with explicit allow rules
Scaling and Availability
- [ ] KEDA ScaledObject configured for queue-depth-based autoscaling
- [ ] Minimum two replicas for production agents (anti-affinity rules applied)
- [ ] OKE virtual nodes configured for cost-optimized burst capacity
Observability
- [ ] OpenTelemetry instrumented for reasoning chain tracing
- [ ] Prometheus metrics exported: Token usage, latency, queue depth, error rate
- [ ] Alerting rules configured for agent error rate spikes and runaway token consumption
GitOps
- [ ] All manifests version-controlled in Git
- [ ] ArgoCD application syncing from main branch
- [ ] Branch protection enabled — no direct commits to main
Conclusion
OCI and OKE provide a production-grade, cloud-native foundation for Docker AI agents that covers the full stack: Managed Kubernetes orchestration, serverless compute scaling, integrated LLM inference via OCI Generative AI, secure secrets management through OCI Vault and Kubernetes-native agent frameworks through kagent.
The architecture described in this guide — containerized agent pods, KEDA-based autoscaling, OCI Vault secret injection, Kyverno admission control and OpenTelemetry observability — reflects the same zero-trust, GitOps-anchored patterns required for any production-grade microservice, applied specifically to the constraints of agentic AI workloads.
The platform engineering imperative is clear: AI agents must be treated as first-class infrastructure citizens, not experimental scripts wrapped in containers. The tooling is ready. The patterns are proven. The deployment is yours to build.


