Deploying Docker AI Agents on OCI and OKE
AI agents are no longer experimental sidecars in the development pipeline — they are becoming first-class production workloads. As enterprises adopt agentic architectures, platform engineers face a concrete challenge: How do you containerize, secure, orchestrate and scale AI agents on enterprise-grade cloud infrastructure?
Oracle Cloud Infrastructure (OCI) and Oracle Kubernetes Engine (OKE) provide a compelling, integrated platform for running Docker-based AI agents at scale. This guide covers the full deployment life cycle, including containerizing an AI agent with Docker, pushing to OCI Container Registry (OCIR), deploying onto OKE and wiring in OCI Generative AI and OCI Vault for secure, production-ready agentic workloads.
The OCI + OKE Architecture for AI Agents
OKE is Oracle’s managed Kubernetes service, providing scalability, security and native integration across the OCI ecosystem, including OCI Generative AI, OCI Vault, OCI Container Registry and OCI IAM. Running Docker AI agents on OKE means you get the portability of containers combined with the managed operational layer of Kubernetes, without needing to provision or maintain the underlying control plane.
The canonical architecture for Docker AI agents on OKE follows this topology:
                User / API Client
                        │
                        ▼
       Kubernetes Service (LoadBalancer)
                        │
                        ▼
        AI Agent Pod (Docker Container)
                │               │
                ▼               ▼
      OCI Generative AI     OCI Vault
      (LLM Inference)       (API Key Storage)
                │
                ▼
   External Tools / MCP Servers / RAG Pipeline
The AI agent container handles orchestration logic — prompt construction, tool invocation, memory management and response formatting. OCI Generative AI provides the LLM inference endpoint via an OpenAI-compatible API, while OCI Vault manages secrets, so credentials never appear in container images or Kubernetes manifests.
Why OKE for AI Agent Workloads
OKE provides several capabilities that are specifically advantageous for agentic workloads:
- Virtual Nodes (Serverless Kubernetes): Removes the need to manage worker node pools. AI agent pods spin up on demand on OCI’s serverless compute layer, eliminating idle infrastructure costs for bursty workloads.
- OCI Generative AI Integration: Access to Cohere Command R+, Meta Llama 3 and other foundation models via a zero-data-retention, OpenAI-compatible endpoint — no GPU provisioning required.
- kagent Framework: OKE natively supports kagent, a Kubernetes-native AI agent framework that defines agents as Kubernetes custom resources, enabling GitOps-driven agent deployment and life cycle management.
- OCI Vault for Secrets: Native secret injection into pods eliminates hardcoded API keys.
- MCP Server Support: OKE supports deploying model context protocol (MCP) servers as standalone Kubernetes workloads, enabling standardized tool discovery for AI agents.
Step 1: Build Your Docker AI Agent Image
Start with a minimal, hardened base image. For Python-based AI agents using LangChain, LlamaIndex or a custom agentic loop:
# Dockerfile
FROM python:3.11-slim

# Security: non-root user
RUN groupadd -r agentuser && useradd -r -g agentuser agentuser

WORKDIR /app

# Install dependencies first (layer caching optimization)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY --chown=agentuser:agentuser . .

# Security: read-only filesystem, drop capabilities at runtime
USER agentuser

# Health check for local Docker runs (Kubernetes ignores HEALTHCHECK and
# instead runs the probes defined in the pod spec against the same endpoint)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD python -c "import httpx; httpx.get('http://localhost:8000/health').raise_for_status()"

EXPOSE 8000
CMD ["uvicorn", "agent.main:app", "--host", "0.0.0.0", "--port", "8000"]
Key security hardening rules applicable to all AI agent containers:
- Run as a non-root user — never run as UID 0 in production
- Use --no-cache-dir for pip to minimize image size and reduce CVE surface
- Expose a /health endpoint so Kubernetes' liveness and readiness probes have a target; note that Kubernetes ignores the Docker HEALTHCHECK instruction itself, which is useful only for local docker runs
- Avoid embedding API keys or model endpoints — inject via OCI Vault at runtime
- Use minimal base images (python:3.11-slim over python:3.11)
A sample requirements.txt for an OCI Generative AI agent:
fastapi==0.111.0
uvicorn==0.30.1
langchain==0.2.0
langchain-community==0.2.0
openai==1.30.0 # OCI GenAI uses OpenAI-compatible endpoints
httpx==0.27.0
pydantic==2.7.0
oci==2.126.0 # OCI Python SDK for Vault integration
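The Dockerfile's CMD expects a FastAPI app at agent/main.py. A minimal sketch of that entry point, with the /health and /ready endpoints the Kubernetes probes in Step 4 will hit (the /chat handler body is purely illustrative; the real agent loop plugs in there):

# agent/main.py — minimal sketch of the agent entry point
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ChatRequest(BaseModel):
    message: str


@app.get("/health")
def health() -> dict:
    # Liveness: the process is up
    return {"status": "ok"}


@app.get("/ready")
def ready() -> dict:
    # Readiness: dependency checks (Vault, GenAI endpoint) omitted in this sketch
    return {"status": "ready"}


@app.post("/chat")
def chat(req: ChatRequest) -> dict:
    # Illustrative handler: prompt construction, tool invocation and memory
    # management from the agent loop would replace this echo
    return {"reply": f"echo: {req.message}"}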
Step 2: Push to OCI Container Registry
OCIR is OCI’s managed container registry, integrated with OKE for pull-through authentication via IAM instance principals.
# Authenticate to OCIR
OCI_NAMESPACE=$(oci os ns get --query 'data' --raw-output)
REGION="us-chicago-1"  # Update to your OCI region
docker login ${REGION}.ocir.io \
  -u "${OCI_NAMESPACE}/<your-oci-username>"

# Build for linux/amd64 (required for OKE virtual nodes)
docker build --platform linux/amd64 \
  -t ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:latest .

# Push to OCIR
docker push ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:latest
Platform Note: Always build with --platform linux/amd64 when targeting OKE virtual nodes, even if your local machine is ARM-based (Apple Silicon). OKE virtual nodes run on OCI's x86-64 compute fleet.
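The commands above use the :latest tag for brevity. For production, an immutable tag is safer; a sketch, assuming you are building from a git checkout:

# Tag with the git SHA so deployments are reproducible and rollbacks are trivial
GIT_SHA=$(git rev-parse --short HEAD)
docker tag ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:latest \
  ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:${GIT_SHA}
docker push ${REGION}.ocir.io/${OCI_NAMESPACE}/ai-agent:${GIT_SHA}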
Step 3: Provision OKE and Configure Secrets
Create an OKE Cluster (Terraform)
# main.tf — minimal OKE cluster for AI agent workloads
resource "oci_containerengine_cluster" "ai_agent_cluster" {
  compartment_id     = var.compartment_id
  kubernetes_version = "v1.29.1"
  name               = "ai-agent-cluster"
  vcn_id             = oci_core_vcn.agent_vcn.id

  cluster_pod_network_options {
    cni_type = "OCI_VCN_IP_NATIVE"
  }

  endpoint_config {
    is_public_ip_enabled = false # Private endpoint — route via bastion
  }
}

resource "oci_containerengine_virtual_node_pool" "agent_pool" {
  cluster_id         = oci_containerengine_cluster.ai_agent_cluster.id
  compartment_id     = var.compartment_id
  display_name       = "agent-virtual-nodes"
  kubernetes_version = "v1.29.1"
  size               = 3

  virtual_node_tags {
    defined_tags = {
      "Operations.CostCenter" = "ai-platform"
    }
  }
}
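The VCN resource (oci_core_vcn.agent_vcn) and the OCI provider configuration are assumed to be defined elsewhere in the module. Applying the configuration is standard Terraform:

terraform init
terraform plan -var "compartment_id=<compartment-ocid>"
terraform apply -var "compartment_id=<compartment-ocid>"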
Store OCI Generative AI API Key in OCI Vault
# Create a secret in OCI Vault
oci vault secret create-base64 \
  --compartment-id <compartment-ocid> \
  --vault-id <vault-ocid> \
  --key-id <key-ocid> \
  --secret-name "oci-genai-api-key" \
  --secret-content-content $(echo -n "<your-api-key>" | base64)
Create the OCIR Pull Secret in Kubernetes
# Configure kubectl for OKE
oci ce cluster create-kubeconfig \
  --cluster-id <cluster-ocid> \
  --file ~/.kube/config \
  --region us-chicago-1 \
  --token-version 2.0.0

# Create OCIR pull secret
kubectl create namespace ai-agents
OCI_NAMESPACE=$(oci os ns get --query 'data' --raw-output)
kubectl create secret docker-registry ocir-secret \
  --docker-server=us-chicago-1.ocir.io \
  --docker-username="${OCI_NAMESPACE}/<your-oci-username>" \
  --docker-password='<ocir-auth-token>' \
  --docker-email='<your-email>' \
  --namespace ai-agents
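The deployment in Step 4 also reads the Vault secret's OCID from a Kubernetes secret named agent-secrets. A sketch of creating it, assuming the secret name used above and that the OCID lookup goes through oci vault secret list:

# Look up the OCID of the Vault secret created earlier
SECRET_OCID=$(oci vault secret list \
  --compartment-id <compartment-ocid> \
  --name "oci-genai-api-key" \
  --query 'data[0].id' --raw-output)

# Store it where the deployment's secretKeyRef expects it
kubectl create secret generic agent-secrets \
  --from-literal=vault-secret-id="${SECRET_OCID}" \
  --namespace ai-agents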
Step 4: Deploy the AI Agent to OKE
# ai-agent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-agent
  namespace: ai-agents
  labels:
    app: ai-agent
    version: "1.0"
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ai-agent
  template:
    metadata:
      labels:
        app: ai-agent
    spec:
      serviceAccountName: ai-agent-sa  # Bound to OCI IAM for Vault access
      imagePullSecrets:
        - name: ocir-secret
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: ai-agent
          image: us-chicago-1.ocir.io/<namespace>/ai-agent:latest
          ports:
            - containerPort: 8000
          env:
            - name: OCI_GENAI_ENDPOINT
              value: "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"
            - name: OCI_GENAI_MODEL_ID
              value: "cohere.command-r-plus"
            - name: OCI_VAULT_SECRET_ID
              valueFrom:
                secretKeyRef:
                  name: agent-secrets
                  key: vault-secret-id
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 15
            periodSeconds: 20
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: tmp-volume
              mountPath: /tmp  # Writable /tmp, since the root filesystem is read-only
      volumes:
        - name: tmp-volume
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: ai-agent-service
  namespace: ai-agents
spec:
  selector:
    app: ai-agent
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: LoadBalancer  # OCI provisions a public load balancer automatically
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-agent-hpa
  namespace: ai-agents
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-agent
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Apply the Manifests
kubectl apply -f ai-agent-deployment.yaml
# Verify deployment
kubectl get pods -n ai-agents
kubectl get svc -n ai-agents
kubectl logs -f deployment/ai-agent -n ai-agents
Step 5: Wire in OCI Generative AI
OCI Generative AI exposes an OpenAI-compatible API endpoint, which means standard LangChain or OpenAI SDK integrations work without modification — simply point the base URL at the OCI endpoint.
# agent/oci_genai_client.py
import base64
import os

import oci
from openai import OpenAI


def get_oci_genai_client() -> OpenAI:
    """
    Returns an OpenAI-compatible client pointed at OCI Generative AI.
    The API key is retrieved from OCI Vault at runtime via the OCI Python SDK.
    """
    api_key = _retrieve_vault_secret(
        secret_id=os.environ["OCI_VAULT_SECRET_ID"]
    )
    return OpenAI(
        api_key=api_key,
        base_url=os.environ["OCI_GENAI_ENDPOINT"] + "/20231130",
    )


def _retrieve_vault_secret(secret_id: str) -> str:
    # Note: oci.config.from_file() expects a mounted OCI config file; inside an
    # OKE pod you would typically build an instance-principal or workload-identity
    # signer instead.
    config = oci.config.from_file()
    client = oci.secrets.SecretsClient(config)
    response = client.get_secret_bundle(secret_id=secret_id)
    return base64.b64decode(
        response.data.secret_bundle_content.content
    ).decode("utf-8")


# Usage in the agent loop
client = get_oci_genai_client()
response = client.chat.completions.create(
    model=os.environ["OCI_GENAI_MODEL_ID"],
    messages=[
        {"role": "system", "content": "You are an infrastructure operations assistant."},
        {"role": "user", "content": user_message},
    ],
    temperature=0.2,
    max_tokens=2048,
)
Step 6: Deploy kagent for Kubernetes-Native Agent Orchestration
kagent is a Kubernetes-native AI agent framework. It defines agents as Kubernetes custom resources (via CRDs), enabling GitOps-driven agent life cycle management — agents become deployable, versioned, auditable Kubernetes objects.
# Install kagent on OKE
helm repo add kagent https://kagent-dev.github.io/kagent/helm
helm repo update
helm install kagent kagent/kagent \
  --namespace kagent \
  --create-namespace \
  --set oci.genai.endpoint="${OCI_GENAI_ENDPOINT}" \
  --set oci.vault.secretId="${VAULT_SECRET_ID}"
Define an Agent as a Kubernetes Custom Resource
# infrastructure-agent.yaml
apiVersion: kagent.dev/v1alpha1
kind: Agent
metadata:
  name: infrastructure-remediation-agent
  namespace: ai-agents
spec:
  modelConfig:
    provider: oci-genai
    model: cohere.command-r-plus
  systemPrompt: |
    You are an SRE operations agent with read-only access to Kubernetes cluster state.
    You diagnose incidents and propose remediation steps as Kubernetes manifests.
    You never apply changes directly — always submit a pull request.
  tools:
    - name: kubectl-readonly
      type: kubernetes
      permissions: read-only
    - name: prometheus-query
      type: http
      url: "http://prometheus.monitoring.svc.cluster.local:9090"
  rbacPolicy:
    allowedNamespaces:
      - frontend
      - api-gateway
    deniedNamespaces:
      - payments
      - secrets-management
kubectl apply -f infrastructure-agent.yaml
kubectl get agents -n ai-agents
Security Architecture: Zero-Trust for AI Agents on OKE
AI agent pods require a specific zero-trust posture. Apply these controls at all layers:
| Layer | Control | Implementation |
|-------|---------|----------------|
| Container | Non-root execution, read-only filesystem | securityContext in pod spec |
| Kubernetes RBAC | Least-privilege service accounts | Namespace-scoped roles only |
| Network | Deny all ingress/egress by default | Kubernetes NetworkPolicy + OCI Security Lists |
| Secrets | No credentials in manifests or images | OCI Vault + Kubernetes External Secrets Operator |
| Image Supply Chain | Signed images only | OCI Container Registry + Cosign verification |
| Policy Enforcement | Admission control | Kyverno policies on all agent namespaces |
| Sandboxing | Syscall-level isolation for untrusted agent code | Kata Containers on OKE for multitenant scenarios |
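The network layer's deny-by-default posture is plain Kubernetes NetworkPolicy. A minimal sketch, assuming the agent pods only need DNS resolution plus outbound HTTPS to the OCI Generative AI and Vault endpoints:

# network-policy.yaml — deny everything, then allow only what agents need
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: ai-agents
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-agent-egress
  namespace: ai-agents
spec:
  podSelector:
    matchLabels:
      app: ai-agent
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: UDP
          port: 53   # DNS
    - ports:
        - protocol: TCP
          port: 443  # HTTPS to OCI Generative AI and Vault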
Kyverno Policy: Block Privileged AI Agent Containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-ai-agent-privileges
spec:
  validationFailureAction: Enforce
  rules:
    - name: no-privileged-containers
      match:
        resources:
          kinds:
            - Pod
          namespaces:
            - ai-agents
      validate:
        message: "AI agent containers must not run as privileged or root."
        pattern:
          spec:
            containers:
              - securityContext:
                  runAsNonRoot: true
                  allowPrivilegeEscalation: false
KEDA-Based Event-Driven Autoscaling for AI Agents
Standard HPA scales on CPU, which is insufficient for AI agent workloads that are queue-depth-driven. Kubernetes Event-Driven Autoscaling (KEDA) scales agent replicas based on actual work queue depth. Note that KEDA creates and manages its own HPA for the scale target, so remove the CPU-based HorizontalPodAutoscaler from Step 4 before pointing a ScaledObject at the same deployment.
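KEDA itself is installed first, via its official Helm chart:

# Install KEDA
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace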
# keda-scaledobject.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ai-agent-scaler
  namespace: ai-agents
spec:
  scaleTargetRef:
    name: ai-agent
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
        metricName: agent_queue_depth
        threshold: "5"
        query: sum(agent_pending_requests{namespace="ai-agents"})
With this configuration, agent pods scale down to a single warm replica during idle periods and burst to handle peak inference demand within seconds; set minReplicaCount to 0 if you want true scale-to-zero and no idle compute cost at all.
MCP Server Deployment on OKE
MCP standardizes how AI agents discover and invoke external tools. Deploy an MCP server as a standalone OKE workload to give your agents structured access to APIs, databases and internal services.
# Build and push MCP server image
docker build --platform linux/amd64 -t ${REGION}.ocir.io/${OCI_NAMESPACE}/mcp-server:latest ./mcp-server
docker push ${REGION}.ocir.io/${OCI_NAMESPACE}/mcp-server:latest

# Deploy MCP server
kubectl apply -f mcp-server/k8s/manifest.yaml -n ai-agents

# Configure AI agent to use MCP endpoint
kubectl set env deployment/ai-agent \
  MCP_SERVER_URL="http://mcp-server.ai-agents.svc.cluster.local:8080" \
  -n ai-agents
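The referenced manifest isn't shown in this guide; a minimal sketch of what mcp-server/k8s/manifest.yaml might contain, assuming an HTTP-transport MCP server listening on 8080 (matching the service DNS name the agent is configured with above):

# mcp-server/k8s/manifest.yaml — hypothetical minimal version
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
  namespace: ai-agents
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      imagePullSecrets:
        - name: ocir-secret
      containers:
        - name: mcp-server
          image: us-chicago-1.ocir.io/<namespace>/mcp-server:latest
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-server
  namespace: ai-agents
spec:
  selector:
    app: mcp-server
  ports:
    - port: 8080
      targetPort: 8080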
Observability: OpenTelemetry for AI Agent Workloads
Standard Kubernetes metrics are insufficient for AI agent observability. Agent workloads require tracing the full reasoning chain — not just HTTP latency.
# agent/telemetry.py
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor


def configure_telemetry():
    provider = TracerProvider()
    exporter = OTLPSpanExporter(
        endpoint="http://otel-collector.monitoring.svc.cluster.local:4317"
    )
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)


tracer = trace.get_tracer("ai-agent")

# Instrument agent tool calls (tool_name, model_id, token_count,
# target_namespace and tool come from the surrounding agent loop)
with tracer.start_as_current_span("agent.tool_call") as span:
    span.set_attribute("agent.tool", tool_name)
    span.set_attribute("agent.model", model_id)
    span.set_attribute("agent.tokens_used", token_count)
    span.set_attribute("agent.namespace", target_namespace)
    result = tool.execute(payload)
Key Metrics to Instrument for AI Agent Workloads:
- agent.inference_latency_ms — LLM call round-trip time
- agent.tool_call_count — Number of external tool invocations per agent run
- agent.token_usage — Input and output token counts (directly correlates to cost)
- agent.queue_depth — Pending agent requests (drives KEDA scaling)
- agent.error_rate — Failed tool calls or hallucinated commands caught by policy
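A sketch of exporting these as Prometheus metrics with the prometheus_client library (an extra dependency not in the requirements.txt above); the metric names mirror the list, with agent_pending_requests matching the KEDA query, and the scrape port is an assumption:

# agent/metrics.py — hypothetical exporter for the metrics above
from prometheus_client import Counter, Gauge, Histogram, start_http_server

INFERENCE_LATENCY = Histogram(
    "agent_inference_latency_ms", "LLM call round-trip time in ms"
)
TOOL_CALLS = Counter(
    "agent_tool_call_count", "External tool invocations per agent run", ["tool"]
)
TOKEN_USAGE = Counter(
    "agent_token_usage", "Token counts by direction", ["direction"]  # input/output
)
QUEUE_DEPTH = Gauge(
    "agent_pending_requests", "Pending agent requests (drives KEDA scaling)"
)
ERRORS = Counter(
    "agent_errors_total", "Failed tool calls or policy-blocked commands"
)

# Expose /metrics for the Prometheus scraper
start_http_server(9100)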
Production Readiness Checklist
Before promoting any Docker AI agent workload to production on OKE, verify the following:
Container Security
- [ ] Non-root user, read-only root filesystem, all Linux capabilities dropped
- [ ] Image signed with Cosign and verified at admission via Kyverno
- [ ] No secrets in Dockerfile, environment variables sourced from OCI Vault
- [ ] Base image scanned for CVEs (Oracle Container Scanner or Grype)
Kubernetes Configuration
- [ ] Resource requests and limits defined on all agent containers
- [ ] Liveness and readiness probes configured
- [ ] Pod Disruption Budget defined for multi-replica deployments
- [ ] Namespace-scoped RBAC — no cluster-admin service accounts
- [ ] NetworkPolicy denying default ingress/egress with explicit allow rules
Scaling and Availability
- [ ] KEDA ScaledObject configured for queue-depth-based autoscaling
- [ ] Minimum two replicas for production agents (anti-affinity rules applied)
- [ ] OKE virtual nodes configured for cost-optimized burst capacity
Observability
- [ ] OpenTelemetry instrumented for reasoning chain tracing
- [ ] Prometheus metrics exported: Token usage, latency, queue depth, error rate
- [ ] Alerting rules configured for agent error rate spikes and runaway token consumption
GitOps
- [ ] All manifests version-controlled in Git
- [ ] ArgoCD application syncing from main branch
- [ ] Branch protection enabled — no direct commits to main
Conclusion
OCI and OKE provide a production-grade, cloud-native foundation for Docker AI agents that covers the full stack: Managed Kubernetes orchestration, serverless compute scaling, integrated LLM inference via OCI Generative AI, secure secrets management through OCI Vault and Kubernetes-native agent frameworks through kagent.
The architecture described in this guide — containerized agent pods, KEDA-based autoscaling, OCI Vault secret injection, Kyverno admission control and OpenTelemetry observability — reflects the same zero-trust, GitOps-anchored patterns required for any production-grade microservice, applied specifically to the constraints of agentic AI workloads.
The platform engineering imperative is clear: AI agents must be treated as first-class infrastructure citizens, not experimental scripts wrapped in containers. The tooling is ready. The patterns are proven. The deployment is yours to build.


