LiteLLM Unveils Agent Platform: K8s-Based Infrastructure for Production AI Agents

Running an AI agent from a local script is straightforward; deploying agents reliably in production, across multiple teams and through restarts, is considerably harder. BerriAI, the creator of the LiteLLM AI Gateway, has now open-sourced the LiteLLM Agent Platform, a purpose-built infrastructure layer designed to handle isolated agent sandboxes and persistent session management.

The platform addresses the critical issue of statefulness in agents. AI agents carry session history, tool call results, and intermediate reasoning across multiple turns. In a standard production environment, if a container crashes or is replaced during a deployment, this session state is lost unless explicitly managed. Furthermore, different teams require distinct runtime environments, tools, and security scopes, making a single shared container impractical for diverse agent workloads.
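To make the statefulness problem concrete, here is a minimal sketch of session continuity, not the platform's actual implementation. The `Turn`, `SessionStore`, `DurableStore`, and `AgentProcess` names are hypothetical, and an in-memory `Map` stands in for a durable store such as Postgres so the example stays self-contained.

```typescript
// Hypothetical types for illustration; not the platform's real schema.
interface Turn {
  role: "user" | "assistant" | "tool";
  content: string;
}

interface SessionStore {
  load(sessionId: string): Turn[];
  append(sessionId: string, turn: Turn): void;
}

// Stand-in for a durable store (e.g. Postgres). It lives outside the agent
// process, so it survives when a process is discarded.
class DurableStore implements SessionStore {
  private rows = new Map<string, Turn[]>();

  load(sessionId: string): Turn[] {
    return [...(this.rows.get(sessionId) ?? [])];
  }

  append(sessionId: string, turn: Turn): void {
    const history = this.rows.get(sessionId) ?? [];
    this.rows.set(sessionId, [...history, turn]);
  }
}

// The agent process holds no session state of its own: every turn is
// written through to the store and history is read back from it.
class AgentProcess {
  private store: SessionStore;
  private sessionId: string;

  constructor(store: SessionStore, sessionId: string) {
    this.store = store;
    this.sessionId = sessionId;
  }

  // Records a turn and returns the number of turns now in the session.
  handle(turn: Turn): number {
    this.store.append(this.sessionId, turn);
    return this.store.load(this.sessionId).length;
  }
}

const store = new DurableStore();
const first = new AgentProcess(store, "session-1");
first.handle({ role: "user", content: "List open PRs" });
first.handle({ role: "tool", content: "3 open PRs" });

// Simulate a container restart: the old process is gone and a fresh one
// attaches to the same session ID.
const replacement = new AgentProcess(store, "session-1");
const turnsAfterRestart = replacement.handle({
  role: "assistant",
  content: "There are 3 open PRs.",
});
console.log(turnsAfterRestart); // 3
```

Because the history is keyed by session ID rather than held in process memory, the replacement process sees the two earlier turns. If `rows` lived inside `AgentProcess` instead, the restart would begin from an empty history.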

The platform provides two infrastructure primitives: per-team/per-context sandboxes, and session continuity across pod restarts and upgrades. It ships with a standalone Next.js dashboard for managing LiteLLM v2 agents, covering chat sessions, agent CRUD operations, and live status monitoring. The codebase is primarily TypeScript (92.8%), uses Postgres as its persistent store, and runs database migrations in init containers so that the database is ready before the application boots.
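The init-container pattern mentioned above can be sketched as a pod spec written as a plain TypeScript object. The `initContainers` and `containers` fields are standard Kubernetes pod fields; the container names, image, and commands are hypothetical placeholders rather than values taken from the project.

```typescript
// Minimal subset of Kubernetes pod fields needed for this sketch.
interface Container {
  name: string;
  image: string;
  command?: string[];
}

interface PodSpec {
  initContainers: Container[];
  containers: Container[];
}

// Hypothetical names, image, and commands; only the field layout is real.
const dashboardPod: PodSpec = {
  initContainers: [
    {
      name: "db-migrate",
      image: "example.com/agent-platform:dev",
      command: ["npm", "run", "migrate"],
    },
  ],
  containers: [
    {
      name: "dashboard",
      image: "example.com/agent-platform:dev",
      command: ["npm", "run", "start"],
    },
  ],
};

// Kubernetes runs init containers one at a time, in order, and starts the
// regular containers only after every init container has exited
// successfully. This helper lists names in that order (with a single app
// container, as here, the order is unambiguous).
function startupOrder(spec: PodSpec): string[] {
  return [...spec.initContainers, ...spec.containers].map((c) => c.name);
}

console.log(startupOrder(dashboardPod)); // [ 'db-migrate', 'dashboard' ]
```

The design benefit is that a failed migration keeps the pod from starting at all, so the application never boots against a database that is missing expected tables.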

The sandbox layer, the isolated runtime environment where agents actually execute, runs on Kubernetes via the kubernetes-sigs/agent-sandbox CRD (Custom Resource Definition). Local development is supported through kind (Kubernetes in Docker), which lets developers spin up a full cluster locally using Docker containers as nodes.
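For a rough sense of what a per-team sandbox object might look like, the sketch below builds a manifest as a plain TypeScript object. The `agents.x-k8s.io/v1alpha1` API version and the `spec.podTemplate` layout are assumptions based on the agent-sandbox project's published examples and should be checked against the CRD actually installed in the cluster. The `team` and `context` labels, the `toResourceName` and `buildSandbox` helpers, and the image are hypothetical.

```typescript
interface SandboxManifest {
  apiVersion: string;
  kind: string;
  metadata: { name: string; labels: Record<string, string> };
  spec: {
    podTemplate: { spec: { containers: { name: string; image: string }[] } };
  };
}

// Hypothetical helper: derive a name from team and context using only
// lowercase alphanumerics and "-", capped at 63 characters, with no
// leading or trailing "-".
function toResourceName(team: string, context: string): string {
  return `${team}-${context}`
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, "-")
    .slice(0, 63)
    .replace(/^-+|-+$/g, "");
}

// Hypothetical builder: one sandbox per (team, context) pair.
function buildSandbox(team: string, context: string, image: string): SandboxManifest {
  return {
    apiVersion: "agents.x-k8s.io/v1alpha1", // assumption; verify against the installed CRD
    kind: "Sandbox",
    metadata: {
      name: toResourceName(team, context),
      labels: { team: team.toLowerCase(), context: context.toLowerCase() }, // hypothetical labels
    },
    spec: {
      podTemplate: {
        spec: { containers: [{ name: "agent", image }] },
      },
    },
  };
}

const sandbox = buildSandbox("Payments", "PR_review", "example.com/agent-runtime:dev");
console.log(sandbox.metadata.name); // payments-pr-review
```

Keying the object name on both team and context gives each workload its own pod template, which is what allows distinct images, tools, and security scopes per team.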

The platform also includes a harness system under harnesses/opencode, configured for running coding agents like Claude Code or OpenAI Codex inside isolated sandboxes with a vault proxy for credential management. Additionally, BerriAI maintains the litellm-agent-runtime, a generic runtime designed to run inside per-session VMs provisioned by a LiteLLM proxy, allowing for deep customization via harness configurations.
