LiteLLM Launches Agent Platform: A K8s-Based Infrastructure for Production

Running AI agents in a local script is straightforward. Running them reliably in production, across teams and restarts, with an isolated environment per context, is a different problem entirely. BerriAI, the company behind the LiteLLM AI Gateway, is now open-sourcing a purpose-built answer to that problem: the LiteLLM Agent Platform, a self-hosted infrastructure layer for running multiple agents in production.

The Core Problems Solved

To see why this matters, consider what happens when agents scale beyond a single process. Agents are inherently stateful: they carry session history, tool call results, and intermediate reasoning across turns. If the container running an agent crashes, restarts, or is replaced during a deployment, that session state is lost unless it is managed explicitly. Moreover, different teams need distinct runtime environments, tools, secrets, and access scopes, which makes shared containers unviable.
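The state-loss problem described above can be illustrated with a small sketch. This is not code from the LiteLLM Agent Platform; every name here (SessionState, SessionStore, InMemoryStore, AgentProcess) is hypothetical, and the in-memory store only stands in for a durable database. The idea shown is simply to checkpoint session state outside the process after each turn, so a replacement process can pick up where the old one stopped.

```typescript
// Hypothetical types for illustration; none of these names come from LiteLLM.
interface SessionState {
  sessionId: string;
  history: string[];     // conversation turns so far
  toolResults: string[]; // outputs of completed tool calls
}

// Minimal store interface. In production this would be backed by a durable
// database rather than process memory.
interface SessionStore {
  save(state: SessionState): void;
  load(sessionId: string): SessionState | undefined;
}

// In-memory stand-in that serializes to JSON, mimicking a round trip
// through external storage.
class InMemoryStore implements SessionStore {
  private rows = new Map<string, string>();

  save(state: SessionState): void {
    this.rows.set(state.sessionId, JSON.stringify(state));
  }

  load(sessionId: string): SessionState | undefined {
    const raw = this.rows.get(sessionId);
    return raw === undefined ? undefined : (JSON.parse(raw) as SessionState);
  }
}

// An agent process that restores state on startup and checkpoints every turn.
class AgentProcess {
  private store: SessionStore;
  private state: SessionState;

  constructor(store: SessionStore, sessionId: string) {
    this.store = store;
    this.state = store.load(sessionId) ?? { sessionId, history: [], toolResults: [] };
  }

  handleTurn(userMessage: string): void {
    this.state.history.push(userMessage);
    this.store.save(this.state); // checkpoint before doing anything else
  }

  turnCount(): number {
    return this.state.history.length;
  }
}

const sessionStore = new InMemoryStore();
const beforeRestart = new AgentProcess(sessionStore, "session-1");
beforeRestart.handleTurn("hello");
beforeRestart.handleTurn("list the files in the repo");

// Simulate a pod restart: a brand-new process object, same external store.
const afterRestart = new AgentProcess(sessionStore, "session-1");
console.log(afterRestart.turnCount());
```

Because the second process reads from the store rather than from memory, it sees both earlier turns. Without the save call in handleTurn, it would start from an empty history.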

The LiteLLM Agent Platform is built around two primitives: sandboxes scoped per team and per context, and session continuity across pod restarts and upgrades.
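As a rough illustration of what "per-team and per-context" scoping could mean in practice, the sketch below keys sandboxes on the (team, context) pair. The Sandbox shape and SandboxRegistry class are assumptions made for this example, not the platform's actual data model.

```typescript
// Hypothetical sandbox record; the real platform's model may differ.
interface Sandbox {
  id: number;
  team: string;
  context: string;
}

class SandboxRegistry {
  private byKey = new Map<string, Sandbox>();
  private nextId = 1;

  // Returns the existing sandbox for this (team, context) pair, or creates one.
  getOrCreate(team: string, context: string): Sandbox {
    // JSON-encoding the pair avoids ambiguous keys such as
    // ("a-b", "c") versus ("a", "b-c") under naive string concatenation.
    const key = JSON.stringify([team, context]);
    let sandbox = this.byKey.get(key);
    if (sandbox === undefined) {
      sandbox = { id: this.nextId++, team, context };
      this.byKey.set(key, sandbox);
    }
    return sandbox;
  }
}

const registry = new SandboxRegistry();
const paymentsA = registry.getOrCreate("payments", "pr-101");
const paymentsB = registry.getOrCreate("payments", "pr-101"); // reused
const searchA = registry.getOrCreate("search", "pr-101");     // separate team

console.log(paymentsA === paymentsB, paymentsA.id !== searchA.id);
```

The same context name under two teams yields two sandboxes, so one team's tools and secrets never share a runtime with another's.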

Architecture and Technical Stack

The platform features a standalone Next.js dashboard for LiteLLM v2 managed agents, covering session chat, agent CRUD, and live status. The codebase is primarily TypeScript (92.8%), with Shell scripts for provisioning, Docker for containerization, and Postgres as the persistent backing store. A schema migration runs as an init container on startup, so the database is ready before the application boots.
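The reason a migration step can safely run on every startup is that it is idempotent: it applies only what has not been applied yet. The sketch below shows that pattern in isolation. It is an assumption about the general technique, not the platform's migration code; FakeDb stands in for Postgres, and the migration names are invented.

```typescript
// In-memory stand-in for a database with a migrations bookkeeping table.
class FakeDb {
  tables = new Set<string>();
  applied = new Set<number>(); // versions already applied
}

interface Migration {
  version: number;
  name: string;
  apply(db: FakeDb): void;
}

// Applies pending migrations in version order and returns the versions it ran.
// Re-running against an up-to-date database is a no-op.
function migrate(db: FakeDb, migrations: Migration[]): number[] {
  const ran: number[] = [];
  const ordered = [...migrations].sort((x, y) => x.version - y.version);
  for (const m of ordered) {
    if (db.applied.has(m.version)) continue; // already applied, skip
    m.apply(db);
    db.applied.add(m.version);
    ran.push(m.version);
  }
  return ran;
}

// Hypothetical migrations, deliberately listed out of order.
const migrations: Migration[] = [
  { version: 2, name: "create_agents", apply: (db) => { db.tables.add("agents"); } },
  { version: 1, name: "create_sessions", apply: (db) => { db.tables.add("sessions"); } },
];

const fakeDb = new FakeDb();
const firstRun = migrate(fakeDb, migrations);  // applies versions 1 then 2
const secondRun = migrate(fakeDb, migrations); // nothing left to apply

console.log(firstRun, secondRun);
```

Running this as an init container means the main application container does not start until the migration process exits successfully, which is standard Kubernetes init-container behavior.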

For isolation, sandboxes run on Kubernetes via the kubernetes-sigs/agent-sandbox CRD. Local development relies on kind (Kubernetes in Docker). The platform also includes a harness system under harnesses/opencode for running coding agents (like Claude Code or OpenAI Codex) in isolated sandboxes with a vault proxy for credential management. Additionally, BerriAI maintains the litellm-agent-runtime repository, a generic coding-agent runtime running inside per-session VMs provisioned by a LiteLLM proxy, customizable via harness configurations.
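To make the CRD-based isolation concrete, here is a sketch that builds a Sandbox custom resource as a plain object. The apiVersion (agents.x-k8s.io/v1alpha1) and the spec.podTemplate layout reflect the upstream agent-sandbox project's published examples as best understood here and should be checked against the CRD version actually installed. The namespace, labels, container name, and image are placeholders invented for this example, and nothing here is taken from the LiteLLM Agent Platform's own manifests.

```typescript
interface SandboxManifest {
  apiVersion: string;
  kind: string;
  metadata: { name: string; namespace: string; labels: Record<string, string> };
  spec: {
    podTemplate: {
      spec: { containers: { name: string; image: string }[] };
    };
  };
}

interface SandboxOptions {
  name: string;
  namespace: string; // hypothetical: one namespace per team
  team: string;
  image: string;     // hypothetical harness image
}

function buildSandboxManifest(opts: SandboxOptions): SandboxManifest {
  return {
    // Assumed group/version; verify against the installed agent-sandbox CRD.
    apiVersion: "agents.x-k8s.io/v1alpha1",
    kind: "Sandbox",
    metadata: {
      name: opts.name,
      namespace: opts.namespace,
      labels: { team: opts.team }, // illustrative label, not a platform convention
    },
    spec: {
      podTemplate: {
        spec: {
          containers: [{ name: "agent", image: opts.image }],
        },
      },
    },
  };
}

const manifest = buildSandboxManifest({
  name: "payments-session-1",
  namespace: "team-payments",
  team: "payments",
  image: "example.invalid/agent-harness:latest",
});

// kubectl accepts JSON as well as YAML, so this output could be applied directly.
console.log(JSON.stringify(manifest, null, 2));
```

On a local kind cluster with the CRD installed, output like this could be piped to kubectl apply -f - to create one sandbox per session.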

[AgentUpdate Depth Analysis] The launch of the LiteLLM Agent Platform marks a pivotal shift in the AI Agent ecosystem from application-level experimentation to cloud-native production reliability. By leveraging Kubernetes and the agent-sandbox CRD, BerriAI addresses the critical enterprise challenges of multi-tenancy, secure tool execution, and state persistence. Unlike proprietary sandbox APIs, this self-hosted K8s approach provides enterprises with full data sovereignty and cost predictability. As AI Agents transition from static chatbots to highly autonomous systems executing complex code, robust, isolated sandbox environments will become as fundamental to AI stacks as Docker containers are to modern web development. LiteLLM is strategically positioning itself not just as an LLM router, but as the essential operating system for enterprise-grade Agent orchestration.
