News

JustPaid: How a 9-Person Startup Leveraged AI Agents to Ship 10 Major Features Monthly

JustPaid: How a 9-Person Startup Leveraged AI Agents to Ship 10 Major Features Monthly

JustPaid, a 9-person startup based in Mountain View, has made waves by effectively replacing its development team with AI. By combining the open-source agent orchestration system OpenClaw with Claude Code, they deployed seven AI agents that handle code writing, review, and quality assurance (QA) around the clock. The result? Ten major features shipped in a single month – a feat that would typically take human engineers a month or more for each feature.

This success story is widely circulating as compelling evidence that the autonomous engineering team is not just a concept, but a reality. However, a crucial detail often overlooked is the underlying cost, which is paramount for anyone aiming to replicate such a system.

The $4,000 Weekly Bill

Initially, when JustPaid's CTO integrated Claude Code and OpenClaw, the weekly expenditure on tokens alone amounted to $4,000, translating to a hefty $16,000 per month.

Through meticulous tuning – which involved switching to smaller, more appropriate models for specific tasks, tightening context windows, and reducing unnecessary agent calls – they managed to bring the monthly cost down to $10,000-$15,000.

This figure remains substantial. However, considering that a mid-level San Francisco engineer's fully loaded monthly cost typically ranges from $15,000 to $20,000, the math can indeed work for AI agents. The critical caveat is the deliberate and rigorous management of token spend. Left unchecked, multi-agent systems can accrue costs at an alarming rate. Our own experience confirms this: agents performing background tasks can compound costs invisibly until the API invoice arrives. A single agent loop making 50 tool calls per task and executing 100 tasks daily can burn through tokens rapidly, with the true financial impact only becoming apparent at month-end.

Understanding OpenClaw's Role

The Wall Street Journal accurately described OpenClaw as the "brain" and Claude Code as the "hands," providing a useful framework.

OpenClaw is, in essence, an open-source agent orchestration system. Its responsibilities include task planning, agent spawning, subagent delegation, and managing file access. Claude Code, on the other hand, performs the actual code execution. It's the synergistic combination of these two components that enabled JustPaid's innovative setup, not either tool in isolation.

This architecture exemplifies the multi-agent pattern: a coordinator model for planning and delegation, specialized agents for execution, and a review layer to validate work before commitment. JustPaid's seven agents each have defined, specialized roles: writer, reviewer, and QA. This structured approach is vital, as single agents attempting to do everything often fail predictably. Specialized agents with clearly defined scopes are less prone to failure and significantly easier to debug when issues do arise.

The Supervision Imperative

As Tatyana Mamut from Wayfound directly stated in the article, agents empowered to make autonomous decisions require continuous supervision.

Her point is critical. While the JustPaid narrative is compelling, it's important to remember this is a 9-person startup where the CTO personally architected the system and maintains intimate knowledge of its operations. He serves as the indispensable supervisor.

In larger organizations, this crucial supervision layer is often absent by default. Agents might access files, write code, send messages, and interact with external APIs without a dedicated human reviewing every action. This lack of oversight is precisely where critical problems can emerge. The Kuse example, also cited in the original article, presents an even more ambitious deployment where AI agents possess their own Slack and Gmail identities, participate in Zoom calls, and proactively initiate work. While impressive, such advanced autonomy inherently introduces a significantly larger attack surface and magnifies the need for robust oversight mechanisms.

↗ Read original source