AI Models Run Simulated Society: Claude Keeps Peace, Grok Triggers Collapse

If you’re worried about artificial intelligence getting so advanced that it eventually traps humanity in some sort of Matrix-like simulation, rest easy. It seems like you’ll be able to see through the facade pretty easily. Researchers at the upstart lab Emergence AI allowed AI models to govern their own simulated world to see what would happen. Turns out we probably shouldn’t hand over governance to the machines just yet.

The project, called Emergence World, basically allowed AI models to play SimCity for a bit. According to Emergence, the simulations put each model in control of simulated towns occupied by 10 AI agents, handing them tools for everything from resource management to voting and giving them the ability to create distinct locations like libraries, town halls, and police stations. Each model was given 15 days to build its world, and the researchers watched how well that world operated.

To start with the good: Claude did not destroy the world. Anthropic’s model (specifically, Claude Sonnet 4.6 for this experiment) was the only one to achieve something like stability. It kept all 10 agents alive and had zero crimes recorded. Although the experiment doesn't seem to define what a crime is, it likely means a violation of the rules established within the simulation. However, the trade-off for that stability was a lack of diversity of thought. Claude’s world saw 58 different proposals for rules and regulations and passed 98% of them, basically rubber-stamping anything that came up for a vote.

Gemini 3 Flash also managed to keep all of its agents alive, despite having the highest level of crime by a long shot. Emergence recorded 683 crimes in the 15-day simulation, and that number was climbing when the cutoff hit, meaning things were likely going to get worse. The lab described Gemini’s world as a “shared hallucination” among the agents, which is probably better than diverging hallucinations. At least it’s still an agreed-upon reality, even if it’s wrong. Gemini had the most dissent in its governance, with voters rejecting 27% of its 26 total proposals.

Now for the ugly: OpenAI’s GPT-5 Mini didn’t have much chaos within its simulation, with just two total recorded crimes. That might be because everyone died, though. Emergence found that the agents within the world failed to take actions related to survival, and all 10 perished within just one week. In OpenAI’s world, there were also only two total proposed pieces of governance, so the agents really did not bother doing anything.

And then there is Grok. The model from xAI, known for lacking guardrails, managed to achieve basically the worst of all worlds. Grok 4.1 Fast had a high crime rate, with 183 crimes total. While that is lower than Gemini’s total, it’s worth noting that the Gemini simulation ran for 15 days, whereas Grok made it only four. The model's society collapsed completely after just 96 hours under its governance. During that short-lived run, it passed 80% of its initial proposals before everything fell apart.
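To put the Gemini and Grok totals on the same footing, here is a back-of-the-envelope sketch that divides each reported crime count by the length of its run. The crime totals and day counts come from the figures above; the assumptions are that Grok's run lasted four full days and that a simple average is a fair comparison. Emergence did not publish this calculation, and the function name is just illustrative.

```python
# Reported totals from the article: (crimes, days the simulation ran).
# Assumption: Grok's run is treated as four full days (96 hours).
reported = {
    "Gemini 3 Flash": (683, 15),
    "Grok 4.1 Fast": (183, 4),
}


def crimes_per_day(total_crimes: int, days: int) -> float:
    """Average crimes per simulated day (ignores any trend within the run)."""
    return total_crimes / days


rates = {model: crimes_per_day(c, d) for model, (c, d) in reported.items()}

for model, rate in rates.items():
    print(f"{model}: {rate:.2f} crimes per simulated day")
```

By this rough measure the two worlds were about equally lawless day to day, at around 45 to 46 crimes per day each. An average hides the trend, though: Gemini's count was reportedly still climbing at the cutoff, so its later days were likely worse than its early ones.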

[AgentUpdate Depth Analysis] The Emergence World simulation offers a useful diagnostic of the limitations of current LLMs when deployed in Multi-Agent Systems (MAS) requiring long-term planning, multi-step game theory, and rule-based governance. The contrasting behaviors (Claude's over-compliant stagnation, Gemini's chaotic 'shared hallucination,' GPT's neglect of survival, and Grok's rapid collapse) point to a shared weakness: current LLMs struggle to maintain a dynamic equilibrium without rigid external environments. As the AI Agent ecosystem moves toward swarm intelligence, this experiment suggests that relying solely on LLMs for macro-level governance is highly unstable. Future MAS architectures will likely need to incorporate hierarchical Constitutional AI and decentralized physical/economic constraints rather than purely textual prompts. For developers building autonomous multi-agent workflows, the takeaway is that guardrails and survival-based utility functions are essential for system stability.
