Elon Musk’s xAI Secretly Trained Coding Models on Claude Outputs Before Ban

Elon Musk's AI startup, xAI, reportedly spent months distilling Anthropic's Claude outputs to train its proprietary coding models, according to a report by The Information. This revelation highlights the tech industry's ongoing, controversial reliance on competitors' data for model training.

Although Anthropic detected the scraping and revoked xAI's official API access in January, xAI engineers reportedly bypassed the restriction by using personal accounts and the intermediary developer service Blackbox AI. Musk has previously admitted in legal proceedings that Grok was "partially" trained on OpenAI data, describing the practice as an industry standard.

Beyond data scraping controversies, xAI is reportedly grappling with severe internal turmoil. Its core pretraining team has shrunk to fewer than five members, and four Grok code leads, along with several co-founders, have departed within months. Furthermore, an engineer accidentally deleted critical training data, which delayed development by two to three weeks.

Meanwhile, the massive compute cluster Musk secured is not being fully used for xAI's internal training. Instead, xAI has been renting out its compute capacity to Anthropic (via SpaceX) and Google, an arrangement that sources describe as a temporary stopgap.

[AgentUpdate Depth Analysis] The report that xAI relied on distilling Claude's outputs underscores a critical bottleneck in the AI Agent ecosystem: the scarcity of high-quality agentic training data. Because Claude is widely regarded as the gold standard for coding and reasoning, and powers leading Agent-based IDEs such as Cursor, other LLM builders are increasingly tempted to skip costly data curation by "shadowing" its outputs. Distillation helps model developers bootstrap basic code-generation capabilities, but it fundamentally limits the student model's ability to surpass its teacher. As the AI Agent ecosystem shifts from simple code completion to autonomous, multi-step execution and self-healing, relying on static distilled data will yield diminishing returns. The future of autonomous coding Agents lies in reinforcement learning (RL) driven by environment feedback, rather than in copying the outputs of dominant proprietary models. Distillation is a shortcut, but self-evolution is the only path to state-of-the-art performance.