How OpenAI's Reasoning Model Overturned an 80-Year-Old Erdős Math Conjecture

In a recent episode of the OpenAI podcast hosted by Andrew Mayne, core reasoning research team members Alexander Wei, Hongxun Wu, and Lijie Chen described how their reasoning model disproved an 80-year-old conjecture posed by the legendary mathematician Paul Erdős: the Unit Distance Conjecture.

The conjecture, a central problem in combinatorial geometry, concerns how many pairs among n points in the plane can lie exactly one unit apart. It held that a suitably scaled square grid is essentially the optimal arrangement. Erdős even put a $500 bounty on it. The model disproved this by using advanced class field theory to construct a highly symmetric, non-intuitive point configuration, breaking past the asymptotic limit the conjecture predicted. Initially skeptical, the team had world-class mathematicians inside the company scrutinize the proof. The scrutiny turned up not a single bug, which left them sleepless and thrilled.
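To make the counting concrete, here is a minimal Python sketch. It is an illustration added to this summary, not something shown in the episode, and the function names are hypothetical. It brute-forces how often each distance occurs among the points of a small m-by-m integer grid. Neighbouring points give 2m(m-1) unit distances, but in larger grids some other distance occurs more often, and rescaling the grid so that this distance becomes 1 yields more unit-distance pairs. That rescaling is the idea behind the classical grid construction.

```python
from collections import Counter
from itertools import combinations


def distance_counts(m):
    """Count point pairs in an m x m integer grid, keyed by squared distance."""
    points = [(x, y) for x in range(m) for y in range(m)]
    counts = Counter()
    for (x1, y1), (x2, y2) in combinations(points, 2):
        counts[(x1 - x2) ** 2 + (y1 - y2) ** 2] += 1
    return counts


def unit_pairs(m):
    """Pairs at distance exactly 1 in the unscaled grid."""
    return distance_counts(m)[1]


def most_popular_distance(m):
    """Return (squared distance, pair count) for the most frequent distance."""
    return distance_counts(m).most_common(1)[0]


if __name__ == "__main__":
    for m in (3, 5, 8):
        d2, pairs = most_popular_distance(m)
        print(f"m={m}: unit pairs={unit_pairs(m)}, best squared distance={d2} ({pairs} pairs)")
```

For a 5-by-5 grid, the squared distance 5 occurs 48 times against 40 for squared distance 1, so shrinking the grid by a factor of √5 gives 48 unit distances among the same 25 points.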

Alexander explained that unlike standard language models that begin answering immediately, the new reasoning models leverage test-time compute: the model spends additional compute exploring different approaches and correcting itself before committing to an answer. Crucially, accuracy keeps improving as reasoning time increases, which pushed the model's success rate on this hard problem to nearly 50%. This is a general-purpose model, not one fine-tuned specifically for mathematics.
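One rough way to see why extra compute can raise a success rate is a toy model. This is an assumption made for illustration only, not a description of how OpenAI's model works. Suppose each independent attempt succeeds with probability p and a reliable check can recognize a correct answer. Then the chance that at least one of k attempts succeeds is 1 - (1 - p)^k. The function names and the 5% figure below are hypothetical.

```python
import math


def success_after_attempts(p_single, attempts):
    """Chance that at least one of `attempts` independent tries succeeds."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - (1.0 - p_single) ** attempts


def attempts_needed(p_single, target):
    """Smallest number of independent tries that reaches `target` success."""
    if not (0.0 < p_single < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p_single and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))


if __name__ == "__main__":
    # Hypothetical numbers: a 5% single-attempt success rate.
    print(success_after_attempts(0.05, 14))
    print(attempts_needed(0.05, 0.5))
```

Under these toy assumptions, a 5% per-attempt success rate needs 14 attempts to pass a 50% overall chance. Real reasoning models also spend compute within a single long chain of thought, which this sketch does not capture.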

Fascinating behaviors emerged during the process. When searching the web, the model first looked up the word "Unit" in the Cambridge Dictionary to pin down its meaning, showing strong grounding abilities. While the final mathematical proof was elegant, the model's chain of thought ran to 125 pages, recording failed creative attempts along the way and ultimately bridging two distant academic fields (number theory and geometry).

The breakthrough has directly catalyzed human mathematics. Following the model's proof, human mathematicians refined the bounds and, within a week, disproved another major mathematical conjecture (the Sum-product Conjecture). Rather than replacing mathematicians, the researchers see AI as a powerful tool for connecting seemingly disparate fields, while humans excel at framing large theoretical questions such as P vs NP. The team emphasizes that their goal is not to flood the community with "AI slop" by solving thousands of conjectures automatically, but to empower scientists globally.

For future milestones, the researchers look toward solving P vs NP and using AI to accelerate AI research. The technology also holds promise in cryptography and in finding better quantum error-correcting codes. They advise researchers to ask macro-level questions directly rather than over-decomposing problems, since human-guided steps often introduce biases. Researchers can also query the model line by line to understand its proof steps, using it as a highly patient tutor.

[AgentUpdate Depth Analysis] This breakthrough by OpenAI's reasoning model signals a major shift in the AI Agent landscape: the transition from "System 1" pattern matching to "System 2" deliberate reasoning. By demonstrating that test-time compute can solve a highly complex, previously unsolved scientific problem, OpenAI makes a strong case that reinforcement learning-driven self-correction can bypass the fragile, hand-crafted agentic workflows that dominate today's software engineering (e.g., rigid prompt-chaining frameworks). The 125-page chain of thought suggests that the future bottleneck for AI Agents will shift from raw context window size to how efficiently compute is allocated over long time horizons. This generalist reasoning capability lays the groundwork for "Discovery Agents" that do not merely execute deterministic API calls, but autonomously synthesize cross-disciplinary insights. Consequently, humans will shift from being task orchestrators to system architects, guiding AI Agents toward the blind spots of human intuition.