Tao: Ideas Are Now Free - Math's Bottleneck Has Moved

"AI has driven the cost of idea generation down to almost zero... suddenly people can generate thousands of theories... Now we have to verify them, evaluate them."
Terence Tao, Mastodon, March 2026

TL;DR

What Tao said: AI has made generating mathematical ideas nearly free, but evaluation and verification remain as difficult as ever - and academic infrastructure wasn't designed for the coming flood of machine-generated proofs.
What holds up: In well-specified domains - competition math, formal proof tasks - the claim is accurate. AlphaProof reached IMO silver-medal standard in 2024. Tools like Lean 4 and Leanstral are already attacking the verification side.
What's missing: Frontier mathematics is still bottlenecked on ideas, not verification. Tao himself still does core work with pen and paper. The infrastructure he says doesn't exist is already being built.

Terence Tao, Fields Medal winner and one of the most widely cited mathematicians working today, posted a string of observations on Mastodon last week that have since spread well beyond academic circles. The core idea: AI has done to mathematical ideas what the internet did to text - made them basically free to produce. What that means for how research actually gets done is a harder question than the framing suggests.

The Claim

Tao's framing uses a sharp analogy. The automobile was faster than the horse, but cities built for foot traffic and carriages couldn't absorb cars at scale. The result was congestion and eventually a decades-long rebuilding of urban infrastructure. AI is doing something similar to mathematics: it can produce proofs, conjectures, and candidate approaches at a rate no human can match. But the peer review system, journal infrastructure, and informal knowledge-sharing that have defined academic math for two centuries were built for human-speed idea generation.

The bottleneck statement came from a conversation with science journalist Dwarkesh Patel, where Tao described the problem in economic terms: the marginal cost of an idea has collapsed, while the cost of knowing whether an idea is correct hasn't moved. On Mastodon, he put it plainly - "suddenly people can generate thousands of theories" - and noted that the bottleneck has shifted to evaluation and verification.

His proposed solution isn't to slow down the idea generator. It's to build new infrastructure alongside the old one: formal proof verification systems, machine-readable mathematical libraries, and what he called a new discipline of "AI planning" modeled on how urban planners learned to design cities around car traffic rather than trying to stop cars.

Image: Terence Tao, Fields Medal winner and former PCAST member. Tao, whose work spans number theory, harmonic analysis, and partial differential equations, has been one of the more careful public voices on what AI can and can't do for mathematical research. Source: upload.wikimedia.org

The Evidence

Where the claim holds

The flood of AI-assisted mathematical content is already visible in arXiv submission trends. The International Mathematical Olympiad benchmark is effectively saturated for AI systems working in well-defined problem spaces. Google DeepMind's AlphaProof reached silver-medal standard at the 2024 IMO, solving 4 of 6 problems for 28 out of 42 points. The gold threshold was 29. In well-specified competition math, idea generation cost has collapsed: given a problem statement, AI systems create thousands of candidate proof paths, and automated checkers tell you within seconds whether a path is valid.
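The generate-then-check pattern described above can be illustrated with a toy sketch. This is not how AlphaProof or any real system works; the factoring task, the function names, and the parameters are all assumptions chosen only to show a cheap, unreliable generator paired with an exact checker.

```python
import random


def generate_candidates(n, count, seed=0):
    """Cheap, unreliable 'idea generator': random guesses at a factor pair of n."""
    rng = random.Random(seed)
    for _ in range(count):
        a = rng.randint(2, n - 1)
        # Most guesses are wrong; producing them costs almost nothing.
        yield a, n // a


def verify(n, candidate):
    """Exact checker: accepts only genuine nontrivial factorizations of n."""
    a, b = candidate
    return 1 < a < n and 1 < b < n and a * b == n


def generate_and_verify(n, count, seed=0):
    """Generate many candidates, then keep only those the checker accepts."""
    candidates = list(generate_candidates(n, count, seed))
    verified = sorted({c for c in candidates if verify(n, c)})
    return len(candidates), verified


# Hypothetical toy problem: find a nontrivial factorization of 391.
total, verified = generate_and_verify(n=391, count=2000)
print(f"{total} candidates generated, {len(verified)} distinct candidates verified")
```

The sketch only works because `verify` is exact and instant. That is the property competition problems with formal checkers have, and open-ended research questions mostly lack.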

Formal verification is producing real results outside competition settings too. The Lean 4 ecosystem - the tool of choice for major formalization projects including Mathlib - now supports AI agents that can write and check formal proofs. Mistral's Leanstral scores above Claude Sonnet on formal Lean 4 proof tasks at roughly one-fifteenth the cost. Google DeepMind's AlphaEvolve discovered novel game theory algorithms no human researcher had designed. In these narrow, formalized domains, the bottleneck has genuinely shifted from producing candidates to checking them.

Where it breaks down

Tao's own daily practice contradicts part of the claim. He told Patel he continues to do core mathematical work with pen and paper, and that AI tools have enriched his research through graphics, literature searching, and code - not by replacing the actual mathematical thinking. This is a meaningful qualification. The ideas that matter most in frontier research aren't the ones that look like competition problems. They're open-ended, poorly specified, and don't have ground-truth answers to verify against.

The same limitation Andrej Karpathy observed in his autoresearch experiment applies here: autonomous optimization works in domains with clear metrics, and "anything that feels softer is, like, worse." The verification bottleneck isn't uniform. It's acute for machine-generated competition proofs. It barely exists for the kind of conceptual conjectures that define frontier mathematics, because those conjectures aren't even fully formal yet - you can't verify a proof of something that hasn't been exactly stated.

Image: Google DeepMind's AlphaProof achieved silver-medal performance at the 2024 International Mathematical Olympiad. AlphaProof solved 4 of 6 problems at the 2024 IMO, scoring 28 out of 42 points. Gold required 29. The system generates Lean 4 proofs that are automatically machine-verified - compressing both sides of the bottleneck Tao describes. Source: deepmind.google

Tao's Claim	Reality Check
AI has driven idea generation cost to near zero	True in well-specified domains. Not true for frontier open-ended research, which is still bottlenecked on human formulation.
Verification is now the bottleneck	Accurate in formal proof domains with automated checkers. In informal mathematics, formalization itself is the constraint.
Existing academic infrastructure is inadequate	Correct - journals and peer review aren't designed for machine-produced proof volumes.
New machine-friendly infrastructure is needed	Right, and construction is already underway: Lean 4, Mathlib, AlphaProof, Leanstral.

What They Left Out

The verification bottleneck isn't a fixed constraint. Lean 4 and Coq already turn proof verification into a computational problem: once a proof is formalized, checking it is fast and cheap. The harder constraint is formalization itself: translating informal mathematical intuition into machine-checkable syntax. That process is still slow and requires real mathematical expertise. But it's getting faster. AlphaProof doesn't just generate proofs; it produces Lean 4 proofs that are automatically verified. That compresses both sides of the bottleneck simultaneously.
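To make "machine-checkable" concrete, here is a minimal Lean 4 sketch. It is illustrative only and not drawn from AlphaProof or Mathlib; the theorem name is made up, and the arithmetic reuses the IMO figures cited earlier, with each of the six problems worth 7 points. Once a statement is written this way, the Lean kernel either accepts the proof or rejects it, with no human referee involved.

```lean
-- Four fully solved problems at 7 points each give 28 points.
example : 4 * 7 = 28 := rfl

-- 28 falls below the 2024 gold threshold of 29.
example : 28 < 29 := by decide

-- A general statement: addition of natural numbers is commutative.
-- `add_comm_sketch` is a hypothetical name; `Nat.add_comm` is in Lean's core library.
theorem add_comm_sketch (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Checking these takes a fraction of a second. The slow part is everything before this point: deciding what to state and how to state it precisely.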

There's also an asymmetry in Tao's framing that deserves attention. He describes idea generation as having reached near-zero cost, but this is true only for ideas in domains where AI has been heavily trained. For genuinely novel mathematical territory - the kind of problems where Tao himself operates - the ideas are still the hard part. The AI that easily creates competition proof candidates can't formulate the right conjecture about a new class of arithmetic progressions or characterize a new dispersive PDE regime. The idea cost is near zero only for ideas that look like things we've seen before.

The urban planning analogy is instructive in a way Tao may not have intended. Cities did adapt to cars - eventually. That adaptation took fifty years, reshaped entire economies, and produced highways, suburbs, and traffic deaths along the way. The mathematical infrastructure will adapt too. The question is what the first generation of AI-accelerated mathematics looks like while the infrastructure is still being rebuilt.

The concrete near-term question isn't whether verification will catch up (it will, with tools like Lean 4 and formal proof agents) but who decides which ideas are worth verifying. That judgment requires knowing what's already been tried, what's likely to generalize, and what the field actually needs. It's a human call. Nothing Tao described changes that.
