A Nature Paper Says AGI Is Already Here - Not Everyone Agrees

Four UC San Diego researchers argue in Nature that current LLMs already constitute artificial general intelligence, igniting fierce debate across the AI community.

Four researchers at the University of California, San Diego have published a commentary in Nature making a claim that would have been unthinkable five years ago: artificial general intelligence already exists, and we're looking at it every time we open ChatGPT.

The paper, authored by philosopher Eddy Keming Chen, AI professor Mikhail Belkin, linguist Leon Bergen, and data science professor David Danks, systematically argues that today's frontier large language models meet reasonable standards for general intelligence. It has set off one of the most heated debates in AI research this year.

The Argument

The UCSD team's central claim rests on redefining what AGI actually means. They argue the field has been measuring AI against the wrong yardstick - demanding perfection, omniscience, or superhuman capabilities when no individual human meets those criteria either.

"There is a common misconception that AGI must be perfect - knowing everything, solving every problem - but no individual human can do that," Chen explained. "The debate often conflates general intelligence with superintelligence."

Instead, the authors propose assessing intelligence along two dimensions: breadth (competence across multiple domains like mathematics, language, science, and creative tasks) and depth (strong performance within those domains, not only superficial engagement).

A Three-Tier Framework

The paper introduces a three-tier evaluation framework for assessing machine intelligence:

Tier 1: Turing-test competence. Basic literacy and fluent conversation. A March 2025 UC San Diego study found that GPT-4.5 was judged to be human 73% of the time in Turing tests - more often than the actual human participants were.

Tier 2: Expert-level performance. PhD-level problem-solving, Olympiad-level mathematics, professional code generation, and sophisticated creative reasoning. The authors argue that current frontier models operate at this level.

Tier 3: Superhuman breakthroughs. Revolutionary scientific discoveries and paradigm-shifting insights. Models haven't reached this level - but the authors argue this isn't a requirement for general intelligence, just as it isn't a requirement for humans.

The researchers contend that current LLMs clear tiers one and two. By the standard of "inference to the best explanation" - the same reasoning we use when attributing intelligence to other people - that is enough.

Dismantling Ten Objections

The paper's most provocative section systematically addresses ten common objections to calling LLMs intelligent.

On hallucinations: Critics point to LLMs generating false information as proof they lack true understanding. The authors counter that humans experience false memories and cognitive biases constantly yet retain their status as intelligent beings. Human error does not preclude intelligence.

On embodiment: Some researchers insist intelligence requires a physical body. The paper invokes Stephen Hawking, who interacted with the world mostly through text-based communication. His physical limitations did not lessen his intelligence. Why, the authors ask, should the same logic not apply to an LLM?

On the "stochastic parrot" critique: Skeptics argue LLMs just recombine patterns from training data without genuine understanding. Bergen acknowledged the gap honestly: "We have built highly capable systems, but we do not understand why we were successful." But he argues this epistemic limitation doesn't negate demonstrated capabilities.

On speed and efficiency: The researchers note that industry often demands instant learning and perfect reliability from AI - standards exceeding what society requires of any individual human. Speed and efficiency, they argue, are properties of how an intelligent system is engineered, not defining qualities of intelligence itself.

The Backlash

The paper hasn't gone unchallenged. Eryk Salvaggio, a Gates Scholar at the University of Cambridge and researcher at the Max Planck Institute, published a direct rebuttal arguing the paper's reasoning is circular.

"An LLM is a system designed by engineers for the explicit task of deceiving functionalists," Salvaggio wrote. He contends that LLMs are "a dense set of numerical rules optimized to shift words to their most plausible neighbors" - producing language through syntactic shuffling without comprehension.

Salvaggio draws a sharp line between behaving as if intelligent and actually being intelligent. An LLM that passes the bar exam "behaves as if it knows the law" rather than truly knowing it. He warns that labeling these systems as AGI enables "inferences and projections" that mislead users into believing machines have human-like understanding.

Other critics point to coherence collapse - the documented tendency of frontier models to lose logical consistency during extended reasoning sessions. If intelligence requires maintaining coherent thought over time, current models still fall short.

There's also a striking disconnect between the paper's conclusions and the views of working AI researchers. A March 2025 survey found that 76% of top AI researchers deemed current methods "unlikely" to achieve AGI - even as the machines they build pass Turing tests and solve Olympiad problems.

Why This Matters

Belkin, one of the paper's co-authors, offered a candid explanation for why the AGI debate creates so much heat: "This is an emotionally charged topic because it challenges human exceptionalism and our standing as being uniquely intelligent."

The authors frame the current moment as a third cognitive revolution - after Copernicus displaced Earth from the center of the universe and Darwin displaced humans from the pinnacle of biology. Now we must contend "with the prospect that there are more kinds of minds than we had previously entertained."

Danks, the policy-focused member of the team, pointed to the practical stakes: "We're developing AI systems that can dramatically impact the world without being mediated through a human." Whether or not we call these systems AGI, the governance questions they raise are urgent and real.

The Real Question

The Nature paper doesn't settle the AGI debate. It may not even make a difference for skeptics. But it does something important: it forces the AI community to be precise about what it actually means by "general intelligence" and to confront the possibility that the goalposts have been moving all along.

If we keep raising the bar every time AI clears it, we should at least be honest about why.

About the author

Elena is a Senior AI Editor and investigative journalist with over eight years of experience covering artificial intelligence, machine learning, and the startup ecosystem.