arXiv Hits Researchers With 1-Year Ban for AI Slop
ArXiv is issuing one-year submission bans to authors whose papers contain verifiable evidence of unvetted AI output, as the rate of fabricated academic citations has risen tenfold since 2023.

Science's preprint backbone is drawing a harder line. ArXiv - the repository that hosts nearly 2.4 million scholarly papers and receives tens of thousands of new submissions each month - announced Thursday that authors whose papers contain verifiable evidence of unchecked AI-created content will face a one-year ban from the platform. After that ban expires, every future submission must clear peer review at a journal or conference before arXiv will accept it.
The announcement came from Thomas G. Dietterich, Distinguished Professor Emeritus at Oregon State University and chair of arXiv's computer science section, who posted the policy update on social media after months of escalating complaints about what the platform describes as an "influx of AI-generated materials masquerading as rigorous science."
TL;DR
- 1-year submission bans for papers with "incontrovertible evidence" of unchecked AI output - hallucinated references, LLM prompts visible in text, unfilled data table placeholders
- Papers containing at least one fabricated citation rose from 1 in 2,828 (2023) to 1 in 277 (early 2026) - a tenfold jump, according to a Lancet study from Columbia University researchers
- GPTZero found 100 hallucinated citations across 53 accepted NeurIPS 2025 papers, despite three to five expert reviewers per paper
- The ban extends an earlier rule requiring peer review for all CS survey and position papers before arXiv will host them
How Bad the Problem Actually Is
The Lancet's Numbers
A Columbia University research team published findings in The Lancet on May 7 that put hard numbers on the problem. Analyzing more than 2 million papers and 97 million citations, they identified roughly 4,000 fabricated citations across 2,800 papers - references that "do not reference real papers."
The growth curve is steep. In 2023, 1 in 2,828 papers contained at least one fabricated reference. By 2025, that was 1 in 458 - a sixfold increase. In the first seven weeks of 2026 alone, the rate reached 1 in 277. Generative AI tools are the likely driver, according to the Columbia team. More than a third of all fabricated citations traced back to two large open-access publishers.
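The multiples quoted above follow directly from the "1 in N" figures. As a quick check, here is a minimal Python sketch; the function name `fold_increase` and the dictionary `ONE_IN` are illustrative names, not anything from the Columbia study, and the only inputs are the three rates reported in this article.

```python
from fractions import Fraction

# Rates quoted in the article: one affected paper per N papers.
ONE_IN = {"2023": 2828, "2025": 458, "early 2026": 277}


def fold_increase(baseline_one_in: int, later_one_in: int) -> float:
    """Return how many times higher the later rate is than the baseline rate."""
    baseline_rate = Fraction(1, baseline_one_in)
    later_rate = Fraction(1, later_one_in)
    return float(later_rate / baseline_rate)


for period in ("2025", "early 2026"):
    ratio = fold_increase(ONE_IN["2023"], ONE_IN[period])
    print(f"2023 -> {period}: {ratio:.1f}x")
```

The ratios come out at roughly 6.2 and 10.2, which is where the "sixfold" and "tenfold" descriptions come from. Note that the early 2026 figure covers only seven weeks, so it is a much smaller sample than the full-year rates.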
"This is one of the first papers telling us something about the quality of what's being produced with LLMs, and it's a signal of slop," Misha Teplitskiy, a science sociologist at the University of Michigan, told STAT News.
Image caption: ArXiv hosts nearly 2.4 million scholarly papers and now faces hundreds of AI-produced submissions monthly. (Source: arxiv.org)
When Elite Conferences Miss It
If the volume problem were limited to open-access publishers, one could argue peer review would catch it. The record says otherwise.
GPTZero scanned 4,841 accepted papers from NeurIPS 2025 and found 100 hallucinated citations across 53 papers. Each paper had been reviewed by three to five expert researchers. NeurIPS confirmed that reviewers had been instructed to flag hallucinations, but the citations still passed. Hallucinated citations included fabricated author names, fake DOIs, and real paper titles combined with invented publication details.
GPTZero then ran a pass on ICLR 2026 - scanning just 300 of around 20,000 submissions - and found 50 more hallucinations that had cleared peer review. If that sample is representative, a simple proportional scale-up (50 in 300 works out to roughly 3,300 in 20,000) would put the total in the thousands across the full submission pool.
The Nikkei newspaper found something more deliberate: 17 preprints containing hidden prompts specifically designed to instruct AI-powered reviewers to recommend acceptance.
What ArXiv's Policy Actually Says
What Gets You Banned
Dietterich was specific about what "incontrovertible evidence" means in practice. This isn't a judgment call about whether a paper sounds AI-generated. ArXiv is looking for things that leave no room for doubt.
"If generative AI tools generate inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content, and that output is included in scientific works, it is the responsibility of the author(s)."
- Thomas G. Dietterich, chair, arXiv CS section
The clearest examples: hallucinated references pointing to papers that don't exist, and LLM meta-comments left in the final text - phrases like "here is a 200 word summary; would you like me to make any changes?" or data table placeholders reading "the data in this table is illustrative, fill it in with the real numbers from your experiments."
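Leftover text of this kind is easy to search for mechanically, which is part of why it counts as clear-cut evidence. The sketch below is a hypothetical illustration only: the article does not describe how arXiv moderators actually detect violations, and the pattern list `LEFTOVER_PATTERNS` and function `find_leftovers` are invented here from the two phrases quoted above. A phrase list like this would miss most real cases and cannot check whether a reference exists.

```python
import re

# Hypothetical patterns, derived only from the example phrases quoted in the article.
# This is not arXiv's detection method.
LEFTOVER_PATTERNS = [
    re.compile(r"would you like me to\b", re.IGNORECASE),
    re.compile(r"\bhere is a \d+[- ]word summary\b", re.IGNORECASE),
    re.compile(r"\bfill it in with the real numbers\b", re.IGNORECASE),
    re.compile(r"\bthe data in this table is illustrative\b", re.IGNORECASE),
]


def find_leftovers(text: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_text) pairs for each suspicious phrase found."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for pattern in LEFTOVER_PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append((line_number, match.group(0)))
    return hits


sample = (
    "We evaluate on three benchmarks.\n"
    "Here is a 200 word summary; would you like me to make any changes?\n"
    "Table 2: the data in this table is illustrative, fill it in with the real numbers.\n"
)
for line_number, phrase in find_leftovers(sample):
    print(line_number, phrase)
```

An author could run a check like this over a draft before submitting, but it is only a first pass. It does not replace reading the manuscript and verifying every reference by hand.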
The Penalty Structure
Moderators flag potential violations. Section chairs review the evidence and confirm before any penalty is imposed. Authors can appeal. The structure is:
| Stage | Consequence |
|---|---|
| First violation | 1-year ban from all arXiv submissions |
| After ban ends | All future submissions must first pass peer review at a reputable venue |
ArXiv, which is hosted by Cornell University, has also said it will no longer accept CS review and position papers unless they have already passed peer review at a conference or journal.
Image caption: ArXiv is clear that AI use isn't the violation - submitting AI output you haven't verified is. (Source: unsplash.com)
A Pattern of Escalating Rules
This is the third wave of AI restrictions arXiv has put in place. Six months ago it required peer review for any CS survey paper, a category that had been flooded with AI-produced reviews summarizing existing literature without adding original analysis. Then it moved to require endorsement from established researchers for first-time submitters. The one-year ban targeting individual authors is the first policy that goes after people rather than paper categories.
What This Policy Doesn't Ban
ArXiv isn't telling researchers to stop using AI tools. The policy is explicit on this point: authors can use LLMs to draft, edit, or restructure papers, provided they verify what comes out. Our guide on using AI for academic research covers how to integrate AI writing tools without losing control over citations and factual claims.
Mohammad Hosseini from Northwestern University put the underlying issue directly: "Citation practices are changing with generative AI use... people simply use their hunches to prompt ChatGPT... that is not a healthy practice."
The distinction matters for most researchers who use AI legitimately. The ban targets the extreme end - authors who submitted whatever a model produced without checking it. The hallucination benchmarks that AI labs publish measure model failure rates in controlled settings. The NeurIPS and ICLR findings suggest real-world citation hallucination rates are far higher once you account for the selection pressure to publish.
Image caption: The Lancet study analyzed 97 million citations across 2 million papers to quantify how fast fabricated references are spreading. (Source: statnews.com)
ArXiv is mid-transition. In March 2026, it announced that it would separate from Cornell University and become an independent nonprofit on July 1, 2026. The open question is enforcement capacity: how many moderators the platform can deploy against a submission backlog that now includes hundreds of AI-produced papers each month. One-year bans are meaningful, but they only work if violations are actually caught.
Sources:
- Research repository ArXiv will ban authors for a year if they let AI do all the work - TechCrunch
- ArXiv to ban researchers for a year if they submit AI slop - 404 Media
- Arxiv cracks down on unchecked AI-generated content in research papers - The Decoder
- Study finds explosion of fraudulent AI citations in academic papers - STAT News
- Hallucinated citations found in papers from NeurIPS, the prestigious AI conference - TechCrunch
- GPTZero uncovers 50+ hallucinations in ICLR 2026 - GPTZero
