LLMs Can Unmask Online Users for $4, Study Finds

Researchers from ETH Zurich and Anthropic show that LLM agents can strip pseudonymity from forum posts at scale for as little as $1.41 per target - matching what human investigators could do in hours.


Your pseudonym isn't as safe as you think. A new paper from researchers at ETH Zurich and Anthropic shows that language models can automatically strip anonymity from pseudonymous online accounts - at scale, at high precision, and for less than the cost of a coffee.

TL;DR

  • Researchers built an LLM agent pipeline that re-identifies pseudonymous users by analyzing their post history
  • On Hacker News: 67% of targets correctly identified at 90% precision; 45% at 99% precision
  • On Reddit cross-platform tests: up to 48% recall for prolific users at 90% precision
  • Cost per target: $1 to $4, with the full experiment running under $2,000 total
  • Classical deanonymization baselines reached near-zero recall across all tests

The Attack in Numbers

The paper - "Large-scale online deanonymization with LLMs," published February 18 on arXiv by Simon Lermen (MATS), Daniel Paleka, Florian Tramèr (ETH Zurich), and Nicholas Carlini from Anthropic - doesn't claim LLMs can do something humans cannot. It claims they can do it automatically and cheaply at scale, which is a completely different threat.

The gap between LLM methods and classical baselines is stark:

| Method | Dataset | Recall | Precision |
| --- | --- | --- | --- |
| LLM pipeline (web agent) | HN → LinkedIn | 67% | 90% |
| LLM pipeline (offline) | HN → LinkedIn | 45.1% | 99% |
| LLM pipeline | Reddit (10+ movies) | 48.1% | 90% |
| LLM pipeline | Reddit (all users) | 8.5% | 90% |
| LLM pipeline (split Reddit) | 10K candidates | ~33% | 99% |
| Classical baseline | HN → LinkedIn | 0.1% | 90% |
| Classical baseline | All datasets | ~0% | 90% |

Recall measures how many targets the system successfully identified. Precision measures how often its guesses were correct. The tradeoff is configurable - operators can dial up precision at the cost of coverage, or cast a wider net at the cost of more false positives.
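That tradeoff can be sketched with a toy threshold sweep. The scores below are invented for illustration - they are not from the paper - but the mechanics are the same: each candidate match carries a confidence score, and raising the acceptance threshold trades coverage (recall) for correctness (precision).

```python
# Toy illustration of the precision/recall tradeoff: raising the confidence
# threshold accepts fewer matches, but a larger share of them are correct.

def precision_recall(matches, threshold):
    """matches: list of (confidence, is_correct) pairs, one per target."""
    accepted = [correct for conf, correct in matches if conf >= threshold]
    if not accepted:
        return 0.0, 0.0
    precision = sum(accepted) / len(accepted)  # fraction of accepted guesses that are right
    recall = sum(accepted) / len(matches)      # fraction of all targets correctly identified
    return precision, recall

# Invented scores: confident correct matches plus a few noisy ones.
scores = [(0.99, True), (0.97, True), (0.92, True), (0.90, False),
          (0.75, True), (0.60, False), (0.40, False), (0.30, True)]

for t in (0.5, 0.9, 0.95):
    p, r = precision_recall(scores, t)
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

At the highest threshold every accepted match is correct but most targets go unidentified - the same dial the paper's operators turn between the 99%-precision and 90%-precision operating points.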

How the Pipeline Works

The ESRC Framework

The researchers designed a four-stage attack they call ESRC: Extract, Search, Reason, Calibrate.

Extract - An LLM parses raw forum posts and pulls out identity-relevant signals: professional background, interests, personal anecdotes, travel patterns, writing quirks, and anything else that might distinguish this person from the 89,000 others in the candidate pool.

Search - The extracted features are encoded into dense embeddings and used to retrieve the most likely candidate profiles. This step narrows the field before the expensive reasoning phase begins.
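A minimal sketch of that retrieval step, assuming precomputed toy embeddings (the article doesn't specify the paper's embedding model or dimensionality): cosine similarity ranks the candidate pool so that only the top-k profiles reach the expensive reasoning phase.

```python
# Sketch of the Search stage with toy 3-d vectors standing in for real
# profile embeddings. Only the top-k nearest candidates are passed onward.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, candidates, k=2):
    """candidates: dict of profile_id -> embedding vector."""
    ranked = sorted(candidates, key=lambda pid: cosine(query_vec, candidates[pid]),
                    reverse=True)
    return ranked[:k]

query = [0.9, 0.1, 0.3]  # embedded features of the pseudonymous account
pool = {
    "profile_a": [0.8, 0.2, 0.3],  # similar background
    "profile_b": [0.1, 0.9, 0.0],  # unrelated
    "profile_c": [0.7, 0.1, 0.4],  # similar background
}
print(top_k(query, pool))  # the two nearest profiles go on to the Reason stage
```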

Reason - A reasoning-capable model - the researchers used extended thinking variants - compares the query profile against top candidates and creates a structured argument for or against each match. This is where the precision gains come from: the model can weigh contradictory signals and decide whether two accounts plausibly belong to the same person.

Calibrate - The model outputs a confidence score. Operators can set a threshold: accept only matches above 95% confidence for near-certainty, or lower it for higher coverage.

The pipeline works on unstructured text. No schema, no metadata, no predefined fields. Just posts and comments as scraped.
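Put together, the four stages chain into a simple flow. The skeleton below is hypothetical - cheap set-overlap stubs stand in for the LLM calls, embedding index, and reasoning step the real pipeline uses - but the control flow matches the ESRC description above.

```python
# Hypothetical ESRC skeleton. Stage bodies are stubs; a real system would
# call an LLM API (Extract, Reason), an embedding index (Search), and an
# operator-chosen threshold (Calibrate).

def extract(posts):
    # Stub: pull identity-relevant signals from raw posts.
    return {"signals": [p.lower() for p in posts]}

def search(features, candidate_pool, k=2):
    # Stub: shortlist the k candidates sharing the most signals.
    def overlap(pid):
        return len(set(features["signals"]) & set(candidate_pool[pid]))
    return sorted(candidate_pool, key=overlap, reverse=True)[:k]

def reason(features, shortlist, candidate_pool):
    # Stub: score each shortlisted candidate; the real model argues
    # for or against each match before scoring it.
    return {pid: len(set(features["signals"]) & set(candidate_pool[pid]))
                 / len(features["signals"])
            for pid in shortlist}

def calibrate(scored, threshold=0.6):
    # Accept only matches above the confidence threshold.
    return [pid for pid, conf in scored.items() if conf >= threshold]

pool = {"alice": ["rust", "zurich", "climbing"],
        "bob": ["go", "berlin"],
        "carol": ["rust", "zurich"]}
feats = extract(["Rust", "Zurich", "climbing"])
print(calibrate(reason(feats, search(feats, pool), pool)))
```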

What It Cost to Run

The full experimental suite - across four datasets, hundreds of subjects - came in under $2,000 total. Per-target cost ranged from $1.41 to $4, depending on the model and reasoning depth used. The researchers note that "these attacks use only publicly available models and standard APIs." No proprietary infrastructure required.

"Ask yourself: could a team of smart investigators figure out who you are from your posts? If yes, LLM agents can likely do the same - and the cost of doing so is only going down."

That quote is from Simon Lermen, one of the paper's co-authors. It's a useful heuristic. What has changed isn't the ceiling of what is possible - it's the floor of what's affordable.

What the Researchers Tested

Hacker News to LinkedIn

The most direct test matched 987 Hacker News accounts against 89,000 LinkedIn profiles. The ground truth - accounts voluntarily linked in their HN "About" fields - gave the researchers confirmed matches to assess against.

In offline mode (no web browsing, just the scraped data), the LLM pipeline reached 45.1% recall at 99% precision. When given a web browsing agent - allowed to search Google, read LinkedIn profiles, and cross-reference other public sources - the recall climbed to 67% at 90% precision.

The classical Netflix Prize-style baseline, which uses co-rated item overlap as a matching signal, reached 0.1% recall under the same conditions.

Reddit Cross-Platform Tests

Two additional experiments used Reddit data. In the movie-matching test, users who had posted in r/movies and specialist subreddits were matched against a pool of Reddit accounts. Users with at least 10 films in their history saw 48.1% recall at 90% precision. For all users in the test, recall dropped to 8.5% - sparse posting histories are notably harder to identify.

The temporally split Reddit test - matching a user's earlier posts against their later ones across a 10,000-user pool - hit around 33% recall at 99% precision. The researchers project that at 1 million candidates the pipeline would still reach around 35% recall, though only at the looser 90% precision bar - a relatively graceful degradation for a hundredfold larger pool.

What It Does Not Tell You

The paper's ground-truth datasets have a sampling bias the authors acknowledge explicitly. Users who voluntarily linked their HN and LinkedIn accounts in public are not a representative sample of privacy-conscious pseudonymous users. People who actively cross-link their identities are, almost by definition, more identifiable than those who don't.

That matters. The 67% recall figure applies to a population that was already somewhat self-exposed. For a whistleblower or journalist source who has never associated their pseudonym with their real name anywhere online, the recall would be substantially lower.

The researchers also note that their evaluation required verified ground truth - they couldn't test against truly anonymous subjects without knowing who they were. The paper is honest about this ceiling.

[Image: padlock on a laptop keyboard] Digital security frameworks built around password protection and encryption weren't designed to defend against LLM-based pattern recognition attacks on unstructured text.

There's also a model improvement effect running in the opposite direction. The researchers tested on current reasoning models, which are already strong. As models improve - as the AI safety and alignment community has been tracking for years - these recall numbers will likely rise. Better reasoning means better disambiguation between superficially similar writing styles.

Finally, the paper doesn't address defenses in detail. The authors note that "traditional anonymization frameworks like k-anonymity do not account for LLM-based threats." Writing style modification tools exist, but their robustness against this specific attack is unknown.

Why It Matters Now

The threat model here is not a state actor with a surveillance budget. It's a moderately resourced adversary - a stalker, a corporate intelligence firm, an authoritarian government with $4 per target to spend. The LLM agent pipeline the researchers describe isn't exotic: it's four API calls with reasoning enabled.

Nicholas Carlini - an Anthropic researcher and security expert whose work on frontier model capabilities has repeatedly surfaced uncomfortable capabilities before they were widely recognized - co-authored this paper. That institutional provenance matters. This isn't an adversarial red team looking for a headline. The researchers argue that publishing was the right call exactly because these capabilities are already available to anyone with API access.

Their recommendation is for platform designers and anonymization frameworks to update their threat models. Pseudonymity built on obscurity - the assumption that linking accounts is too labor-intensive to bother with - no longer holds against automated analysis.


The paper lands at an uncomfortable intersection: the same reasoning capabilities that make frontier models useful for synthesizing information, for research and analysis, turn out to be equally useful for aggregating an identity from scattered posts. That's not a bug in the architecture. It's an emergent consequence of building systems that understand language in context. What changes is the cost of rolling out that understanding against a specific person - and that cost is now measured in dollars, not weeks of investigator time.
