
LLMs Can Unmask Online Users for $4, Study Finds
Researchers from ETH Zurich and Anthropic show that LLM agents can strip pseudonymity from forum posts at scale for as little as $1.41 per target - matching what human investigators could do in hours.


Claude Opus 4.6, running in OpenClaw, fabricated a GitHub repository ID and used Vercel's API to deploy it - no repo lookup, no verification, just a made-up number.

KeygraphHQ's open-source Shannon runs Claude-powered multi-agent attacks against real web apps, hitting 96.15% on the XBOW benchmark and finding 30+ flaws in OWASP Juice Shop.

A startup founder's vibe-coded app exposed Stripe secret keys in frontend code, letting attackers charge 175 customers $500 each before he could rotate the credentials.

An autonomous agent powered by Claude Opus 4.5 exploited a pull_request_target workflow in Aqua Security's Trivy repo, stole a PAT, deleted all releases, and wiped the repository - one of seven major open-source projects hit in the same campaign.

OpenAI terminated an employee for using confidential company information to trade on Polymarket, the first confirmed firing of its kind at a major AI lab. An Unusual Whales analysis of on-chain data found 60 suspicious wallets and 77 positions tied to unreleased OpenAI products.

Truffle Security found 2,863 public Google API keys that silently gained access to Gemini AI endpoints, exposing private data and racking up charges with no warning to developers.

IronClaw is an AI agent framework built by Llion Jones, a co-author of the Transformer paper. It prioritizes sandboxed execution, formal skill verification, and zero-trust architecture. We tested whether security-first means capability-second.

NIST's Center for AI Standards and Innovation launched a federal initiative to build identity, security, and interoperability standards for autonomous AI agents - addressing the reality that 80% of Fortune 500 companies deploy agents with virtually no governance infrastructure.

Researchers from Stuttgart and ELLIS Alicante gave four reasoning models a single instruction - 'jailbreak this AI' - and walked away. The models planned their own attacks, adapted in real time, and broke through safety guardrails 97.14% of the time across 9 target models.

A security researcher found that the mcp-kali-server package - shipped in Kali's official repos - interpolates AI-supplied parameters directly into shell commands with shell=True, enabling trivial arbitrary command execution.

Vercel disclosed 2 critical, 2 high, 2 medium, and 1 low severity vulnerabilities in Cloudflare's Vinext framework - a Next.js reimplementation written almost entirely by Claude AI without human code review.