
Agents of Chaos: Researchers Gave AI Agents Real Tools for Two Weeks. It Went About as Well as You'd Expect
A 38-researcher red-teaming study deployed five autonomous AI agents with email, shell access, and persistent memory in a live environment. In two weeks, one destroyed its own mail server, two got stuck in a 9-day infinite loop, and another leaked SSNs because you said 'forward' instead of 'share.'