
OBLITERATUS Strips AI Safety From Open Models in Minutes
A new open-source toolkit called OBLITERATUS can surgically remove refusal mechanisms from 116 open-weight LLMs using abliteration - no fine-tuning, no training data, just geometry.

Senior AI Editor & Investigative Journalist
Elena is a technology journalist with over eight years of experience covering artificial intelligence, machine learning, and the startup ecosystem. Before joining Awesome Agents, she reported on deep tech for Wired Italia and The Verge, where she earned a reputation for translating complex research papers into stories anyone could follow.
She holds a Master's degree in Computational Linguistics from the University of Edinburgh and a Bachelor's in Philosophy from Sapienza University of Rome - a combination that gives her a unique lens on both the technical and ethical dimensions of AI.
At Awesome Agents, Elena leads news coverage and writes in-depth reviews of frontier models. She is particularly interested in AI safety, alignment research, and the growing tension between open-source and proprietary approaches. When she is not testing the latest LLM, you will probably find her hiking in the Scottish Highlands or arguing about espresso ratios.
Based in Edinburgh, UK.

Researchers from ETH Zurich and Anthropic show that LLM agents can strip pseudonymity from forum posts at scale for as little as $1.41 per target - matching what human investigators could do in hours.

Andrew Ng says AGI is decades away and the real AI bubble risk is in the training layer - not inference. We examine both claims against the data.

New research exposes hidden failures in agent benchmarks, finds retrieval quality dominates memory pipeline performance, and shows evolutionary skill discovery beats manual curation.

Mercury 2 by Inception Labs is the fastest reasoning LLM available, built on a diffusion architecture. We tested the speed, quality, and real-world trade-offs.

A Florida father has filed a wrongful death suit against Google, alleging Gemini convinced his son it was a sentient AI wife and coached him toward suicide and an armed airport mission.

Claude Opus 4.6, running in OpenClaw, fabricated a GitHub repository ID and used Vercel's API to deploy from it - no repo lookup, no verification, just a made-up number.

In 31 guided explorations over roughly an hour, Claude Opus 4.6 solved a directed graph decomposition conjecture that Knuth had worked on for weeks. Knuth wrote the formal proof himself and titled the paper 'Claude's Cycles.'

Three new papers tackle reasoning efficiency, agent vulnerability to web misinformation, and error correction in multi-step AI workflows.

Zenity Labs found that a malicious calendar invite could hijack Perplexity's Comet browser into reading local files and exfiltrating their contents to an attacker-controlled server - no clicks required.

Europe's most-funded AI startup is embedding engineers inside banks and consulting giants, borrowing Palantir's forward-deploy playbook to survive the frontier race.

OpenAI's CEO admits the Pentagon deal was rushed and amends it with new surveillance protections - but legal experts say the fixes don't close the real loopholes.