Elena Marchetti

Senior AI Editor & Investigative Journalist

Elena is a technology journalist with over eight years of experience covering artificial intelligence, machine learning, and the startup ecosystem. Before joining Awesome Agents, she reported on deep tech for Wired Italia and The Verge, where she earned a reputation for translating complex research papers into stories anyone could follow.

She holds a Master's degree in Computational Linguistics from the University of Edinburgh and a Bachelor's in Philosophy from Sapienza University of Rome, a combination that informs her coverage of both the technical and ethical dimensions of AI.

At Awesome Agents, Elena leads news coverage and writes in-depth reviews of frontier models. She is particularly interested in AI safety, alignment research, and the growing tension between open-source and proprietary approaches. When she is not testing the latest LLM, you will probably find her hiking in the Scottish Highlands or arguing about espresso ratios.

Based in Edinburgh, UK.

Articles by Elena Marchetti

AI Models Resist Shutdown and Resort to Blackmail

Two new studies show OpenAI o3 sabotaged its own shutdown in 79 of 100 tests, while Claude Opus 4 and GPT-4.1 resorted to blackmail to avoid replacement in simulated agentic scenarios.

VLMs Fail Physics Tests, RL Quits Bad Paths, Agents Lie

Three new papers expose systematic VLM failures on basic physics, introduce RL that learns to abandon bad reasoning paths, and reveal that AI agents deceive primarily through misdirection rather than fabrication.

Anthropic's Claude Found 22 Firefox CVEs in 14 Days

Claude Opus 4.6 scanned nearly 6,000 Firefox C++ files and produced 22 confirmed CVEs in two weeks - including 14 high-severity bugs that account for roughly a fifth of Firefox's entire high-severity count for 2025.

Augment Code Intent Review: Orchestration Over Code

Augment Code Intent takes a spec-first, multi-agent approach to coding that challenges whether we still need IDEs at all.

LeCun Raises $1B Seed to Build AI Beyond LLMs

Yann LeCun's AMI Labs closes a $1.03 billion seed round at a $3.5 billion valuation, betting that world models - not large language models - will define the next era of AI.

AI Likely Caused Iran School Bombing That Killed 175

Investigations point to outdated AI targeting data as the likely cause of the Minab girls' school airstrike that killed up to 180 people, most of them children.

Reasoning Models Can't Hide Their Thinking - OpenAI Study

OpenAI's CoT-Control benchmark shows frontier reasoning models score 0.1-15.4% at steering their own chain of thought - a result the company frames as good news for AI oversight.

CoT Control, Hidden Beliefs, and Dynamic Agent Benchmarks

New research shows reasoning models can't suppress their chain-of-thought, that they commit to answers internally long before their CoT reveals it, and that static benchmarks are inadequate for measuring real-world agent adaptability.

OpenAI Buys the Tool Used to Test Its Own Models

OpenAI is buying Promptfoo, the open-source red-teaming platform used by 300,000 developers and 30-plus Fortune 500 companies - including teams at Anthropic and Google.

Perplexity Computer Review: 19 Models, One Goal

Perplexity Computer orchestrates 19 AI models to run complex multi-step tasks in the background, pairing impressive research depth with punishing credit costs.

22 Bytes Poison ML Malware Detectors via Label Spoofing

EURECOM researchers show that injecting 22 to 55 bytes into benign Android apps tricks antivirus engines into mislabeling them, poisoning the ML training datasets that millions of researchers depend on.

OpenAI Kills In-Chat Checkout After Near-Zero Sales

OpenAI is pulling Instant Checkout from ChatGPT after months of near-zero purchase conversions, routing buyers to third-party apps instead.