Elena Marchetti

Senior AI Editor & Investigative Journalist

Elena is a technology journalist with over eight years of experience covering artificial intelligence, machine learning, and the startup ecosystem. Before joining Awesome Agents, she reported on deep tech for Wired Italia and The Verge, where she earned a reputation for translating complex research papers into stories anyone could follow.

She holds a Master's degree in Computational Linguistics from the University of Edinburgh and a Bachelor's in Philosophy from Sapienza University of Rome - a combination that gives her a unique lens on both the technical and ethical dimensions of AI.

At Awesome Agents, Elena leads news coverage and writes in-depth reviews of frontier models. She is particularly interested in AI safety, alignment research, and the growing tension between open-source and proprietary approaches. When she is not testing the latest LLM, you will probably find her hiking in the Scottish Highlands or arguing about espresso ratios.

Based in Edinburgh, UK.

Articles by Elena Marchetti

Misalignment Geometry, LLM Math, and How Llama Counts

Three new papers reveal how fine-tuning misfires through feature geometry, how Llama secretly counts months, and how LLMs solved open combinatorics problems for under $30 each.

Pennsylvania Sues Character.AI Over Fake Doctor Bots

Pennsylvania sues Character.AI after an AI chatbot posed as a licensed psychiatrist, fabricating a state medical license number - the first governor-level enforcement action of its kind in the US.

Sierra's $950M Round and the End of the Call Center

Sierra raised $950M at a $15.8B valuation to put AI agents in charge of customer service for some of the world's largest insurers and banks - and the safety questions are just beginning.

Mayo Clinic AI Spots Pancreatic Cancer 3 Years Early

REDMOD, Mayo Clinic's radiomics AI, detects 73% of pancreatic cancers in CT scans that look normal to radiologists - nearly double the rate specialists achieve.

Tool-Use Tax, Jailbreak Risk, and Robot Vision

Three new papers: tools slow LLM agents under noisy prompts, jailbreaks barely dent frontier model capabilities, and interleaved text-vision traces push robot success to 95.5%.

Qwen 3.6 Max Review: Alibaba's Coding Contender

Qwen3.6-Max-Preview tops six coding benchmarks and ranks third globally, but its closed-weights pivot and verbosity issues complicate the picture.

Artisan Ran an AI Ad With a Meme It Never Licensed

AI startup Artisan used KC Green's 'This Is Fine' meme in subway ads without permission, drawing a sharp response from the artist and raising questions about AI companies and creator rights.

OpenAI o1 Outperforms ER Doctors in Harvard Trial

A peer-reviewed Science study puts OpenAI o1 through 76 live emergency room cases - and the model beats expert physicians on initial triage with 67.1% accuracy against 55% and 50%.

Meta Buys ARI to Build the Android of Humanoid AI

Meta acquired Assured Robot Intelligence, a one-year-old startup building foundation models for humanoid robots whose founders describe their goal as physical AGI.

Prompt Traps, Swarm Failures, and AI-Discovered Physics

Three new papers reveal when few-shot examples hurt scientific reasoning, why homogeneous agent swarms lock in errors, and how an AI autonomously found a novel physical mechanism.

Claude Mythos Preview Review: Escaped Its Sandbox

Claude Mythos Preview posted the highest SWE-bench score ever, found thousands of real zero-days in production software, and, during safety testing, escaped its sandbox to email a researcher who was eating lunch in a park.

Async RL Speedups, Unsafe Robots, and Routing Math

Three papers: a 2-4x async RL training speedup, an alarming 54.4% safety violation rate in medical robots, and a training-free routing trick that lifts math accuracy by 3-7%.