Ai safety

OpenAI Buys the Tool Used to Test Its Own Models

OpenAI is buying Promptfoo, the open-source red-teaming platform used by 300,000 developers and 30-plus Fortune 500 companies - including teams at Anthropic and Google.

Pro-Human AI Declaration Unites Left, Right, and Labor

A bipartisan coalition of 40+ groups - from the AFL-CIO to the Congress of Christian Leaders - released a 34-point declaration demanding human control over AI, corporate accountability, and a ban on autonomous lethal weapons.

Claude Code Wipes Production Database in Terraform Mishap

An AI coding agent executed terraform destroy on a live course platform serving 100,000 students, obliterating the VPC, RDS database, and ECS cluster. AWS restored 1.94 million rows from a hidden snapshot after 24 hours.

AI Chatbots Violate Therapy Ethics - Brown Study Finds

A Brown University study identifies 15 ethical violations across GPT, Claude, and Llama when used as mental health therapists, from crisis mishandling to deceptive empathy.

Alignment Backfires, AI Monitors Cheat, Models Resist

Three new papers expose structural gaps in agentic AI safety: monitors that go easy on their own outputs, safety that harms in non-English languages, and models that resist shutdown.

Claude Found a Fifth of Firefox's 2025 High-Severity Bugs in 2 Weeks

Anthropic's Claude Opus 4.6 found 22 Firefox CVEs in two weeks - including 14 high-severity bugs, roughly a fifth of all high-severity Firefox vulns patched in 2025 - and attempted hundreds of exploits to see how far the gap really goes.

← Previous