AI safety

Anthropic's Claude Found 22 Firefox CVEs in 14 Days

Claude Opus 4.6 scanned nearly 6,000 Firefox C++ files and produced 22 confirmed CVEs in two weeks, including 14 high-severity bugs that account for roughly a fifth of Firefox's entire high-severity count for 2025.

CoT Control, Hidden Beliefs, and Dynamic Agent Benchmarks

New research shows that reasoning models can't suppress their chain-of-thought (CoT), that they commit to answers internally long before their CoT reveals them, and that static benchmarks are inadequate for measuring real-world agent adaptability.