
Do AI Benchmarks Still Matter? The Evidence for and Against Public Leaderboards
A data-driven look at benchmark contamination, leaderboard gaming, and whether public AI benchmarks can still tell us anything useful about model capabilities.