
SkillsBench Shows a $1 Model With Expert Guides Beats a $15 Model Without Them
A new benchmark of 84 real-world tasks across 11 domains finds that small AI models armed with human-written step-by-step guides outperform frontier models running blind. The catch: models cannot write these guides themselves.