Adversarial robustness

LLM Jailbreak and Red-Team Resistance Leaderboard

LLM Jailbreak and Red-Team Resistance Leaderboard

Rankings of 14 frontier LLMs by adversarial robustness - how well they resist jailbreaks, prompt injection, and harmful-behavior elicitation across HarmBench, AdvBench, StrongREJECT, JailbreakBench, and AgentHarm.