Red-teaming & Safety Evals

Adversarial prompting, systematic safety evaluation, eval coverage, and the failure modes that evals miss.