
Selected work

Projects

A few representative builds.

  • 2026

    Reproducible benchmark measuring how published adversarial prompts perform against 2026-era LLMs and whether prompt-only defences measurably reduce their success, with cross-judge validation and bootstrap confidence intervals.

    • Python
    • Claude Sonnet 4.6
    • Llama 3.1 8B
    • Inspect AI
    • GitHub Actions
    • pytest
    • ruff
    • mypy
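
    The bootstrap confidence intervals mentioned above can be illustrated with a minimal sketch. This is not the project's code: the function name `bootstrap_ci`, the percentile method, the resample count, and the sample data are all assumptions chosen for illustration, using only the Python standard library.

    ```python
    import random
    from statistics import mean


    def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
        """Percentile bootstrap CI for the mean of binary outcomes.

        Hypothetical helper: `outcomes` holds 1 for a trial judged an
        attack success and 0 otherwise.
        """
        if not outcomes:
            raise ValueError("outcomes must be non-empty")
        rng = random.Random(seed)  # fixed seed keeps the run reproducible
        n = len(outcomes)
        # Resample with replacement and record the mean of each resample.
        means = sorted(
            mean(rng.choices(outcomes, k=n)) for _ in range(n_resamples)
        )
        # Pick the alpha/2 and 1 - alpha/2 percentiles, clamped to valid indices.
        lo_idx = max(0, int((alpha / 2) * n_resamples))
        hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
        return mean(outcomes), means[lo_idx], means[hi_idx]


    # Made-up data for illustration only: 12 successes in 40 trials.
    outcomes = [1] * 12 + [0] * 28
    point, low, high = bootstrap_ci(outcomes)
    print(f"success rate {point:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
    ```

    Under these assumptions, comparing the intervals with and without a defence gives a rough sense of whether an observed difference exceeds resampling noise.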