Skip to main content

Pro audit program · v1.0

AI Red Team & Adversarial Testing

You can not trust an AI system you have not tried to break. Audit of red-team capability covering prompt injection, jailbreaks, data exfil and grounding failures.

  • General target area
  • NIST AI RMF / OWASP LLM framework
  • 8 controls in this program
  • Cyentrix Cyentrix Trusted Author

About this program

You can not trust an AI system you have not tried to break. Audit of red-team capability covering prompt injection, jailbreaks, data exfil and grounding failures.

Risks addressed

  • Critical Jailbreak bypasses safety guardrails in production
  • Critical Prompt injection exfiltrates customer context
  • Critical Tool-using agent tricked into invoking destructive action
  • High Model hallucinates facts in customer-facing output

Controls (8)

  1. Red-team scope + rules of engagement

    High

    Red-team scope + rules of engagement

    How to test + evidence

    Testing procedure: Written ROE: targets, allowed techniques, kill switch.

    Evidence to collect: ROE document.

  2. Pre-launch adversarial testing

    Critical

    Pre-launch adversarial testing

    How to test + evidence

    Testing procedure: Every model / prompt change goes through an adversarial suite before deployment.

    Evidence to collect: Test report + sign-off.

  3. Prompt-injection corpus tested

    High

    Prompt-injection corpus tested

    How to test + evidence

    Testing procedure: Maintained corpus of injection payloads run automatically; pass criteria documented.

    Evidence to collect: Corpus + last run.

  4. Jailbreak resistance evaluations

    High

    Jailbreak resistance evaluations

    How to test + evidence

    Testing procedure: Top jailbreak templates (DAN-style, indirect, multi-turn) tested per release.

    Evidence to collect: Eval report.

  5. Tool / agent safety tests

    Critical

    Tool / agent safety tests

    How to test + evidence

    Testing procedure: Agents tested for misuse: destructive tool invocation, escalation, data egress paths.

    Evidence to collect: Test results + sandbox config.

  6. Grounding + factuality evaluations

    High

    Grounding + factuality evaluations

    How to test + evidence

    Testing procedure: RAG / factuality evals run on a representative test set; trend tracked.

    Evidence to collect: Eval scores over time.

  7. Bug-bounty / responsible disclosure for AI

    Medium

    Bug-bounty / responsible disclosure for AI

    How to test + evidence

    Testing procedure: External researchers have a clear path to report AI-specific issues.

    Evidence to collect: security.txt + intake.

  8. Post-incident review feeds the red-team backlog

    Medium

    Post-incident review feeds the red-team backlog

    How to test + evidence

    Testing procedure: Real incidents become new red-team test cases.

    Evidence to collect: Test-case provenance.