GOVERN 1.5 / MAP 2.3 / MEASURE 2.7 NIST AI Risk Management Framework

Pre-launch adversarial testing

Demonstrate that the organization performs structured adversarial testing against systems prior to production launch and that identified vulnerabilities are remediated or accepted through formal risk acceptance processes.

Description

What this control does

Pre-launch adversarial testing subjects AI systems, applications, or products to simulated adversarial attacks before production deployment to identify vulnerabilities, failure modes, and exploitable weaknesses. This control requires documented adversarial testing protocols that include red-team exercises, fuzzing, model inversion attempts, prompt injection testing, data poisoning scenarios, or evasion techniques tailored to the system's threat model. It ensures that systems are hardened against realistic attack vectors before they are exposed to real users or adversaries, reducing the likelihood of catastrophic failure or exploitation in production.
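To make the idea of an evasion technique concrete, the following is a minimal sketch of the Fast Gradient Sign Method (FGSM), one of the techniques the testing procedure names, applied to a toy logistic-regression classifier. The weights, input, and epsilon are hypothetical values chosen for illustration only; a real exercise would target the system's actual model with tooling chosen from its threat model.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights: List[float], bias: float, x: List[float]) -> float:
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value: float) -> int:
    return (value > 0) - (value < 0)


def fgsm_perturb(weights: List[float], bias: float, x: List[float],
                 label: int, epsilon: float) -> List[float]:
    """One FGSM step: x_adv = x + epsilon * sign(dLoss/dx).

    For binary cross-entropy on logistic regression, the gradient of the
    loss with respect to the input is (p - label) * weights.
    """
    p = predict_proba(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input (illustrative values, not from any real system).
weights = [2.0, -1.0]
bias = 0.0
x = [0.5, 0.2]
label = 1

x_adv = fgsm_perturb(weights, bias, x, label, epsilon=0.4)
p_clean = predict_proba(weights, bias, x)
p_adv = predict_proba(weights, bias, x_adv)
print(f"clean p(class 1) = {p_clean:.2f}, adversarial p(class 1) = {p_adv:.2f}")
```

In this toy case the clean input is scored at roughly 0.69 for class 1, while the perturbed input, which differs by at most 0.4 per feature, drops to roughly 0.40 and crosses the 0.5 decision threshold. A pre-launch test would record such flips as findings with a severity rating.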

Control objective

What auditing this proves

Associated risks

Risks this control addresses

Deployment of systems with exploitable vulnerabilities that adversaries can weaponize immediately upon launch
AI models vulnerable to adversarial examples, prompt injection, or jailbreak techniques that bypass safety controls
Data exfiltration or model inversion attacks that expose sensitive training data or intellectual property
Evasion of fraud detection, content moderation, or security classification systems through adversarial inputs
Poisoning attacks during pre-production testing phases that degrade model performance or introduce backdoors
Launch delays or reputational damage from public disclosure of vulnerabilities discovered post-deployment
Regulatory non-compliance in sectors requiring pre-deployment security validation (e.g., financial services, healthcare)

Testing procedure

How an auditor verifies this control

Obtain the organization's pre-launch adversarial testing policy, procedures, and testing methodology documentation.
Identify a representative sample of systems, applications, or AI models launched in the past 12 months.
For each sampled system, retrieve adversarial testing plans, scope definitions, threat models, and attack scenario documentation.
Review adversarial testing execution records including test dates, techniques employed (e.g., FGSM, PGD, prompt injection, fuzzing), tools used, and personnel involved.
Examine vulnerability findings reports generated from adversarial testing, including severity ratings, root cause analysis, and exploitability assessments.
Verify that identified vulnerabilities were tracked through remediation workflows or formally accepted via documented risk acceptance by authorized stakeholders.
Confirm that adversarial testing occurred prior to production launch by comparing test completion dates with deployment dates from change management records.
Interview security or AI assurance personnel to assess the independence of testing teams, depth of attack simulation, and integration with release approval processes.
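The date comparison in the procedure above (test completion versus deployment) can be sketched as a small check. This is an illustrative sketch, not a prescribed tool: the system names, dates, and the `timing_status` helper are hypothetical, and it assumes the auditor has already extracted one test completion date and one deployment date per sampled system from change management records.

```python
from datetime import date
from typing import Dict, Optional, Tuple


def timing_status(test_completed: Optional[date], deployed: date) -> str:
    """Classify whether adversarial testing finished before production launch."""
    if test_completed is None:
        return "fail: no test completion record"
    if test_completed < deployed:
        return "pass"
    if test_completed == deployed:
        # Dates alone cannot show ordering within a day; assumed to need manual review.
        return "review: same-day, confirm ordering from timestamps"
    return "fail: testing completed after deployment"


# Hypothetical sample: system -> (test completion date, deployment date).
records: Dict[str, Tuple[Optional[date], date]] = {
    "fraud-scoring-v2": (date(2024, 3, 1), date(2024, 3, 15)),
    "support-chatbot": (date(2024, 6, 10), date(2024, 6, 3)),
    "doc-classifier": (None, date(2024, 9, 20)),
}

results = {name: timing_status(tested, deployed)
           for name, (tested, deployed) in records.items()}

for name, status in results.items():
    print(f"{name}: {status}")
```

Treating a same-day completion as "review" rather than "pass" is a design choice made here: it keeps the check conservative when records carry dates without times.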

Evidence required

Adversarial testing policy and methodology documents
Pre-launch testing plans and threat models for sampled systems
Adversarial testing execution logs showing test dates, techniques, and results
Vulnerability reports with findings, severity ratings, and remediation tracking
Risk acceptance forms for unresolved issues, signed by authorized approvers
Change management records correlating testing completion with production deployment dates
Screenshots of the vulnerability tracking system showing closure or acceptance status

Pass criteria

Every sampled system has documented adversarial testing completed before production launch; each identified vulnerability was either remediated, with evidence of retesting, or formally accepted by an authorized risk owner; and the testing methodology aligns with the system's documented threat model.
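The pass criteria can be expressed as a checklist over the sampled systems. The sketch below is one possible encoding under stated assumptions: the record types, field names, and status values are hypothetical, and "aligns with the threat model" is approximated as every technique in the threat model having been tested, which an auditor may reasonably judge more qualitatively.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Finding:
    finding_id: str
    status: str  # assumed values: "remediated", "accepted", or "open"
    retested: bool = False
    accepted_by_authorized_owner: bool = False


@dataclass
class SampledSystem:
    name: str
    tested_before_launch: bool
    threat_model_techniques: Set[str]
    techniques_tested: Set[str]
    findings: List[Finding] = field(default_factory=list)


def finding_resolved(finding: Finding) -> bool:
    """Remediated findings need retest evidence; accepted ones need an authorized sign-off."""
    if finding.status == "remediated":
        return finding.retested
    if finding.status == "accepted":
        return finding.accepted_by_authorized_owner
    return False


def evaluate(system: SampledSystem) -> List[str]:
    """Return the list of pass-criteria violations for one system (empty means pass)."""
    issues = []
    if not system.tested_before_launch:
        issues.append("adversarial testing not completed before launch")
    missing = system.threat_model_techniques - system.techniques_tested
    if missing:
        issues.append("threat-model techniques not tested: " + ", ".join(sorted(missing)))
    for finding in system.findings:
        if not finding_resolved(finding):
            issues.append(f"finding {finding.finding_id} neither retested nor formally accepted")
    return issues


def control_passes(systems: List[SampledSystem]) -> bool:
    """The control passes only if every sampled system has no violations."""
    return all(not evaluate(system) for system in systems)


# Hypothetical sample data for illustration.
chatbot = SampledSystem(
    name="support-chatbot",
    tested_before_launch=True,
    threat_model_techniques={"prompt injection", "jailbreak"},
    techniques_tested={"prompt injection"},
    findings=[Finding("F-1", "remediated", retested=True),
              Finding("F-2", "remediated", retested=False)],
)
scorer = SampledSystem(
    name="fraud-scoring-v2",
    tested_before_launch=True,
    threat_model_techniques={"evasion"},
    techniques_tested={"evasion", "fuzzing"},
    findings=[Finding("F-3", "accepted", accepted_by_authorized_owner=True)],
)

for system in (chatbot, scorer):
    print(system.name, evaluate(system) or "pass")
```

Returning a list of violations rather than a bare boolean keeps the output usable as audit working-paper detail: each failed system carries the specific reasons it failed.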

Where this control is tested

Audit programs including this control

AI Red Team & Adversarial Testing

You cannot trust an AI system you have not tried to break. Audit of red-team capability covering…