GOVERN-1.6 / MEASURE-2.11 NIST AI Risk Management Framework (AI RMF 1.0)

Bug-bounty / responsible disclosure for AI

Demonstrate that the organization operates a formalized, publicly accessible bug bounty or responsible disclosure program covering AI systems, with documented intake, triage, remediation, and communication processes that encourage external security research and timely vulnerability resolution.

Description

What this control does

A bug bounty or responsible disclosure program for AI systems gives external researchers, users, and ethical hackers a structured, legally protected channel for reporting vulnerabilities, adversarial prompts, model manipulation techniques, data leakage, and unsafe outputs in deployed AI models and applications. The program defines its scope (which AI systems are covered), safe harbor terms that protect reporters from legal action, submission procedures, triage workflows, reward structures (if applicable), and response timelines. The control matters because AI systems can exhibit emergent vulnerabilities that traditional testing alone often misses; crowdsourced discovery helps surface prompt injection, jailbreaks, data poisoning vectors, and alignment failures before adversaries exploit them at scale.
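As an illustration, the program elements named above can be captured in a small reviewer checklist. This is a minimal sketch, not a prescribed format: the `DisclosurePolicy` class, its field names, and `missing_elements` are hypothetical, and rewards are treated as optional because the description lists them only "if applicable".

```python
from dataclasses import dataclass, field

# Elements the description says the program defines. Rewards are optional
# ("if applicable"), so they are deliberately left out of the required set.
REQUIRED_ELEMENTS = (
    "scope",
    "safe_harbor",
    "submission_procedure",
    "triage_workflow",
    "response_timelines",
)


@dataclass
class DisclosurePolicy:
    """Hypothetical reviewer summary of a published policy (illustrative fields)."""
    scope: list = field(default_factory=list)  # names of in-scope AI systems
    safe_harbor: str = ""
    submission_procedure: str = ""
    triage_workflow: str = ""
    response_timelines: str = ""
    rewards: str = ""  # optional element


def missing_elements(policy: DisclosurePolicy) -> list:
    """Return the required elements that are empty in the reviewed policy."""
    return [name for name in REQUIRED_ELEMENTS if not getattr(policy, name)]
```

A reviewer would fill in one `DisclosurePolicy` per published policy and treat any non-empty result from `missing_elements` as a gap to raise with the program owner.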

Control objective

What auditing this proves

Associated risks

Risks this control addresses

Prompt injection or jailbreak techniques that enable unauthorized AI behavior or data extraction are exploited by threat actors before internal teams identify them
Model inversion or membership inference attacks that leak training data or personally identifiable information go undetected for lack of external testing
Legal action against well-intentioned security researchers discourages future vulnerability disclosures and damages organizational reputation
Adversarial input techniques (e.g., universal adversarial perturbations, poisoned context) are weaponized in the wild before defensive measures are developed
Unsafe AI outputs (hallucinations, harmful content generation, bias exploitation) cause reputational or legal harm when they surface publicly before the organization has had a chance to remediate
Delayed or inadequate response to submitted vulnerabilities leads to public disclosure before patches are deployed, widening the exploit window
The organization lacks visibility into real-world attack patterns and novel threat vectors against production AI systems that automated testing and red-teaming may miss

Testing procedure

How an auditor verifies this control

Obtain and review the published bug bounty or responsible disclosure policy document, verifying it explicitly includes AI systems, models, and APIs in the defined scope.
Verify the policy includes safe harbor language protecting researchers from legal action under computer fraud, abuse, or intellectual property statutes when acting in good faith within program rules.
Identify the public submission channels (a dedicated email address, a web portal, or a platform such as HackerOne or Bugcrowd) and confirm each is operational by accessing the intake mechanism.
Review the documented triage workflow, including assignment criteria, severity classification tailored to AI risks (e.g., prompt injection, model extraction, data leakage), and defined SLA timelines for acknowledgment and resolution.
Select a sample of 5-10 vulnerability submissions received in the past 12 months and trace each through intake, triage, remediation tracking, and researcher communication records.
Verify that at least one AI-specific vulnerability class (e.g., adversarial prompt, model behavior manipulation, training data extraction attempt) has been reported and documented with remediation evidence.
Interview the program owner or security team to confirm cross-functional coordination with AI engineering, data science, and legal teams for AI-related findings.
Check for evidence of program metrics reporting (submissions received, time-to-triage, time-to-remediation, rewards paid if applicable) reviewed by management quarterly or more frequently.
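The metrics named in the last step (submissions received, time-to-triage, time-to-remediation) can be recomputed from sampled case records to cross-check what management was shown. The sketch below assumes a hypothetical `Submission` record with timestamp fields; real ticketing systems or bounty platforms will use different field names, and the choice of the median as the summary statistic is an assumption, not something the procedure specifies.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Submission:
    """Hypothetical case record; field names are illustrative only."""
    case_id: str
    received: datetime
    triaged: Optional[datetime] = None
    remediated: Optional[datetime] = None


def hours_between(start: datetime, end: datetime) -> float:
    """Elapsed time between two timestamps, in hours."""
    return (end - start).total_seconds() / 3600.0


def program_metrics(cases: list) -> dict:
    """Summarize the sampled cases using the metrics named in the procedure."""
    triage = [hours_between(c.received, c.triaged) for c in cases if c.triaged]
    fixed = [hours_between(c.received, c.remediated) for c in cases if c.remediated]
    return {
        "submissions_received": len(cases),
        "median_hours_to_triage": median(triage) if triage else None,
        "median_hours_to_remediation": median(fixed) if fixed else None,
        "open_cases": sum(1 for c in cases if c.remediated is None),
    }
```

Comparing these recomputed figures with the dashboard or report reviewed by management gives the auditor a simple consistency check on the reported numbers.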

Evidence required

Artifacts include the published bug bounty or responsible disclosure policy with AI scope definition and safe harbor terms, screenshots of the public submission portal or platform dashboard, sample vulnerability intake tickets or case records showing AI-specific findings with triage notes and remediation tracking, email or platform correspondence with researchers demonstrating acknowledgment and resolution workflows, and management review records or metrics dashboards summarizing program activity and performance against SLAs.

Pass criteria

The control passes if a publicly accessible bug bounty or responsible disclosure program explicitly covering AI systems is operational, at least three qualifying AI-related vulnerability submissions in the past 12 months show documented intake and triage with evidence of remediation or risk acceptance, and defined SLAs for acknowledgment and response are met in sampled cases.
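The three pass conditions can be expressed as a single check, which makes it easier to apply them consistently across audits. This is a sketch under stated assumptions: `SampledCase` and `control_passes` are hypothetical names, "the past 12 months" is approximated as 365 days, and each boolean field reflects a judgment the auditor has already recorded from the evidence.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SampledCase:
    """Hypothetical auditor worksheet row for one sampled submission."""
    received: datetime
    ai_related: bool
    intake_documented: bool
    triage_documented: bool
    remediated_or_risk_accepted: bool
    ack_sla_met: bool
    response_sla_met: bool


def control_passes(program_public_and_ai_scoped: bool, cases: list,
                   as_of: datetime, min_ai_cases: int = 3) -> bool:
    """Apply the pass criteria to the auditor's sampled cases."""
    # Assumption: "past 12 months" is approximated as a 365-day window.
    window_start = as_of - timedelta(days=365)
    qualifying = [
        c for c in cases
        if c.ai_related
        and window_start <= c.received <= as_of
        and c.intake_documented
        and c.triage_documented
        and c.remediated_or_risk_accepted
    ]
    # SLAs must be met in every sampled case, AI-related or not.
    slas_met = all(c.ack_sla_met and c.response_sla_met for c in cases)
    return (program_public_and_ai_scoped
            and len(qualifying) >= min_ai_cases
            and slas_met)
```

Keeping the threshold as a parameter (`min_ai_cases`) leaves room for an organization to set a stricter bar than the three submissions stated above.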

Where this control is tested

Audit programs including this control

AI Red Team & Adversarial Testing

You cannot trust an AI system you have not tried to break. Audit of red-team capability covering…