Bug-bounty / responsible disclosure for AI
Demonstrate that the organization operates a formalized, publicly accessible bug bounty or responsible disclosure program covering AI systems, with documented intake, triage, remediation, and communication processes that encourage external security research and timely vulnerability resolution.
Description
What this control does
A bug bounty or responsible disclosure program for AI systems provides external researchers, users, and ethical hackers with a structured, legal channel to report vulnerabilities, adversarial prompts, model manipulation techniques, data leakage issues, or unsafe outputs in deployed AI models and applications. The program defines scope (which AI systems are in-scope), safe harbor terms protecting reporters from legal action, submission procedures, triage workflows, reward structures (if applicable), and response timelines. This control matters because AI systems can exhibit emergent vulnerabilities that traditional testing alone often misses, and crowdsourced discovery helps identify prompt injection, jailbreaks, data poisoning vectors, and alignment failures before adversaries exploit them at scale.
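The program elements listed above can be treated as a checklist. The following is a minimal sketch, assuming a hypothetical `DisclosurePolicy` record that a reviewer fills in by hand from the published policy. The field names and the `missing_elements` helper are illustrative assumptions, not part of any standard or platform API. Rewards are left out of the required set because the control treats them as optional.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DisclosurePolicy:
    """Hypothetical record of what a published policy states."""
    in_scope_systems: list[str] = field(default_factory=list)
    safe_harbor_text: str = ""
    submission_channels: list[str] = field(default_factory=list)
    triage_workflow_documented: bool = False
    acknowledgment_sla_days: Optional[int] = None
    rewards_described: bool = False  # optional: only relevant to bounty programs


def missing_elements(policy: DisclosurePolicy) -> list[str]:
    """Return the required program elements the policy does not cover."""
    gaps = []
    if not policy.in_scope_systems:
        gaps.append("scope")
    if not policy.safe_harbor_text.strip():
        gaps.append("safe harbor")
    if not policy.submission_channels:
        gaps.append("submission procedure")
    if not policy.triage_workflow_documented:
        gaps.append("triage workflow")
    if policy.acknowledgment_sla_days is None:
        gaps.append("response timeline")
    return gaps


if __name__ == "__main__":
    # Illustrative draft policy that names a scope and a channel but nothing else.
    draft = DisclosurePolicy(
        in_scope_systems=["chat assistant API"],
        submission_channels=["security@example.com"],
    )
    print(missing_elements(draft))
    # ['safe harbor', 'triage workflow', 'response timeline']
```

A check like this only confirms that each element is present; whether the safe harbor wording or the scope definition is adequate still needs human review.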
Control objective
What auditing this proves
Demonstrate that the organization operates a formalized, publicly accessible bug bounty or responsible disclosure program covering AI systems, with documented intake, triage, remediation, and communication processes that encourage external security research and timely vulnerability resolution.
Associated risks
Risks this control addresses
- Threat actors exploiting undiscovered prompt injection or jailbreak techniques, which allow unauthorized AI behavior or data extraction, before internal teams identify them
- Model inversion or membership inference attacks that leak training data or personally identifiable information and remain undetected because no external testing takes place
- Legal action against well-intentioned security researchers, which discourages future vulnerability disclosures and damages organizational reputation
- Adversarial input techniques (e.g., universal adversarial perturbations, poisoned context) weaponized in the wild before defensive measures are developed
- Unsafe AI outputs (hallucinations, harmful content generation, bias exploitation) causing reputational or legal harm when discovered publicly without prior remediation opportunity
- Delayed or inadequate response to submitted vulnerabilities, leading to public disclosure before patches are deployed and widening the exploit window
- Lack of visibility into real-world attack patterns and novel threat vectors targeting production AI systems that automated testing and red-teaming may miss
Testing procedure
How an auditor verifies this control
- Obtain and review the published bug bounty or responsible disclosure policy document, verifying it explicitly includes AI systems, models, and APIs in the defined scope.
- Verify the policy includes safe harbor language protecting researchers from legal action under computer fraud, abuse, or intellectual property statutes when acting in good faith within program rules.
- Identify the public submission channels (dedicated email, web portal, or a platform such as HackerOne or Bugcrowd) and confirm they are operational by accessing the intake mechanism.
- Review the documented triage workflow, including assignment criteria, severity classification tailored to AI risks (e.g., prompt injection, model extraction, data leakage), and defined SLA timelines for acknowledgment and resolution.
- Select a sample of 5-10 vulnerability submissions received in the past 12 months and trace each through intake, triage, remediation tracking, and researcher communication records.
- Verify that at least one report in an AI-specific vulnerability class (e.g., adversarial prompting, model behavior manipulation, training data extraction) has been received and documented with remediation evidence.
- Interview the program owner or security team to confirm cross-functional coordination with AI engineering, data science, and legal teams for AI-related findings.
- Check for evidence of program metrics reporting (submissions received, time-to-triage, time-to-remediation, rewards paid if applicable) reviewed by management quarterly or more frequently.
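The sample tracing and metrics checks above can be partly automated once submission records are exported. The following is a minimal sketch, assuming a hypothetical `Submission` record and illustrative SLA targets in `SLA`. Real targets, severity labels, and field names would come from the organization's own policy and tracking system. It flags missed acknowledgment and remediation targets for each sampled report and summarizes median handling times.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional

# Assumed SLA targets per severity; replace with the values in the actual policy.
SLA = {
    "critical": {"ack": timedelta(days=1), "fix": timedelta(days=7)},
    "high": {"ack": timedelta(days=2), "fix": timedelta(days=30)},
    "low": {"ack": timedelta(days=5), "fix": timedelta(days=90)},
}


@dataclass
class Submission:
    report_id: str
    vuln_class: str  # e.g. "prompt injection", "model extraction"
    severity: str    # must be a key of SLA
    received: datetime
    acknowledged: Optional[datetime] = None
    remediated: Optional[datetime] = None


def sla_breaches(sub: Submission, as_of: datetime) -> list[str]:
    """List the SLA stages a report missed; open stages are measured up to as_of."""
    targets = SLA[sub.severity]
    breaches = []
    if (sub.acknowledged or as_of) - sub.received > targets["ack"]:
        breaches.append("acknowledgment")
    if (sub.remediated or as_of) - sub.received > targets["fix"]:
        breaches.append("remediation")
    return breaches


def _days(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 86400


def program_metrics(subs: list[Submission]) -> dict:
    """Summarize volume and median handling times over completed stages only."""
    ack = [_days(s.received, s.acknowledged) for s in subs if s.acknowledged]
    fix = [_days(s.received, s.remediated) for s in subs if s.remediated]
    return {
        "submissions": len(subs),
        "median_days_to_ack": median(ack) if ack else None,
        "median_days_to_fix": median(fix) if fix else None,
    }


if __name__ == "__main__":
    as_of = datetime(2024, 6, 1)
    sample = [
        Submission("R-1", "prompt injection", "high",
                   datetime(2024, 3, 1), datetime(2024, 3, 2), datetime(2024, 3, 21)),
        Submission("R-2", "training data extraction", "critical",
                   datetime(2024, 5, 1), datetime(2024, 5, 4)),
    ]
    for sub in sample:
        print(sub.report_id, sla_breaches(sub, as_of))
    print(program_metrics(sample))
```

Time-to-triage could be computed the same way from a triage timestamp if the tracking system records one. The output supports, but does not replace, reading the researcher communication records for each sampled report.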
Where this control is tested