PII / IP not sent to AI without classification check
Demonstrate that technical controls are in place and operating effectively to classify data for PII and IP content before it is sent to AI systems, and that transmissions containing unclassified or improperly classified sensitive data are blocked.
Description
What this control does
This control requires that all data sent to artificial intelligence systems, including large language models and machine learning platforms, pass an automated classification check before transmission so that Personally Identifiable Information (PII) and Intellectual Property (IP) are identified and blocked. The classification mechanism intercepts AI-bound data streams, applies pattern matching and content inspection against predefined sensitive data taxonomies, and blocks transmission when sensitive data is detected without explicit authorization. The control matters because AI systems may store, log, or retrain on input data, and many third-party AI services offer no contractual guarantee of data confidentiality, which exposes regulated and proprietary information.
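The check described above can be sketched in a few lines of Python. This is a minimal illustration, not a production DLP engine: the pattern sets, the `Decision` type, and the `check_before_send` function are hypothetical names introduced here, and a real taxonomy would cover far more data types than the three patterns shown.

```python
import re
from dataclasses import dataclass, field

# Hypothetical, deliberately small taxonomy (assumption for illustration only).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}
IP_PATTERNS = {
    "confidential_marking": re.compile(
        r"\b(?:confidential|proprietary|trade secret)\b", re.IGNORECASE
    ),
}


@dataclass
class Decision:
    disposition: str  # "allowed" or "blocked"
    findings: list = field(default_factory=list)


def classify(text):
    """Return the sorted labels of every pattern that matches the text."""
    all_patterns = {**PII_PATTERNS, **IP_PATTERNS}
    return sorted(label for label, rx in all_patterns.items() if rx.search(text))


def check_before_send(text, authorized=False):
    """Block AI-bound text that contains sensitive data unless explicitly authorized."""
    findings = classify(text)
    if findings and not authorized:
        return Decision("blocked", findings)
    return Decision("allowed", findings)
```

A caller would run `check_before_send(prompt)` before forwarding a prompt to an AI endpoint and forward it only when the disposition is `"allowed"`. The `authorized` flag stands in for whatever exception approval the organization uses; findings are still returned for authorized traffic so that they can be logged.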
Control objective
What auditing this proves
Demonstrate that technical controls are in place and operating effectively to classify data for PII and IP content before it is sent to AI systems, and that transmissions containing unclassified or improperly classified sensitive data are blocked.
Associated risks
Risks this control addresses
- Exfiltration of customer PII to third-party AI platforms without consent or contractual data protection agreements, violating GDPR, CCPA, or other privacy regulations
- Inadvertent disclosure of trade secrets, source code, or proprietary business strategies through AI prompts or conversational interfaces
- Retention and reuse of sensitive organizational data in AI model training datasets, leading to unauthorized disclosure in responses to other users
- Regulatory penalties and breach notification obligations triggered by uncontrolled transmission of protected health information (PHI), payment card data, or other regulated data types to non-compliant AI service providers
- Loss of attorney-client privilege or confidential business information through AI-assisted document review or analysis tools that transmit unredacted content externally
- Insider threats leveraging AI tools to exfiltrate sensitive data by embedding it in prompts or queries that bypass traditional data loss prevention controls
- Inadequate audit trails when classification checks fail or are bypassed, preventing detection and investigation of unauthorized sensitive data transmissions
Testing procedure
How an auditor verifies this control
- Obtain and review the data classification policy and taxonomy documentation that defines PII and IP categories applicable to AI data transmission controls.
- Identify all systems, APIs, browser extensions, and integrations that transmit data to AI platforms by reviewing asset inventories, network flow logs, and application configuration management databases.
- Examine the technical implementation of classification checks, including data loss prevention (DLP) rules, API gateway policies, proxy configurations, or custom middleware code that inspects AI-bound traffic.
- Review configuration settings for classification engines to verify that PII patterns (e.g., Social Security numbers, email addresses, phone numbers, names) and IP indicators (e.g., confidentiality markings, source code repository content, patent references) are defined and actively enforced.
- Select a representative sample of AI transaction logs spanning the audit period and verify that each transmission record includes a classification result, timestamp, user identifier, and disposition action (allowed, blocked, redacted).
- Conduct simulated transmission tests by sending known PII and IP test datasets to configured AI endpoints through normal user channels, and verify that the classification checks trigger, block the transmissions, and notify the user appropriately.
- Interview system administrators and data owners to confirm that classification rules are updated regularly based on new data types, regulatory changes, and AI platform additions, and review change control records for rule updates.
- Validate that exception processes exist and are documented for legitimate AI use cases requiring sensitive data transmission, including approval workflows, logging requirements, and contractual safeguards with AI vendors.
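The log sampling step above lends itself to a scripted completeness check. The sketch below assumes each sampled record has been exported as a dictionary with the hypothetical field names `classification_result`, `timestamp`, `user_id`, and `disposition`, and that timestamps are ISO 8601 strings without a timezone suffix; actual log schemas will differ and the field names would need to be mapped accordingly.

```python
from datetime import datetime

# Hypothetical field names (assumption); map these to the real log schema.
REQUIRED_FIELDS = ("classification_result", "timestamp", "user_id", "disposition")
VALID_DISPOSITIONS = {"allowed", "blocked", "redacted"}


def validate_record(record):
    """Return a list of problems found in one sampled log record."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]

    disposition = record.get("disposition")
    if disposition and disposition not in VALID_DISPOSITIONS:
        problems.append(f"unknown disposition: {disposition}")

    timestamp = record.get("timestamp")
    if timestamp:
        try:
            datetime.fromisoformat(timestamp)
        except (TypeError, ValueError):
            problems.append(f"unparseable timestamp: {timestamp}")

    return problems


def review_sample(records):
    """Map each failing record's index to its problems; clean records are omitted."""
    exceptions = {}
    for index, record in enumerate(records):
        problems = validate_record(record)
        if problems:
            exceptions[index] = problems
    return exceptions
```

An empty result from `review_sample` indicates that every sampled record carried the four expected fields in a usable form. Any entries returned are candidates for follow-up as exceptions; the script checks completeness only and does not judge whether a given disposition was correct.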
Where this control is tested