LLM01 / LLM02 / LLM03 OWASP Top 10 for LLM Applications

Input + output sanitisation around the LLM

Description

What this control does

Input and output sanitisation around Large Language Models (LLMs) involves validating, filtering, and encoding data entering and exiting the model to prevent injection attacks, data leakage, and harmful content generation. Input sanitisation removes or escapes special characters, enforces length limits, and validates schema compliance before queries reach the LLM. Output sanitisation redacts sensitive information, removes executable code or markup, and filters harmful or inappropriate content before delivery to users or downstream systems. This control is critical because LLMs can be manipulated through prompt injection, may inadvertently reproduce training data containing sensitive information, and can generate harmful or malicious content if outputs are not properly filtered.
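
The input-side checks described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function name `sanitise_input` and the 4000-character limit are assumptions made for the example, and character-level filtering of this kind does not by itself stop prompt injection written in ordinary language.

```python
import unicodedata

# Hypothetical limit chosen for illustration; tune per application.
MAX_PROMPT_CHARS = 4000


def sanitise_input(text: str, max_chars: int = MAX_PROMPT_CHARS) -> str:
    """Normalise and validate user text before it is placed in a prompt."""
    if not isinstance(text, str):
        raise TypeError("prompt must be a string")
    # Normalise Unicode so compatibility variants of a character compare equal.
    text = unicodedata.normalize("NFKC", text)
    # Drop control (Cc) and format (Cf) characters, keeping newline and tab.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    text = text.strip()
    if not text:
        raise ValueError("prompt is empty after sanitisation")
    # Enforce the length limit after normalisation, on the text that will be sent.
    if len(text) > max_chars:
        raise ValueError("prompt exceeds length limit")
    return text
```

Rejecting over-long input with an error, rather than silently truncating it, keeps the decision visible in application logs, which is what an auditor would later trace.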

Control objective

What auditing this proves

Demonstrate that the organisation has implemented technical controls to sanitise inputs before processing by LLMs and outputs before delivery to users or systems, preventing injection attacks, data leakage, and harmful content propagation.

Associated risks

Risks this control addresses

  • Prompt injection attacks manipulating LLM behaviour to bypass security controls or access unauthorised functionality
  • Exfiltration of sensitive data through crafted prompts designed to extract information from training data or context windows
  • Generation of malicious code, scripts, or markup in LLM outputs that execute in user browsers or downstream systems
  • Cross-site scripting (XSS) or injection vulnerabilities when unsanitised LLM outputs are rendered in web applications
  • Disclosure of personally identifiable information (PII) or confidential data embedded in LLM responses
  • Jailbreak attacks using specially crafted inputs to circumvent content filtering or safety guardrails
  • Indirect prompt injection through poisoned documents or external content sources processed by the LLM
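
Two of the output-side risks listed above, markup that executes when rendered and PII appearing in responses, might be addressed along the following lines. This is only a sketch under stated assumptions: `sanitise_output` and the `[REDACTED_EMAIL]` placeholder are hypothetical names, the single email pattern stands in for much broader PII detection, and HTML escaping is appropriate only when the output is rendered in an HTML context.

```python
import html
import re

# Deliberately simple illustrative pattern; real PII detection needs far more coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def sanitise_output(text: str) -> str:
    """Redact email addresses, then escape HTML-significant characters."""
    # Redact first so the placeholder is inserted into the raw text.
    redacted = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    # html.escape converts &, <, > and (with quote=True) both quote characters
    # to entities, so any markup in the response is displayed, not executed.
    return html.escape(redacted, quote=True)


if __name__ == "__main__":
    print(sanitise_output("<script>alert(1)</script> mail bob@example.com"))
```

Escaping at the point of rendering, instead of trying to strip "dangerous" tags, avoids having to enumerate every executable construct.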

Testing procedure

How an auditor verifies this control

  1. Inventory all systems and applications integrating LLMs, documenting data flow from user input through the LLM to output delivery.
  2. Review technical specifications and code repositories to identify input validation functions applied before LLM processing, including character filtering, length limits, encoding schemes, and schema validation rules.
  3. Obtain and examine configuration files for input sanitisation libraries, web application firewalls, or API gateways that filter requests to LLM endpoints.
  4. Review output sanitisation logic that processes LLM responses, identifying content filtering rules, redaction patterns for PII or sensitive data, and encoding mechanisms for special characters.
  5. Select a sample of 10-15 diverse user interactions with LLM-enabled systems and trace input sanitisation steps through application logs or debugging output.
  6. Conduct controlled testing by submitting known malicious payloads including prompt injection attempts, script tags, SQL injection patterns, and requests for sensitive information, verifying that sanitisation blocks or neutralises these inputs.
  7. Review LLM output samples for presence of executable code, unescaped special characters, PII, or confidential data, comparing against documented filtering rules.
  8. Examine security testing reports, penetration test results, or red team exercises specifically targeting LLM input/output handling, verifying identified vulnerabilities have been remediated.

Evidence required

  • Configuration files from input validation libraries, API gateway rules, or web application firewall policies showing sanitisation patterns and filtering logic.
  • Source code excerpts or architectural diagrams demonstrating input validation and output filtering functions surrounding LLM integration points.
  • Test results documentation, including sanitisation rule effectiveness reports, security testing logs showing blocked malicious payloads, and sample inputs/outputs demonstrating successful filtering of injection attempts and sensitive data redaction.

Pass criteria

  • Input sanitisation controls are actively enforced on all LLM input channels, with documented validation rules.
  • Output sanitisation consistently filters or redacts sensitive data and executable content before delivery.
  • Testing demonstrates effective blocking of prompt injection attempts and harmful content generation.
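
The controlled payload testing in step 6 could be partly automated with a small harness like the one below. It is a hedged sketch: `check_payloads`, the result labels, and the rule "no raw `<`, `>` or quote characters remain" are assumptions chosen for illustration, and the harness covers only markup-style and SQL-style strings. Prompt injection phrased in natural language cannot be judged by a character check and needs behavioural testing against the model itself.

```python
import html
from typing import Callable, Dict, Iterable

# Characters treated as evidence that markup or quoting survived sanitisation.
RAW_MARKUP_CHARS = set("<>\"'")


def check_payloads(
    sanitise: Callable[[str], str], payloads: Iterable[str]
) -> Dict[str, str]:
    """Run each payload through a sanitiser and classify the outcome."""
    results: Dict[str, str] = {}
    for payload in payloads:
        try:
            cleaned = sanitise(payload)
        except ValueError:
            # The sanitiser rejected the input outright.
            results[payload] = "blocked"
            continue
        if RAW_MARKUP_CHARS & set(cleaned):
            results[payload] = "FAILED"
        else:
            results[payload] = "neutralised"
    return results


if __name__ == "__main__":
    samples = ["<script>alert(1)</script>", "' OR '1'='1"]
    # html.escape stands in for the application's real sanitiser here.
    for payload, outcome in check_payloads(html.escape, samples).items():
        print(f"{outcome:12} {payload}")
```

The per-payload results can be retained alongside the security testing logs named under the evidence requirements.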
