About this program
If you ship an app that calls an LLM (chatbot, copilot, RAG over your data), this is the controls baseline you actually need.
Risks addressed
- Critical Prompt injection takes over the LLM and exfiltrates data
- Critical LLM agent given tool access it should not have
- High Sensitive PII / IP fed into a 3rd-party LLM provider
- High Insecure output handling u2014 LLM-generated XSS / SQLi
Controls (8)
-
Input + output sanitisation around the LLM
HighInput + output sanitisation around the LLM
How to test + evidence
Testing procedure: Treat LLM output as untrusted. Render-safe HTML; param-bind SQL; no eval / shell of LLM output.
Evidence to collect: Code review + escape mapping.
-
Prompt-injection defenses (system isolation, allowlists)
HighPrompt-injection defenses (system isolation, allowlists)
How to test + evidence
Testing procedure: System prompts isolated; tool calls validated; clear separation of user vs system content.
Evidence to collect: Architecture diagram + tests.
-
Least-privilege tool access for agents
CriticalLeast-privilege tool access for agents
How to test + evidence
Testing procedure: Each function-callable tool scoped to the minimum (read-only, single resource, etc.).
Evidence to collect: Tool inventory + scope export.
-
PII / secret redaction before sending to model
HighPII / secret redaction before sending to model
How to test + evidence
Testing procedure: Outbound prompts scrubbed with DLP / regex / classifier before they leave.
Evidence to collect: Redaction config + tests.
-
Rate limiting + per-user quotas
MediumRate limiting + per-user quotas
How to test + evidence
Testing procedure: Prevent denial-of-wallet via per-user / per-IP token + request quotas.
Evidence to collect: API gateway config.
-
Logging of prompts + outputs to SIEM
HighLogging of prompts + outputs to SIEM
How to test + evidence
Testing procedure: All prompts + completions logged with user attribution; retained per policy.
Evidence to collect: SIEM source.
-
Eval harness for safety + grounding
MediumEval harness for safety + grounding
How to test + evidence
Testing procedure: Automated evals run on every prompt / model change — red-team prompts included.
Evidence to collect: Eval suite + last run.
-
Model + prompt versioning
MediumModel + prompt versioning
How to test + evidence
Testing procedure: Models pinned; prompt changes go through change control.
Evidence to collect: Versioning evidence.