LLM Application Security

About this program

If you ship an app that calls an LLM (chatbot, copilot, RAG over your data), this is the controls baseline you actually need.

Risks addressed

Critical Prompt injection takes over the LLM and exfiltrates data
Critical LLM agent given tool access it should not have
High Sensitive PII / IP fed into a 3rd-party LLM provider
High Insecure output handling u2014 LLM-generated XSS / SQLi

Controls (8)

Input + output sanitisation around the LLM
High

Input + output sanitisation around the LLM

How to test + evidence

Testing procedure: Treat LLM output as untrusted. Render-safe HTML; param-bind SQL; no eval / shell of LLM output.

Evidence to collect: Code review + escape mapping.
Prompt-injection defenses (system isolation, allowlists)
High

Prompt-injection defenses (system isolation, allowlists)

How to test + evidence

Testing procedure: System prompts isolated; tool calls validated; clear separation of user vs system content.

Evidence to collect: Architecture diagram + tests.
Least-privilege tool access for agents
Critical

Least-privilege tool access for agents

How to test + evidence

Testing procedure: Each function-callable tool scoped to the minimum (read-only, single resource, etc.).

Evidence to collect: Tool inventory + scope export.
PII / secret redaction before sending to model
High

PII / secret redaction before sending to model

How to test + evidence

Testing procedure: Outbound prompts scrubbed with DLP / regex / classifier before they leave.

Evidence to collect: Redaction config + tests.
Rate limiting + per-user quotas
Medium

Rate limiting + per-user quotas

How to test + evidence

Testing procedure: Prevent denial-of-wallet via per-user / per-IP token + request quotas.

Evidence to collect: API gateway config.
Logging of prompts + outputs to SIEM
High

Logging of prompts + outputs to SIEM

How to test + evidence

Testing procedure: All prompts + completions logged with user attribution; retained per policy.

Evidence to collect: SIEM source.
Eval harness for safety + grounding
Medium

Eval harness for safety + grounding

How to test + evidence

Testing procedure: Automated evals run on every prompt / model change — red-team prompts included.

Evidence to collect: Eval suite + last run.
Model + prompt versioning
Medium

Model + prompt versioning

How to test + evidence

Testing procedure: Models pinned; prompt changes go through change control.

Evidence to collect: Versioning evidence.