The attack surface starts before agents: the LLM integration trust boundary

By Tamar Peretz

What this page is: a vendor-agnostic threat-modeling worksheet for the earliest trust boundary where LLM I/O can touch production systems.
What this page is not: a claim about LLM internals or any specific vendor trace.

Executive summary

Before you adopt an agent framework, many high-leverage security controls are decided at the LLM integration trust boundary: the first interface where model inputs/outputs can (a) read production data, (b) write to production systems, or (c) enter production observability (logs/telemetry/traces).

This page treats that interface as a trust boundary and maps it to OWASP GenAI LLM Top 10 (2025) risk categories.

How to use this worksheet

1) Fill the read paths table with every source that can enter context (including retrieval and tool outputs).
2) Fill the write paths table with every sink where model outputs can land (including observability and persistence).
3) For each boundary crossing, record the owner, the server-side enforcement point, and the minimum audit evidence required to reconstruct an incident.
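
The three steps above can be captured as a small machine-checkable record per boundary crossing. This is an illustrative sketch, not a prescribed schema; all field names are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class BoundaryCrossing:
    """One worksheet row: a single place where model I/O crosses into
    production data, production systems, or production observability."""
    name: str                 # e.g. "retrieved docs -> context window"
    direction: str            # "read" (into model) or "write" (out of model)
    trust_level: str          # "untrusted" or "trusted (system)"
    owner: str                # team accountable for this crossing
    enforcement_point: str    # server-side component where policy is checked
    audit_evidence: list[str] = field(default_factory=list)  # minimum evidence to reconstruct an incident

def incomplete(crossings: list[BoundaryCrossing]) -> list[str]:
    """Return the names of crossings still missing an owner, an enforcement
    point, or audit evidence (step 3 of the worksheet)."""
    return [c.name for c in crossings
            if not (c.owner and c.enforcement_point and c.audit_evidence)]
```

Keeping the worksheet in a structured form like this lets you lint it in CI: any crossing without an owner, enforcement point, and evidence fails the check.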

Scope and evidence boundary

This article is a threat-modeling guide for a specific control point before agent frameworks: the earliest interface where LLM I/O can touch production data or production observability.

It does not claim mechanism-level properties about LLMs. Where risk categories are referenced, they are pinned to OWASP GenAI LLM Top 10 (2025) (see References).

Definition: the LLM integration trust boundary

LLM integration trust boundary = the first interface where LLM inputs/outputs can read from or write to:

1) production data (databases, documents, tickets, email, retrieval indexes),
2) production systems (tools, internal APIs, routing, feature flags), or
3) production observability (logs, telemetry, traces, analytics).

Operationally, this is the trust boundary between model I/O and production systems.

Why this boundary matters (even without agents)

OWASP’s LLM Top 10 (2025) includes risks that apply to non-agent LLM apps when model I/O is connected to real systems:

1) LLM01: Prompt Injection, including indirect injection via retrieved or ingested content,
2) LLM02: Sensitive Information Disclosure, via outputs, logs, and stored artifacts,
3) LLM05: Improper Output Handling, when model output reaches downstream interpreters unvalidated,
4) LLM06: Excessive Agency, when tool access carries more scope or autonomy than the use case requires.

Threat scenarios at the trust boundary (protocol-level)

Scenario A — indirect prompt injection via external content + tool access

If the system ingests untrusted content (email/docs/web) and also enables tool calls, an attacker can place instruction-like payloads in that content.

OWASP’s Excessive Agency guidance includes mailbox-assistant scenarios where untrusted inputs can trigger sensitive-data access and exfiltration. Mitigations include minimizing extensions, least-privilege scopes, and requiring user approval for high-impact actions.
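
Those mitigations can be sketched as a server-side gate on tool calls. The tool names, allowlist, and approval flag below are illustrative assumptions, not any vendor's API.

```python
# Least-privilege gating for tool calls proposed by the model.
HIGH_IMPACT = {"send_email", "delete_record"}   # actions requiring explicit user approval
ALLOWED_TOOLS = {"search_docs", "send_email"}   # allowlist scoped to this one use case

def gate_tool_call(tool: str, user_approved: bool) -> bool:
    """Server-side decision: deny tools outside the allowlist outright,
    and require human approval for high-impact actions."""
    if tool not in ALLOWED_TOOLS:
        return False          # not in this integration's scope at all
    if tool in HIGH_IMPACT:
        return user_approved  # human-in-the-loop for sensitive actions
    return True
```

The key design choice is that the decision never depends on model output: injected instructions in untrusted content cannot widen the allowlist or skip the approval step.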

Scenario B — improper output handling downstream

If model output is passed into:

1) a browser or UI without output encoding (stored/reflected XSS),
2) SQL, shell, or template interpreters (injection),
3) downstream services that trust its structure without validation,

then treat the output as untrusted input to that sink: encode, parameterize, and validate against an explicit schema before use.

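A minimal sketch of output handling before two common sinks, using only the standard library; the schema-check helper and its field names are assumptions for illustration.

```python
import html
import json

def render_model_output(text: str) -> str:
    """Encode model output before it reaches a browser (mitigates XSS)."""
    return html.escape(text)

def parse_structured_output(raw: str, required: dict[str, type]) -> dict:
    """Validate model JSON against an explicit schema before any downstream use:
    every required field must exist with the right type, and no extras allowed."""
    data = json.loads(raw)                      # raises on malformed JSON
    for key, typ in required.items():
        if not isinstance(data.get(key), typ):
            raise ValueError(f"missing or mistyped field: {key}")
    extra = set(data) - set(required)
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    return data
```

For SQL sinks the analogous control is parameterized queries; the principle is the same: the sink, not the model, defines what is acceptable.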
Scenario C — sensitive data exposure via stored artifacts

If sensitive inputs/outputs are stored (logs/telemetry/analytics/memory/RAG indexes), the exposure surface includes retention, access control, and replay into future prompts.
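
One way to keep telemetry reconstructable without making it a sensitive-data sink is to redact before storage and keep only a hash of the raw payload. The redaction pattern below is a deliberately narrow example (emails only); real policies need broader coverage.

```python
import hashlib
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # illustrative pattern, not exhaustive

def log_safe(model_text: str) -> dict:
    """Build a telemetry record: redact obvious identifiers, and keep a
    payload hash so an incident can still be correlated later."""
    redacted = EMAIL.sub("[email]", model_text)
    return {
        "redacted_text": redacted,
        "payload_sha256": hashlib.sha256(model_text.encode()).hexdigest(),  # audit linkage without the raw payload
        "length": len(model_text),
    }
```

The hash gives operators a stable identifier for replay and deduplication analysis without retaining the sensitive text itself.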

Mapping worksheet

1) Read paths into the model (inputs)

Document every source that can enter context:

| Source | Trust level | Sensitivity | Transformations before model | Notes |
| --- | --- | --- | --- | --- |
| User message | Untrusted | Varies | Redaction? | |
| Retrieved docs / web | Untrusted | Varies | Filtering / allowlist | |
| Tickets/CRM/email summaries | Untrusted (default) | Often sensitive | Redaction + minimization | |
| Database reads | Trusted (system) | Often sensitive | Field-level selection | |
| Tool outputs (if re-injected) | Untrusted (default) | Varies | Sanitization + provenance tags | |
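
The "provenance tags" row above can be sketched as a wrapper applied before untrusted content re-enters the context window. The delimiter scheme here is an assumption, and delimiters alone do not neutralize injection; the tag exists so server-side policy can identify the span as data rather than instructions.

```python
def tag_untrusted(source: str, content: str) -> str:
    """Wrap retrieved/tool content with a provenance marker before it is
    re-injected into context. Policy downstream treats tagged spans as data only."""
    return f"<untrusted source={source!r}>\n{content}\n</untrusted>"
```
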

2) Write paths from the model (outputs)

Document where outputs can land:

| Sink | Persisted? | Retention/TTL | Readers | Replay into prompts? | Controls |
| --- | --- | --- | --- | --- | --- |
| Product UI | No/Yes | | End user | Maybe | Output policies |
| Logs / telemetry / traces | Yes | Defined TTL | Operators | Possible | Redaction + access controls |
| Analytics events | Yes | Defined TTL | Analysts | Possible | Minimization |
| Memory / context store | Yes | Defined TTL | System | Yes | Scoped + gated writes |
| Tools / internal APIs | Yes | | Systems | | Server-side authz + validation |
| Routing / feature flags | Yes | | System | Yes | Deterministic gating |
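
"Scoped + gated writes" for a memory/context store can be sketched as a deterministic policy the model cannot widen. The allowlisted keys and size bound below are illustrative assumptions.

```python
# Gated write path for a memory/context store: the write decision is
# deterministic server-side policy, never a model decision.
ALLOWED_MEMORY_KEYS = {"user_language", "ui_theme"}  # narrow, non-sensitive scope

def gated_memory_write(store: dict, key: str, value: str) -> bool:
    """Persist only allowlisted, size-bounded values; reject everything else,
    so injected content cannot plant instructions for future prompts."""
    if key not in ALLOWED_MEMORY_KEYS or len(value) > 64:
        return False
    store[key] = value
    return True
```
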

3) Owner, enforcement, and audit evidence

For each boundary crossing, record:

1) the owner (the team accountable for the crossing),
2) the server-side enforcement point (where policy is actually checked),
3) the minimum audit evidence required to reconstruct an incident.

Minimum controls at the trust boundary (vendor-agnostic)

Control 1 — data policy for model-visible content (enforced)

Define:

1) which data classes may enter model context (and which never may),
2) the redaction and minimization rules applied before content becomes model-visible,
3) retention/TTL for any stored prompts, outputs, and intermediate artifacts.

Control 2 — server-side authorization and validation for any side effects

If outputs can influence tools, writes, routing, or flags:

1) authorize every side effect server-side against the end user's identity, never against anything the model asserts,
2) validate outputs against an explicit schema before they reach any interpreter or API,
3) gate high-impact actions deterministically, with user approval where the impact warrants it.

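A minimal sketch of the authorization step, assuming a hypothetical permissions store: the proposed action is checked against the end user's entitlements, not against anything the model claims about itself.

```python
# Hypothetical per-user entitlements; in practice this lives in your authz system.
USER_PERMISSIONS = {
    "alice": {"read:tickets"},
    "bob": {"read:tickets", "write:flags"},
}

def authorize_side_effect(user: str, action: str) -> bool:
    """Server-side check: the acting identity is the authenticated user on
    whose behalf the model runs, so the model cannot escalate by asserting roles."""
    return action in USER_PERMISSIONS.get(user, set())
```
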
Control 3 — treat retrieved content and tool outputs as untrusted data

Retrieved documents and tool outputs share the context window with instructions. Treat them as data: sanitize them, tag their provenance, and never let them change policy, scopes, or the set of available tools.

Control 4 — auditability without over-collection

You should be able to reconstruct:

1) which inputs entered context (by provenance tag or payload hash, not necessarily raw payloads),
2) which outputs produced which side effects, and through which enforcement point,
3) who or what approved any high-impact action.

Prefer hashes, identifiers, and provenance tags over raw payloads so the audit log does not itself become a new sensitive-data sink.

What you should have when done

1) a complete read-paths table: every source that can enter context, with trust level and transformations,
2) a complete write-paths table: every sink model output can reach, with persistence, retention, readers, and replay risk,
3) an owner, a server-side enforcement point, and minimum audit evidence for each boundary crossing,
4) the four minimum controls mapped to concrete enforcement points.

Copy/paste checklist

[ ] Every context source is listed with trust level and required transformations
[ ] Every output sink is listed with persistence, retention, readers, and replay risk
[ ] Every side effect is authorized and validated server-side
[ ] Retrieved content and tool outputs are treated as untrusted data
[ ] Logs/telemetry/analytics redact model-visible sensitive data
[ ] Each boundary crossing has an owner, an enforcement point, and audit evidence

Suggested reading

References (pinned)

OWASP GenAI (2025):

OWASP (legacy v1.1 — numbering differs from the GenAI 2025 list):

OWASP cheat sheets:

OpenAI:

NIST: