Agent security

Engineering writeups on security properties of LLM-powered agentic applications (tool-using agents): trust boundaries, authorization and access control, orchestration (control-flow mechanisms), and monitoring/policy enforcement.

Core articles

Agentic Systems: 8 Trust-Boundary Audit Checkpoints
A practical audit checklist of 8 trust checkpoints where untrusted artifacts can steer routing, tool use, and write-path actions in chained LLM systems.
Published: 2026-02-22
The attack surface starts before agents: the LLM integration trust boundary
Why agent-layer threat modeling is incomplete: the first high-leverage control point is the LLM integration trust boundary (before agent frameworks exist).
Published: 2026-02-22
Web-retrieved content is a prompt-injection boundary in tool-using LLM systems
Why retrieved web content must stay non-authoritative in browsing-enabled or tool-using LLM systems, and how to keep it from steering routing, tool arguments, or side effects.
Published: 2026-03-25
Connected apps expand the capability and authorization surface of LLM systems
Why app-connected and MCP-enabled LLM systems should be analyzed as capability, scope, approval, and side-effect control problems—not only as prompt-processing systems.
Published: 2026-03-30
The attack surface is the orchestration loop, not the model
How multi-step orchestration (controller) loops change the threat model in tool-using systems, and where to enforce separation, authorization, validation, and budgets to reduce prompt injection, tool misuse, unsafe writes, and unbounded consumption.
Published: 2026-02-22
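The controls named in that summary (authorization outside the model, write-path gating, step budgets) can be sketched minimally. This is an illustrative sketch only, not any framework's API; every name here (`run_loop`, `allow`, `WRITE_TOOLS`, the scope strings) is a hypothetical placeholder:

```python
# Hypothetical sketch of a bounded orchestration (controller) loop.
# The model proposes actions; policy and budgets are enforced outside it.

MAX_STEPS = 8                                   # hard budget on controller iterations
WRITE_TOOLS = {"send_email", "update_record"}   # side-effecting (write-path) tools

def allow(tool: str, args: dict, caller_scopes: set[str]) -> bool:
    """Authorization check enforced outside the model: write-path gating by scope."""
    if tool in WRITE_TOOLS and "write" not in caller_scopes:
        return False
    return True

def run_loop(plan_next_step, execute_tool, caller_scopes: set[str]):
    history = []
    for step in range(MAX_STEPS):            # budget: no unbounded consumption
        action = plan_next_step(history)     # model proposes; it does not authorize
        if action is None:                   # model signals completion
            break
        if not allow(action["tool"], action["args"], caller_scopes):
            history.append({"step": step, "error": "denied by policy"})
            continue                         # denial is recorded, not silently retried
        result = execute_tool(action["tool"], action["args"])
        history.append({"step": step, "tool": action["tool"], "result": result})
    return history
```

The point of the sketch is the placement of checks: the budget bounds the loop itself, and the `allow` decision sits between the model's proposal and the tool execution, so a prompt-injected proposal cannot reach a write-path tool on its own authority.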
Control-Plane Failure Patterns in Tool-Using LLM Systems
Two vendor-agnostic control-plane failure patterns—privilege persistence across interaction boundaries and non-enforcing integrity signals—that allow untrusted state to steer tool execution across steps.
Published: 2026-02-22
Prompt Assembly Policy Enforcement: Typed Provenance to Prevent Authority Confusion
Prevent authority confusion in prompt assembly by enforcing typed provenance separation between authoritative policy and untrusted content at ingress.
Published: 2026-02-22
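The typed-provenance idea that summary names can be sketched in a few lines. This is an assumption-laden illustration, not the article's implementation; `Provenance`, `Segment`, and `assemble` are hypothetical names:

```python
# Illustrative sketch: each prompt segment carries a provenance type, and the
# assembler rejects untrusted provenance in authoritative (policy) slots at ingress.
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    SYSTEM = "system"        # operator-authored policy
    USER = "user"            # end-user input
    RETRIEVED = "retrieved"  # web/tool output: never authoritative

@dataclass(frozen=True)
class Segment:
    role: str                # slot in the assembled prompt, e.g. "policy" or "content"
    provenance: Provenance
    text: str

def assemble(segments: list[Segment]) -> str:
    """Prevent authority confusion: only SYSTEM provenance may occupy the policy slot."""
    for s in segments:
        if s.role == "policy" and s.provenance is not Provenance.SYSTEM:
            raise ValueError(f"untrusted provenance {s.provenance.value} in policy slot")
    return "\n\n".join(f"[{s.role}:{s.provenance.value}]\n{s.text}" for s in segments)
```

The check runs where segments enter the prompt assembler, so retrieved content that says "ignore previous instructions" is still assembled, but only into a non-authoritative slot; it can never be typed as policy.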
Request Assembly Threat Model (Author-Mapped): Reading the “ChatGPT Request Assembly Architecture” Diagram
A reviewer-oriented explanation of the request path (S1–S5), context sources, and R1–R8 checkpoints in an author-mapped request-assembly model.
Published: 2026-02-22
Security report (client-captured): control-plane assurance failures at the LLM boundary
A client-side security report on privileged state and actions confirmed only via text, without verifiable signed audit artifacts; backend state changes were not verified.
Published: 2026-02-22
Social engineering in AI systems: attacking the decision pipeline (not just people)
Threat model of social engineering against AI decision pipelines; maps prompt injection to enforcement controls outside the model (PDP/PEP, validation, budgets).
Published: 2026-02-22

Section resources

Context, reusable contracts, related links, and external baselines for this topic.

About this section

Scope

  • Focus: security properties of LLM-powered agentic applications (orchestration/workflows, routing/selection, policy enforcement, session boundaries & context isolation, tool invocation, write-path enforcement).
  • Output style: engineering-oriented; emphasis on testable claims, explicit system boundaries, and mitigation guidance.
  • Public-safe disclosure: some writeups omit PoC strings and raw evidence artifacts; request private evidence under coordinated disclosure when required.

Non-goals (out of scope for this section)

  • General application security guidance that is not specific to agentic applications and orchestration/control-flow.
  • Model-training security or claims about mechanism-level cognition.

Reusable contracts

Mapped procedures and policies

External baselines
