Agent security
Engineering writeups on security properties of LLM-powered agentic applications (tool-using agents): trust boundaries, authorization and access control, orchestration control flow, and monitoring/policy enforcement.
Core articles
Agentic Systems: 8 Trust-Boundary Audit Checkpoints
A practical audit checklist of 8 trust checkpoints where untrusted artifacts can steer routing, tool use, and write-path actions in chained LLM systems.
The attack surface starts before agents: the LLM integration trust boundary
Why agent-layer threat modeling is incomplete: the first high-leverage control point is the LLM integration trust boundary (before agent frameworks exist).
Web-retrieved content is a prompt-injection boundary in tool-using LLM systems
Why retrieved web content must stay non-authoritative in browsing-enabled or tool-using LLM systems, and how to keep it from steering routing, tool arguments, or side effects.
Connected apps expand the capability and authorization surface of LLM systems
Why app-connected and MCP-enabled LLM systems should be analyzed as capability, scope, approval, and side-effect control problems—not only as prompt-processing systems.
The attack surface is the orchestration loop, not the model
How multi-step orchestration (controller) loops change the threat model in tool-using systems, and where to enforce separation, authorization, validation, and budgets to reduce prompt injection, tool misuse, unsafe writes, and unbounded consumption.
Control-Plane Failure Patterns in Tool-Using LLM Systems
Two vendor-agnostic control-plane failure patterns—privilege persistence across interaction boundaries and non-enforcing integrity signals—that allow untrusted state to steer tool execution across steps.
Prompt Assembly Policy Enforcement: Typed Provenance to Prevent Authority Confusion
Prevent authority confusion in prompt assembly by enforcing typed provenance separation between authoritative policy and untrusted content at ingress.
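The typed-provenance idea above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the segment types, the wrapper format, and the "policy must come first" rule are assumptions chosen to show the enforcement-at-ingress pattern.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Provenance(Enum):
    SYSTEM_POLICY = auto()   # authoritative, operator-controlled
    USER = auto()            # end-user input
    RETRIEVED = auto()       # untrusted web/tool content

@dataclass(frozen=True)
class Segment:
    text: str
    provenance: Provenance

def assemble_prompt(segments: list[Segment]) -> str:
    """Enforce provenance rules at ingress, before any model call."""
    policy = [s for s in segments if s.provenance is Provenance.SYSTEM_POLICY]
    # Exactly one authoritative policy segment, and it must lead, so
    # untrusted text can never precede (and thus reframe) the policy.
    if len(policy) != 1 or segments[0].provenance is not Provenance.SYSTEM_POLICY:
        raise ValueError("prompt assembly: policy segment missing or misplaced")
    parts = []
    for s in segments:
        if s.provenance is Provenance.SYSTEM_POLICY:
            parts.append(s.text)
        else:
            # Untrusted segments are wrapped with a marker recording their
            # origin; downstream templates must treat wrapped text as
            # data, never as instructions.
            parts.append(f"<untrusted source={s.provenance.name}>\n{s.text}\n</untrusted>")
    return "\n".join(parts)
```

The point of the sketch is that authority is a property of the type, assigned where content enters the system, rather than something inferred from the content itself.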
Request Assembly Threat Model (Author-Mapped): Reading the “ChatGPT Request Assembly Architecture” Diagram
A reviewer-oriented explanation of the request path (S1–S5), context sources, and R1–R8 checkpoints in an author-mapped request-assembly model.
Security report (client-captured): control-plane assurance failures at the LLM boundary
Client-only security report on text-only confirmations of privileged state/actions without verifiable signed audit artifacts; backend state changes not verified.
Social engineering in AI systems: attacking the decision pipeline (not just people)
Threat model of social engineering against AI decision pipelines; maps prompt injection to enforcement controls outside the model (PDP/PEP, validation, budgets).
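The PDP/PEP split mentioned above can be sketched briefly. This is an illustrative sketch only; the class names, the allowlist, and the write-budget rule are assumptions standing in for whatever policy the article actually maps, but the shape is the point: the decision logic lives outside the model, and the enforcement point is the only code path that can invoke a tool.

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str
    writes: bool  # does this call have side effects?

@dataclass
class Decision:
    allow: bool
    reason: str

class PolicyDecisionPoint:
    """PDP: evaluates policy (allowlist + write budget); holds no tool code."""
    def __init__(self, allowed_tools: set[str], write_budget: int):
        self.allowed_tools = allowed_tools
        self.write_budget = write_budget

    def decide(self, call: ToolCall) -> Decision:
        if call.tool not in self.allowed_tools:
            return Decision(False, f"tool {call.tool!r} not in allowlist")
        if call.writes:
            if self.write_budget <= 0:
                return Decision(False, "write budget exhausted")
            self.write_budget -= 1
        return Decision(True, "ok")

def enforce(pdp: PolicyDecisionPoint, call: ToolCall, execute):
    """PEP: the sole path to tool execution; model output cannot bypass it."""
    decision = pdp.decide(call)
    if not decision.allow:
        raise PermissionError(decision.reason)
    return execute(call)
```

Because the model only ever proposes a `ToolCall` and never reaches `execute` directly, a successful injection can at worst request an action the PDP will refuse.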
Section resources
Context, reusable contracts, related links, and external baselines for this topic.
About this section
Scope
- Focus: security properties of LLM-powered agentic applications (orchestration/workflows, routing/selection, policy enforcement, session boundaries & context isolation, tool invocation, write-path enforcement).
- Output style: engineering-oriented; emphasis on testable claims, explicit system boundaries, and mitigation guidance.
- Public-safe disclosure: some writeups omit PoC strings and raw evidence artifacts; request private evidence under coordinated disclosure when required.
Non-goals (out of scope for this section)
- General application security guidance that is not specific to agentic applications and orchestration/control-flow.
- Model-training security or claims about mechanism-level cognition.
Reusable contracts
Mapped procedures and policies
- Choose allowed sources for factual answers
  Pick a facts-only boundary (allowed sources + refusal contract).
- Web Verification & Citations Policy
  When you cite web sources, enforce verification and citation rules.
- Security report (client-captured): control-plane assurance failures at the LLM boundary
  Client-observed artifacts vs. claims requiring server-side confirmation (explicitly labeled).
- Run the engineering quality gate — procedure
  Use the engineering quality gate for structural/code correctness (not writing verification).