Agent security
Engineering writeups on security properties of LLM-powered agentic applications (tool-using agents): trust boundaries, authorization and access control, orchestration control flow, and monitoring/policy enforcement.
Core articles
Agentic Systems: 8 Trust-Boundary Audit Checkpoints
Members only
A member article for reviewers who need a structured way to assess where untrusted content can influence chained LLM systems.
The attack surface starts before agents: the LLM integration trust boundary
Why agent-layer threat modeling is incomplete: the first high-leverage control point is the LLM integration trust boundary, which exists before any agent framework does.
Web-retrieved content is a prompt-injection boundary in tool-using LLM systems
Why retrieved web content must stay non-authoritative in browsing-enabled or tool-using LLM systems, and how to keep it from steering routing, tool arguments, or side effects.
Connected apps expand the capability and authorization surface of LLM systems
Why app-connected and MCP-enabled LLM systems should be analyzed as capability, scope, approval, and side-effect control problems—not only as prompt-processing systems.
The attack surface is the orchestration loop, not the model
How multi-step orchestration (controller) loops change the threat model in tool-using systems, and where to enforce separation, authorization, validation, and budgets to reduce prompt injection, tool misuse, unsafe writes, and unbounded consumption.
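The separation, authorization, and budget controls this article argues for can be illustrated with a minimal controller-loop sketch. All names here (`Step`, `Budget`, the tool allowlist, and the `call_model`/`execute_tool` hooks) are illustrative assumptions, not APIs from the article or any specific framework; the point is only that the loop, not the model, enforces the limits.

```python
from dataclasses import dataclass

@dataclass
class Step:
    kind: str            # "tool" or "final"
    tool: str = ""
    text: str = ""

@dataclass
class Budget:
    max_steps: int = 8   # hard cap on loop iterations
    max_writes: int = 1  # hard cap on side-effecting tool calls
    steps: int = 0
    writes: int = 0

# Hypothetical allowlist: maps each permitted tool to its effect class.
ALLOWED_TOOLS = {"search": "read", "fetch_doc": "read", "file_ticket": "write"}

def run_loop(task, call_model, execute_tool, budget=None):
    """Controller loop enforcing authorization and budgets outside the model."""
    budget = budget or Budget()
    history = [task]
    while True:
        budget.steps += 1
        if budget.steps > budget.max_steps:
            raise RuntimeError("step budget exhausted")   # fail closed
        step = call_model(history)                        # the model only proposes
        if step.kind == "final":
            return step.text
        if step.tool not in ALLOWED_TOOLS:                # authorization check
            raise PermissionError(f"tool not allowed: {step.tool}")
        if ALLOWED_TOOLS[step.tool] == "write":
            budget.writes += 1
            if budget.writes > budget.max_writes:
                raise RuntimeError("write budget exhausted")
        history.append(execute_tool(step))   # tool output is data, not authority
```

The design choice worth noting: the model's proposal is untrusted input to the loop, so every cap and allowlist check runs deterministically before any side effect.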
Control-Plane Failure Patterns in Tool-Using LLM Systems
Members only
A member article for reviewers who need a structured way to assess how control-plane weaknesses can let untrusted state influence tool-using LLM systems across steps.
Prompt Assembly Policy Enforcement: Typed Provenance to Prevent Authority Confusion
Prevent authority confusion in prompt assembly by enforcing typed provenance separation between authoritative policy and untrusted content at ingress.
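The typed-provenance idea can be sketched in a few lines: every prompt segment carries a provenance tag at ingress, and the message role is derived from that tag, so untrusted text can never claim system authority. The `Provenance`/`Segment` names and the message shape are assumptions for illustration, not the article's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    POLICY = "policy"        # authoritative: operator-controlled
    USER = "user"            # semi-trusted: end-user input
    RETRIEVED = "retrieved"  # untrusted: web or tool output

@dataclass(frozen=True)
class Segment:
    text: str
    provenance: Provenance

def assemble_prompt(segments):
    """Derive each role from provenance; callers cannot assign authority."""
    messages = []
    for seg in segments:
        role = "system" if seg.provenance is Provenance.POLICY else "user"
        messages.append({"role": role, "content": seg.text,
                         "provenance": seg.provenance.value})
    # Ingress invariant: exactly the POLICY segments hold the system role.
    assert all((m["role"] == "system") == (m["provenance"] == "policy")
               for m in messages)
    return messages
```

Usage: a retrieved segment containing "Ignore previous instructions" is assembled as quoted user-role data, never as a system message, regardless of what its text says.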
Request Assembly Threat Model (Author-Mapped): Reading the “ChatGPT Request Assembly Architecture” Diagram
Members only
A member article for reviewers who need a structured way to examine how context, authorization, tools, and state interact across request assembly.
Security report (client-captured): control-plane assurance failures at the LLM boundary
Client-only security report on text-only confirmations of privileged state and actions issued without verifiable signed audit artifacts; backend state changes were not independently verified.
Social engineering in AI systems: attacking the decision pipeline (not just people)
Threat model of social engineering against AI decision pipelines; maps prompt injection to enforcement controls outside the model (PDP/PEP, validation, budgets).
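The PDP/PEP pattern referenced above can be shown in a compact sketch: a policy decision point evaluates a requested tool action against rules the model cannot edit, and the enforcement point executes only on an explicit allow. The rule schema and names here are illustrative assumptions, not a real policy engine.

```python
def pdp_decide(principal, action, resource, rules):
    """Policy decision point: first matching rule wins, default-deny otherwise."""
    for rule in rules:
        if (rule["principal"] == principal and rule["action"] == action
                and resource.startswith(rule["resource_prefix"])):
            return rule["effect"]
    return "deny"

def pep_execute(principal, action, resource, rules, do_action):
    """Policy enforcement point: runs the action only on an explicit allow."""
    if pdp_decide(principal, action, resource, rules) != "allow":
        raise PermissionError(f"{action} on {resource} denied for {principal}")
    return do_action()

# Hypothetical rule set: reads under docs/ are allowed; writes have no rule,
# so they fall through to the default-deny.
RULES = [
    {"principal": "agent", "action": "read",
     "resource_prefix": "docs/", "effect": "allow"},
]
```

Because the decision lives outside the prompt, injected text can at most change what the model requests, not what the enforcement point permits.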
Section resources
Context, reusable contracts, related links, and external baselines for this topic.
About this section
Scope
- Focus: security properties of LLM-powered agentic applications (orchestration/workflows, routing/selection, policy enforcement, session boundaries & context isolation, tool invocation, write-path enforcement).
- Output style: engineering-oriented; emphasis on testable claims, explicit system boundaries, and mitigation guidance.
- Public-safe disclosure: some writeups omit PoC strings and raw evidence artifacts; request private evidence under coordinated disclosure when required.
Non-goals (out of scope for this section)
- General application security guidance that is not specific to agentic applications and orchestration/control-flow.
- Model-training security or claims about mechanism-level cognition.
Reusable contracts
Mapped procedures and policies
- Choose allowed sources for factual answers. Pick a facts-only boundary (allowed sources + refusal contract).
- Web Verification & Citations Policy. When you cite web sources, enforce verification and citation rules.
- Security report (client-captured): control-plane assurance failures at the LLM boundary. Client-observed artifacts vs. claims requiring server-side confirmation (explicitly labeled).
- Run the engineering quality gate (procedure). Use the engineering quality gate for structural/code correctness (not writing verification).