Agent security
Engineering writeups on security properties of LLM-powered agentic applications (tool-using agents): trust boundaries, authorization & access control, orchestration (control-flow mechanisms), and monitoring/policy enforcement.
Start here (choose a goal)
Agentic Systems: 8 Trust-Boundary Audit Checkpoints
A concrete audit checklist for agent workflows (ingress → retrieval → tool calls → write paths → egress).
Control-Plane Failure Patterns in Tool-Using LLM Systems
Failure patterns for orchestrators, routing components, and policy enforcement points.
Pages in this section
Core pages
The attack surface is the orchestration loop, not the model
How multi-step orchestration (controller) loops change the threat model in tool-using systems, and where to enforce separation, authorization, validation, and budgets to reduce prompt injection, tool misuse, unsafe writes, and unbounded consumption.
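The budget enforcement mentioned above can be sketched as a loop-level control. This is a hypothetical illustration, not code from the linked page: all names (`Budget`, `run_agent`, the step/tool-call caps) are invented for the sketch, and the key point is that the caps are charged by the orchestrator, outside the model.

```python
from dataclasses import dataclass

@dataclass
class Budget:
    """Hard caps charged by the orchestrator, not the model."""
    max_steps: int = 8
    max_tool_calls: int = 5
    steps: int = 0
    tool_calls: int = 0

    def charge_step(self) -> None:
        self.steps += 1
        if self.steps > self.max_steps:
            raise RuntimeError("step budget exhausted")

    def charge_tool(self) -> None:
        self.tool_calls += 1
        if self.tool_calls > self.max_tool_calls:
            raise RuntimeError("tool-call budget exhausted")

def run_agent(plan, budget: Budget):
    """Drive a tool-using loop; `plan` yields ('tool', name) or ('final', text).

    Every iteration and every tool invocation is charged against the budget,
    so a model that keeps proposing actions cannot cause unbounded consumption.
    """
    for kind, payload in plan:
        budget.charge_step()
        if kind == "tool":
            budget.charge_tool()  # enforcement point sits before the tool runs
        elif kind == "final":
            return payload
    raise RuntimeError("plan ended without a final answer")
```

A runaway plan trips the step cap regardless of what the model emits, which is the property the writeup argues must hold outside the model.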
The attack surface starts before agents: the LLM integration trust boundary
Why agent-layer threat modeling alone is incomplete: the first high-leverage control point is the LLM integration trust boundary, which exists before any agent framework is involved.
Prompt Assembly Policy Enforcement: Typed Provenance to Prevent Authority Confusion
Prevent authority confusion in prompt assembly by enforcing typed provenance separation between authoritative policy and untrusted content at ingress.
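Typed provenance separation can be illustrated with a minimal sketch, assuming a two-class model (authoritative policy vs untrusted content); the class and function names here are hypothetical, not the page's actual API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    text: str

class Policy(Segment):     # authoritative: operator-controlled instructions
    pass

class Untrusted(Segment):  # user input, retrieved docs, tool output
    pass

def assemble_prompt(policy: Segment, context: list) -> str:
    """Reject authority confusion at assembly time via provenance types."""
    if not isinstance(policy, Policy):
        raise TypeError("authoritative slot requires Policy provenance")
    body = []
    for seg in context:
        if isinstance(seg, Policy):
            raise TypeError("untrusted slot must not carry Policy provenance")
        # Untrusted content is fenced, never merged into the policy text.
        body.append(f"<untrusted>{seg.text}</untrusted>")
    return policy.text + "\n" + "\n".join(body)
```

The type check makes "untrusted text placed in an authoritative slot" a hard error at ingress rather than a behavior the model is asked to resist.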
Request Assembly Threat Model (Author-Mapped): Reading the “ChatGPT Request Assembly Architecture” Diagram
A reviewer-oriented explanation of the request path (S1–S5), context sources, and R1–R8 checkpoints in an author-mapped request-assembly model.
Security report (client-captured): control-plane assurance failures at the LLM boundary
Client-captured security report: text-only confirmations of privileged state and actions issued without verifiable, signed audit artifacts; backend state changes were not verified.
Social engineering in AI systems: attacking the decision pipeline (not just people)
Threat model of social engineering against AI decision pipelines; maps prompt injection to enforcement controls outside the model (PDP/PEP, validation, budgets).
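The PDP/PEP mapping can be sketched in a few lines, under the assumption of a static role-to-tool allow list; the rule set and function names are illustrative only.

```python
# Policy decision point (PDP): pure decision, no side effects.
# Policy enforcement point (PEP): sits between the model's proposed tool
# call and the tool itself, so injected instructions cannot widen authority.
ALLOW = {
    ("reader", "search"), ("reader", "fetch_doc"),
    ("editor", "search"), ("editor", "write_doc"),
}

def pdp_decide(role: str, tool: str) -> bool:
    return (role, tool) in ALLOW

def pep_invoke(role: str, tool: str, args: dict, tools: dict):
    if not pdp_decide(role, tool):   # enforced outside the model
        raise PermissionError(f"{role} may not call {tool}")
    return tools[tool](**args)
```

Because the decision consults only the caller's role and the requested tool, prompt-injected text in the model's output can propose a write, but the PEP refuses it for a read-only role.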
Section resources
About this section
Scope
- Focus: security properties of LLM-powered agentic applications (orchestration/workflows, routing/selection, policy enforcement, session boundaries & context isolation, tool invocation, write-path enforcement).
- Output style: engineering-oriented; emphasis on testable claims, explicit system boundaries, and mitigation guidance.
- Public-safe disclosure: some writeups omit PoC strings and raw evidence artifacts; request private evidence under coordinated disclosure when required.
Non-goals (out of scope for this section)
- General application security guidance that is not specific to agentic applications and orchestration/control-flow.
- Model-training security or claims about mechanism-level cognition.
Reuse (contracts)
Choose allowed sources for factual answers
Pick a facts-only boundary (allowed sources + refusal contract).
Web Verification & Citations Policy
When you cite web sources, enforce verification + citation rules.
Security report (client-captured): control-plane assurance failures at the LLM boundary
Client-observed artifacts vs claims requiring server-side confirmation (explicitly labeled).
Run the engineering quality gate — procedure
Use the engineering quality gate for structural/code correctness (not writing verification).