Agent security

Engineering writeups on the security properties of LLM-powered agentic applications (tool-using agents): trust boundaries, authorization and access control, orchestration (control-flow mechanisms), and monitoring/policy enforcement.

Pages in this section

Core pages

The attack surface is the orchestration loop, not the model
How multi-step orchestration (controller) loops change the threat model in tool-using systems, and where to enforce separation, authorization, validation, and budgets to reduce prompt injection, tool misuse, unsafe writes, and unbounded consumption.
The attack surface starts before agents: the LLM integration trust boundary
Why agent-layer threat modeling alone is incomplete: the first high-leverage control point is the LLM integration trust boundary, which exists before any agent framework is introduced.
Prompt Assembly Policy Enforcement: Typed Provenance to Prevent Authority Confusion
Prevent authority confusion in prompt assembly by enforcing typed provenance separation between authoritative policy and untrusted content at ingress.
Request Assembly Threat Model (Author-Mapped): Reading the “ChatGPT Request Assembly Architecture” Diagram
A reviewer-oriented explanation of the request path (S1–S5), context sources, and R1–R8 checkpoints in an author-mapped request-assembly model.
Security report (client-captured): control-plane assurance failures at the LLM boundary
Client-side security report on text-only confirmations of privileged state/actions issued without verifiable, signed audit artifacts; backend state changes were not verified.
Social engineering in AI systems: attacking the decision pipeline (not just people)
Threat model of social engineering against AI decision pipelines; maps prompt injection to enforcement controls outside the model (PDP/PEP, validation, budgets).
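A recurring pattern across these pages is enforcing authorization outside the model: tag each prompt segment with a provenance label at ingress, then gate tool calls at a policy enforcement point (PEP) using provenance and budgets rather than model output alone. A minimal sketch of that pattern, with hypothetical names (`Provenance`, `ToolPEP`, `send_email`) chosen for illustration, not taken from any of the writeups:

```python
from dataclasses import dataclass
from enum import Enum, auto

# Hypothetical provenance labels: where a prompt segment originated.
class Provenance(Enum):
    SYSTEM_POLICY = auto()   # authoritative, operator-controlled
    USER = auto()            # end-user input
    TOOL_OUTPUT = auto()     # untrusted retrieved/returned content

@dataclass(frozen=True)
class Segment:
    text: str
    provenance: Provenance

# Hypothetical policy enforcement point: authorizes tool calls outside
# the model, based on provenance and a simple per-session call budget.
class ToolPEP:
    def __init__(self, writable_tools: set[str], max_calls: int):
        self.writable_tools = writable_tools  # tools with side effects
        self.max_calls = max_calls            # consumption budget
        self.calls = 0

    def authorize(self, tool: str, requested_by: Provenance) -> bool:
        if self.calls >= self.max_calls:
            return False  # budget exhausted: deny regardless of source
        if tool in self.writable_tools and requested_by is Provenance.TOOL_OUTPUT:
            return False  # untrusted content must not drive write-path tools
        self.calls += 1
        return True

pep = ToolPEP(writable_tools={"send_email"}, max_calls=3)
print(pep.authorize("send_email", Provenance.USER))         # True
print(pep.authorize("send_email", Provenance.TOOL_OUTPUT))  # False
```

This is a sketch under stated assumptions, not an implementation from the linked pages; real deployments would carry provenance through prompt assembly (the typed-provenance writeup) and evaluate richer policy at a PDP before the PEP acts (the social-engineering writeup).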

Section resources

About this section

Scope

  • Focus: security properties of LLM-powered agentic applications (orchestration/workflows, routing/selection, policy enforcement, session boundaries & context isolation, tool invocation, write-path enforcement).
  • Output style: engineering-oriented; emphasis on testable claims, explicit system boundaries, and mitigation guidance.
  • Public-safe disclosure: some writeups omit PoC strings and raw evidence artifacts; request private evidence under coordinated disclosure when required.

Non-goals (out of scope for this section)

  • General application security guidance that is not specific to agentic applications and orchestration/control-flow.
  • Model-training security or claims about mechanism-level cognition.
Reuse (contracts)
Choose allowed sources for factual answers
Pick a facts-only boundary (allowed sources + refusal contract).
Web Verification & Citations Policy
When you cite web sources, enforce verification + citation rules.
Security report (client-captured): control-plane assurance failures at the LLM boundary
Client-observed artifacts vs claims requiring server-side confirmation (explicitly labeled).
Run the engineering quality gate — procedure
Use the engineering quality gate for structural/code correctness (not writing verification).
External baselines (shared terminology)
