AI Agent Security Articles

Technical articles on LLM agent security, trust boundaries, prompt injection, authorization, tool-use controls, and orchestration risks.

Core pages

Gmail and WhatsApp Agents as Private-Message Execution Surfaces

A technical analysis of the security risks created when AI agents can read, interpret, route, or act on Gmail, WhatsApp, private message threads, attachments, links, and communication workflows.

Published

Connected Apps and MCP Security in LLM Systems

Security analysis of connected apps, external tools, and remote MCP servers as capability, scope, approval, disclosure, and side-effect control surfaces.

Published

Web Retrieval Prompt Injection Boundary in LLM Systems

A threat model for browsing-enabled and tool-using LLM systems where retrieved web content can steer routing, tool arguments, follow-up calls, or side effects.

Published

LLM Boundary Assurance Failures: Client-Captured Security Report

Client-only security report on text-only confirmations of privileged state or actions that lack verifiable, signed audit artifacts. Backend state changes were not verified.

Published

AI Agent Orchestration Loop Attack Surface

How multi-step orchestration (controller) loops change the threat model in tool-using systems, and where to enforce separation, authorization, validation, and budgets to reduce prompt injection, tool misuse, unsafe writes, and unbounded consumption.

Published

Prompt Assembly Policy Enforcement for LLM Systems

An engineering guide to preventing authority confusion in prompt assembly by separating authoritative policy from untrusted content with typed provenance.

Published

Social Engineering in AI Systems and Decision Pipelines

A threat model of social engineering against AI decision pipelines; maps prompt injection to enforcement controls outside the model (policy decision and enforcement points (PDP/PEP), validation, budgets).

Published

LLM Integration Trust Boundary Before AI Agents

Why agent-layer threat modeling is incomplete: the first high-leverage control point is the LLM integration trust boundary, which exists before any agent framework is introduced.

Published

Request Assembly Threat Model for AI Agents

A reviewer-oriented threat model for request assembly in AI assistants: what enters context, what gets prioritized or dropped, and where policy, tool, memory, retrieval, and audit checkpoints should be reviewed.

Published

Control-Plane Failure Patterns in Tool-Using LLM Systems

Two vendor-agnostic control-plane failure patterns (privilege persistence across interaction boundaries and non-enforcing integrity signals) that allow untrusted state to steer tool execution across steps.

Published

Section resources

Context, reusable contracts, related links, and external baselines for this topic.

About this section

Scope

  • Focus: security properties of LLM-powered agentic applications (orchestration/workflows, routing/selection, policy enforcement, session boundaries & context isolation, tool invocation, write-path enforcement).
  • Output style: engineering-oriented; emphasis on testable claims, explicit system boundaries, and mitigation guidance.
  • Public-safe disclosure: some writeups omit PoC strings and raw evidence artifacts; request private evidence under coordinated disclosure when required.

Non-goals (out of scope for this section)

  • General application security guidance that is not specific to agentic applications and their orchestration or control flow.
  • Model-training security or claims about mechanism-level cognition.
Reusable contracts

Mapped procedures and policies

External baselines