Articles

Long-form, engineering-oriented technical articles covering threat models, agent and system architecture notes, evaluation methodology, and evidence/citation policy design. Use How-to for procedures and checklists; use Reference for stable lookup pages and canonical diagrams.

Latest articles

Newest published pages (auto-generated).

Why “Almost Human, But Not Quite” Feels Wrong: From Clowns to AI-Generated Images and Text
Two separable mechanisms behind the “something feels off” reaction: cue-level perceptual mismatch (uncanny/cue conflict) vs AI-label effects on credibility and sharing.
Published: 2026-02-25
Theory of mind in LLMs — what benchmarks test (and what they don’t)
Evidence-anchored overview of how ToM is defined in psychology, how it is operationalized for LLM evaluation, and what current results do and do not justify.
Published: 2026-02-22
Sycophancy in LLM Assistants: What It Is, How Training Creates It, and Why It Shows Up in Production
A technically grounded explanation of sycophancy (belief-agreement bias): what it is, what the evidence supports about prevalence, how preference optimization can produce it, and what changes in training and release practice reduce it.
Published: 2026-02-22
Prompt Engineering Guide for Daily Work (Deep Dive)
A deep dive into why prompts fail in daily work, how to design evidence-bounded prompt specifications (grounded outputs), and how to evaluate them.
Published: 2026-02-22
Orders of Intentionality and Recursive Mindreading: Definitions and Use in LLM Evaluation
A precise reference for nested mental-state attribution (“orders of intentionality” / “recursive mindreading”) and how these constructs are operationalized in evaluations of humans and LLMs—without implying mechanism-level Theory of Mind.
Published: 2026-02-22
LLM-Led vs Orchestrator-Led Tool Execution: Control-Plane Placement Tradeoffs
A control-plane placement comparison across reliability, observability, latency, cost governance, and security for tool-using LLM systems.
Published: 2026-02-22

Browse by topic

Each topic page includes a Start here guide (choose a goal), all pages in the section, and resources.

Start here (one per topic)

All articles

Grouped by topic; within each topic, sorted by publication date (newest first).

Agent security (8)
Social engineering in AI systems: attacking the decision pipeline (not just people)
Threat model of social engineering against AI decision pipelines; maps prompt injection to enforcement controls outside the model (PDP/PEP, validation, budgets).
Published: 2026-02-22
Security report (client-captured): control-plane assurance failures at the LLM boundary
Client-only security report on text-only confirmations of privileged state or actions that lack verifiable signed audit artifacts; backend state changes were not verified.
Published: 2026-02-22
Control-Plane Failure Patterns in Tool-Using LLM Systems
Two vendor-agnostic control-plane failure patterns—privilege persistence across interaction boundaries and non-enforcing integrity signals—that allow untrusted state to steer tool execution across steps.
Published: 2026-02-22
The attack surface is the orchestration loop, not the model
How multi-step orchestration (controller) loops change the threat model in tool-using systems, and where to enforce separation, authorization, validation, and budgets to reduce prompt injection, tool misuse, unsafe writes, and unbounded consumption.
Published: 2026-02-22
The attack surface starts before agents: the LLM integration trust boundary
Why agent-layer threat modeling is incomplete: the first high-leverage control point is the LLM integration trust boundary (before agent frameworks exist).
Published: 2026-02-22
Agentic Systems: 8 Trust-Boundary Audit Checkpoints
A practical audit checklist of 8 trust checkpoints where untrusted artifacts can steer routing, tool use, and write-path actions in chained LLM systems.
Published: 2026-02-22
Request Assembly Threat Model (Author-Mapped): Reading the “ChatGPT Request Assembly Architecture” Diagram
A reviewer-oriented explanation of the request path (S1–S5), context sources, and R1–R8 checkpoints in an author-mapped request-assembly model.
Published: 2026-02-22
Prompt Assembly Policy Enforcement: Typed Provenance to Prevent Authority Confusion
Prevent authority confusion in prompt assembly by enforcing typed provenance separation between authoritative policy and untrusted content at ingress.
Published: 2026-02-22
Agent architecture (3)
Model training and evaluation (5)
Prompt engineering (1)