Request Assembly Threat Model (Author-Mapped): Reading the “ChatGPT Request Assembly Architecture” Diagram

By Tamar Peretz Published

Scope & provenance (read first):
This page explains an author-mapped reference model. It is not a vendor-published internal architecture diagram.
The diagram should be treated as a review map (a stable mental model) for security analysis—not as a claim about any specific product’s internal placement.

Canonical diagram (SSOT):
See: /reference/diagrams/chatgpt-request-assembly-architecture/

Author-mapped request assembly threat model diagram showing S1–S5 request path, context inputs hub, planes (memory, retrieval/caching, tools, observability), optional persistence loops (L1–L3), and risk checkpoints (R1–R8).
Author-mapped request assembly threat model (detailed). Not a vendor internal placement diagram.

Why this model exists (for security review)

This model is a reviewer-oriented scaffold for assessing end-to-end request handling beyond the inference step. It focuses on the control points where a system:

The goal is to make these control points explicit, reviewable, and easy to audit against policy and authorization boundaries.

Evidence boundary (what is and isn’t being claimed)


At a glance (what reviewers can do with this page)

Use this page to:

This is written for security review and threat modeling, not for end-user onboarding or product marketing.

1) Confirm the variant (pre-assembly retrieval, tool-mediated retrieval, hybrid).
2) Enumerate all context inputs that can reach S3 (including session state, caches, and profile/prefs).
3) Locate binding points: where identity and authorization are bound to the request and to each tool invocation.
4) Evaluate S3 determinism: ordering rules, truncation rules, and “must-include” constraints (fail-closed vs best-effort).
5) Enumerate tools/apps/connectors: scopes, allowlists, approval policy, and argument validation.
6) Audit observability: what content is emitted (inputs/outputs/tool I/O), retention, access control, and redaction.

If you can’t obtain evidence for a step, record it explicitly as a review gap.

Artifacts to request (audit-ready evidence)

Request equivalents of the following (names vary by system):

This list is intentionally system-agnostic; treat it as a checklist for “show me the proof” during review.

Diagram recap (S1–S5)

S1 — User Input

Where untrusted instructions enter. This includes plain text, uploaded documents, and any user-controlled content that can influence downstream selection/routing.

S2 — Gateway / Policy Enforcement

A control-plane step that can enforce:

Reviewer focus: is S2 a real enforcement point (fail-closed), or mostly an annotation layer?

S3 — Request Assembly / Context Selector

A packaging step that (conceptually):

Reviewer focus: S3 is where “truth” can be lost (truncation) and where subtle bias can be introduced (ordering/precedence).

Context Inputs Hub (conceptual merge point)

In the diagram, the Context Inputs Hub represents a conceptual merge point: multiple eligible sources (memory, retrieval, cache/replay, session state, and optionally tool results) can be combined before the system performs request assembly (S3).

Reviewer focus: confirm which sources can reach the hub, what eligibility rules apply, and whether any source can influence ordering/truncation outcomes in S3.

S4 — LLM Inference

Model generates outputs and may decide to call tools (depending on your system design).

S5 — Answer

The user-visible output. Depending on system design, post-processing may occur here (filters, policy checks, formatting, safety transforms). Reviewers should confirm whether post-processing exists, what it modifies, and whether it is fail-closed.


Planes (what can influence S3 and what can happen after S4)

Memory plane (green)

For review purposes, treat memory-related inputs as two distinct mechanisms:

The diagram also breaks out memory-adjacent inputs that often behave like “memory” in practice:

Reviewer focus: which of these are treated as data vs control-plane (i.e., do they change tool permissions, system prompts, routing rules)?

Retrieval / caching (blue)

Reviewer focus: what is “retrieval eligibility” and how is provenance tracked (source, freshness, access control)?

Execution / tools (purple)

In the diagram, tool execution is triggered from S4 via a Tool Router / Function Calling stage, which can invoke Apps/Connectors / External Tools.

Chaining loop (important): the diagram models chaining as a multi-step loop that can return to S3 (re-assemble) before the next inference step. In other words: tool outputs and intermediate state can cause the system to re-run assembly (selection/ordering/truncation) for the next turn in the loop.

Tool-mediated retrieval variant: retrieval can also occur as a tool action after S4, with results returning through the tool router and (optionally) re-entering context via the hub.

Reviewer focus:

Scope note: treat apps/connectors and MCP-based integrations (Model Context Protocol; MCP) as a tool-access boundary. Reviewers should confirm:

Terminology note: Model Context Protocol (MCP) is an open specification for connecting LLM clients to external tools and resources.

Streaming / observability (orange)

In the diagram, observability is modeled as:

Reviewer focus: whether prompts, selected context, retrieved docs, tool I/O, and user data can leak into telemetry; who can read it; retention; and redaction before persistence/fan-out.


Optional persistence loops (implementation-dependent)

The diagram includes three optional persistence loops. These are implementation-dependent and must be validated in the target system before being treated as present:

Reviewer focus: if any loop exists, require explicit policy, authorization binding, and auditability for that loop.

Variants (explicitly not assumed)

This model supports multiple real-world implementations. Reviewers should confirm which variant applies:

1) Pre-assembly retrieval: retrieval happens before S3 and is merged via the Context Inputs Hub into assembly inputs.
2) Tool-mediated retrieval: retrieval happens after S4 as a tool action; results return through the tool router and may optionally re-enter assembly inputs via the hub.
3) Hybrid: both exist, with different policy controls and potentially different authorization boundaries.

This page does not assume which variant is used; the diagram is a review scaffold.


The R1–R8 risk checkpoints (reviewer-oriented)

The red boxes are analysis anchors, not claims about a vendor implementation.

Each checkpoint is written in a consistent audit template:

R1 — Context Injection (prompt-level steering)

Failure mode: Untrusted input changes behavior, routing, or policy interpretation.

Evidence to request:

Controls to verify:

R2 — Memory Poisoning (long-term memory write)

Failure mode: Untrusted content becomes persistent state and re-enters later sessions.

Evidence to request:

Controls to verify:

(Feature context: OpenAI distinguishes saved memories vs chat history in its Memory documentation.)

R3 — Retrieval Poisoning (RAG corpus / embeddings)

Failure mode: Retrieval returns attacker-controlled or low-integrity content as trusted context.

Evidence to request:

Controls to verify:

R4 — Assembly Manipulation (ordering + truncation bias)

Failure mode: Selection/ordering/truncation changes effective constraints or deletes critical context.

Evidence to request:

Controls to verify:

R5 — Replay / Cache Confusion (stale/wrong session state)

Failure mode: Wrong cached context is served, causing disclosure or authorization confusion.

Evidence to request:

Controls to verify:

R6 — Tool Hijack (tool selection + argument injection)

Failure mode: Model is induced to call unsafe tools or pass unsafe arguments.

Evidence to request:

Controls to verify:

(Feature context: OpenAI documents tools/function calling and connector/app surfaces; treat these as review surfaces, not internal placement claims.)

R7 — Stream/Log Exfiltration (prompts/outputs/tool I/O/user data)

Failure mode: Sensitive data leaks into logs/telemetry/event buses, widening exposure scope.ֿ

Evidence to request:

Controls to verify:

R8 — Profile/Preference Escalation (bio/prefs become control-plane)

Failure mode: Profile attributes/preferences change routing, tool access, or policy decisions beyond intended scope.

Evidence to request:

Controls to verify:


Minimal reviewer checklist (use with the diagram)

  1. Identify all context sources feeding S3; classify them by integrity and authorization.
  2. Confirm where authorization is bound and where it can be bypassed (S2/S3/tool/app/connector layer).
  3. Enumerate tools and apps/connectors: scopes, approval policy, and server-side enforcement.
  4. Review persistence: saved memory writes, retrieval feedback loops, cache/replay behavior, and retention.
  5. Audit observability: what is emitted (events/logs), who can read it, and how it is redacted.

Suggested reading

References (official feature docs; not internal placement claims)