ChatGPT request assembly architecture
What this diagram is
This is an author-mapped reference diagram that models how a chat request can be assembled and executed across:
- Policy/Gateway enforcement
- Request assembly / context selection
- LLM inference + tool execution
- Retrieval/caching inputs
- Memory inputs
- Streaming/observability outputs
Provenance note: This is not a vendor-published internal architecture diagram.
According to the author, the mapping was derived from external observation (including network-level signals and repeatable behavior). This page intentionally does not include a step-by-step methodology; treat the diagram as a reference model for analysis and review.
How to read it
1) The primary request path (numbered boxes)
- User Input
- Gateway / Policy — enforcement (allowlist), redaction, auth binding
- Request Assembly / Context Selector — pack + order + truncate
- LLM Inference
- Answer (with a feedback signal)
2) Planes (color legend)
- Execution / Tools (purple): tool routing, multi-step chaining, connectors/external tools
- Retrieval / Caching (blue): RAG/vector-store retrieval and cache/replay concepts
- Memory plane (green): long-term memory, profile/bio attributes, preferences, chat history, session history
- Streaming / Observability (orange): event stream + telemetry/log sink
3) Context Inputs Hub
The Context Inputs (Hub) is the conceptual merge point shown at the bottom: it marks where memory, retrieval, history, and other inputs are merged before request assembly.
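To make the primary path and the hub concrete, here is a minimal Python sketch of steps 1-5 fed from a merged set of context inputs. All names (ContextInput, gateway_policy, assemble_request, llm_inference) are invented for illustration and are not claims about any vendor's code.

```python
from dataclasses import dataclass

@dataclass
class ContextInput:
    """One candidate input arriving at the Context Inputs (Hub)."""
    source: str   # e.g. "long_term_memory", "rag", "chat_history"
    text: str
    priority: int = 0

def gateway_policy(user_input: str) -> str:
    """Step 1 -> 2: enforcement/redaction stub; detailed in the Gateway / Policy sketch below."""
    return user_input.replace("SECRET", "[REDACTED]")  # toy redaction stand-in

def assemble_request(user_input: str, hub: list[ContextInput], max_chars: int = 2000) -> str:
    """Step 2 -> 3: pack + order + truncate the merged inputs into one payload."""
    ordered = sorted(hub, key=lambda c: c.priority, reverse=True)
    return "\n".join([user_input] + [c.text for c in ordered])[:max_chars]

def llm_inference(payload: str) -> str:
    """Step 3 -> 4: placeholder for the model call; a real system streams tokens back."""
    return f"answer derived from {len(payload)} characters of context"

if __name__ == "__main__":
    hub = [
        ContextInput("long_term_memory", "User prefers metric units.", priority=2),
        ContextInput("rag", "Retrieved snippet: marathon distance is 42.195 km.", priority=1),
    ]
    answer = llm_inference(assemble_request(gateway_policy("How long is a marathon? SECRET"), hub))
    print(answer)  # step 5, the Answer box; a feedback signal would loop back from here
```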
Components (as labeled in the diagram)
Gateway / Policy
A control point for:
- allowlisting / policy enforcement
- redaction
- binding authorization to the request
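A hedged sketch of what a gateway control point of this kind could do. The allowlist contents, the redaction pattern, and the HMAC-based auth binding below are assumptions chosen for illustration, not observed mechanisms.

```python
import hashlib
import hmac
import re

ALLOWED_TOOLS = {"web_search", "code_interpreter"}      # illustrative allowlist
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")  # toy redaction rule

def enforce_allowlist(requested_tools: list[str]) -> None:
    """Reject the request if it names a tool outside the policy allowlist."""
    blocked = [t for t in requested_tools if t not in ALLOWED_TOOLS]
    if blocked:
        raise PermissionError(f"tools not allowlisted: {blocked}")

def redact(text: str) -> str:
    """Strip obvious PII before the text can reach assembly, tools, or logs."""
    return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)

def bind_authorization(user_id: str, payload: str, secret: bytes) -> dict:
    """Bind the caller's identity to this specific payload (here an HMAC tag),
    so downstream stages can verify the request was not swapped or replayed."""
    tag = hmac.new(secret, f"{user_id}:{payload}".encode(), hashlib.sha256).hexdigest()
    return {"user_id": user_id, "payload": payload, "auth_tag": tag}

if __name__ == "__main__":
    enforce_allowlist(["web_search"])
    clean = redact("Summarize the email from alice@example.com")
    request = bind_authorization("user-123", clean, secret=b"demo-key")
    print(request["auth_tag"][:16], request["payload"])
```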
Request Assembly / Context Selector
A packaging step that:
- selects eligible inputs,
- orders them,
- truncates to fit constraints, and
- produces the final model request payload.
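The select / order / truncate steps can be sketched as follows; the eligibility flag, ordering key, and character budget are illustrative stand-ins for whatever real constraints apply.

```python
from dataclasses import dataclass

@dataclass
class CandidateInput:
    source: str      # e.g. "user_preferences", "rag", "chat_history"
    text: str
    eligible: bool   # did Gateway / Policy clear this input for use?
    priority: int    # illustrative ordering hint (higher packs earlier)

def select_order_truncate(candidates: list[CandidateInput], budget_chars: int) -> list[str]:
    """Select eligible inputs, order them, then pack until the budget is exhausted."""
    ordered = sorted((c for c in candidates if c.eligible),
                     key=lambda c: c.priority, reverse=True)
    packed, used = [], 0
    for c in ordered:
        if used + len(c.text) > budget_chars:
            continue              # truncation policy: skip what no longer fits
        packed.append(f"[{c.source}] {c.text}")
        used += len(c.text)
    return packed

if __name__ == "__main__":
    candidates = [
        CandidateInput("user_preferences", "Answer concisely.", True, 3),
        CandidateInput("rag", "Snippet: assembly merges memory and retrieval.", True, 2),
        CandidateInput("chat_history", "Earlier turn: user asked about caching.", True, 1),
        CandidateInput("blocked_source", "Should never appear.", False, 9),
    ]
    # With budget_chars=80 the chat-history item no longer fits and is dropped.
    print("\n".join(select_order_truncate(candidates, budget_chars=80)))
```

Whatever the real ordering and truncation rules are, this is exactly the stage where checkpoint R4 (ordering and truncation bias) applies.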
Execution / Tools (purple)
- Tool Router / Function Calling
- Chaining (multi-step loop / planner retries)
- Connector / External Tool
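A sketch of a tool-routing loop with multi-step chaining. The tool registry, step limit, and the stubbed model are assumptions for illustration; the point is the loop shape: model output selects a tool, the tool result is appended, and inference runs again.

```python
import json
from typing import Callable

# Illustrative tool registry; a real router would also validate argument schemas (see R6).
TOOLS: dict[str, Callable[[dict], str]] = {
    "get_time": lambda args: "2024-01-01T00:00:00Z",
    "add": lambda args: str(args["a"] + args["b"]),
}

def fake_model(payload: str) -> dict:
    """Stand-in for LLM inference: emits either a tool call or a final answer."""
    if "RESULT:" not in payload:
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"answer": f"done ({payload.count('RESULT:')} tool step(s) used)"}

def run_chain(user_input: str, max_steps: int = 4) -> str:
    """Multi-step loop: model -> tool -> result appended -> model, until an answer."""
    payload = user_input
    for _ in range(max_steps):
        decision = fake_model(payload)
        if "answer" in decision:
            return decision["answer"]
        tool = TOOLS[decision["tool"]]            # tool selection point (risk R6)
        result = tool(decision["args"])
        payload += f"\nRESULT: {json.dumps({decision['tool']: result})}"
    return "stopped: step limit reached"

if __name__ == "__main__":
    print(run_chain("What is 2 + 3?"))
```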
Retrieval / Caching (blue)
- RAG / Vector Store
- Cache / Replay Store
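A toy sketch covering both blue components: nearest-neighbour retrieval over a small in-memory "vector store" and a cache keyed by a hash of the query. The bag-of-words embedding and the cache policy are deliberately simplistic stand-ins.

```python
import hashlib
import math
from collections import Counter

DOCS = [
    "Request assembly packs, orders, and truncates context inputs.",
    "Long term memory stores saved user facts and preferences.",
    "Tool routing selects a connector and passes arguments to it.",
]

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (a real store would use dense vectors)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """RAG / Vector Store: return the k most similar documents."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

CACHE: dict[str, list[str]] = {}

def cached_retrieve(query: str) -> list[str]:
    """Cache / Replay Store: reuse results for an identical query. Keys that ignore
    session boundaries or staleness are where risk R5 enters."""
    key = hashlib.sha256(query.encode()).hexdigest()
    if key not in CACHE:
        CACHE[key] = retrieve(query)
    return CACHE[key]

if __name__ == "__main__":
    print(cached_retrieve("how are context inputs truncated?"))
    print(cached_retrieve("how are context inputs truncated?"))  # served from cache
```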
Memory plane (green)
- Long Term Memory (saved memory store)
- Bio / Profile Attributes
- User Preferences
- Chat History (thread messages)
- Session History (session state / metadata)
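One way to model the five green inputs as data that gets flattened into context candidates before assembly; the field names and the recency cut on chat history are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryPlane:
    """Illustrative container for the green inputs feeding the Context Inputs (Hub)."""
    long_term_memory: list[str] = field(default_factory=list)      # saved memory store
    bio_profile: dict[str, str] = field(default_factory=dict)      # profile attributes
    preferences: dict[str, str] = field(default_factory=dict)      # user preferences
    chat_history: list[str] = field(default_factory=list)          # thread messages
    session_history: dict[str, str] = field(default_factory=dict)  # session state / metadata

def to_context_candidates(m: MemoryPlane) -> list[tuple[str, str]]:
    """Flatten each memory source into (source, text) pairs for the hub."""
    candidates = [("long_term_memory", fact) for fact in m.long_term_memory]
    candidates += [("bio_profile", f"{k}: {v}") for k, v in m.bio_profile.items()]
    candidates += [("preferences", f"{k}: {v}") for k, v in m.preferences.items()]
    candidates += [("chat_history", msg) for msg in m.chat_history[-5:]]  # recent turns only
    candidates += [("session_history", f"{k}: {v}") for k, v in m.session_history.items()]
    return candidates

if __name__ == "__main__":
    memory = MemoryPlane(
        long_term_memory=["User is training for a marathon."],
        preferences={"units": "metric"},
        chat_history=["Earlier turn: asked about pacing."],
    )
    for source, text in to_context_candidates(memory):
        print(source, "->", text)
```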
Streaming / Observability (orange)
- Stream Events (event bus)
- Stream Logs (telemetry sink)
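A sketch of the orange plane: a generator standing in for the event stream and a sink that records each event. The event shapes and log format are invented; any real transport and schema will differ.

```python
import json
import time
from typing import Iterator

def stream_events(answer: str, chunk_size: int = 8) -> Iterator[dict]:
    """Stream Events: emit the answer as incremental delta events plus a final marker."""
    for i in range(0, len(answer), chunk_size):
        yield {"type": "delta", "text": answer[i:i + chunk_size]}
    yield {"type": "done"}

class TelemetrySink:
    """Stream Logs: record every event. What ends up in a sink like this, and who can
    read it, is the concern behind checkpoint R7."""
    def __init__(self) -> None:
        self.records: list[str] = []

    def write(self, event: dict) -> None:
        self.records.append(json.dumps({"ts": time.time(), **event}))

if __name__ == "__main__":
    sink = TelemetrySink()
    for event in stream_events("Streaming answers arrive as small delta events."):
        sink.write(event)
        if event["type"] == "delta":
            print(event["text"], end="", flush=True)
    print(f"\n{len(sink.records)} events logged")
```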
Risk checkpoints (R1–R8)
The red boxes mark risk checkpoints used for review. They are intended as analysis anchors (threat-model checkpoints), not as claims about any specific vendor's implementation.
- R1 Context Injection (prompt-level steering)
- R2 Memory Poisoning (long-term memory write)
- R3 Retrieval Poisoning (RAG corpus/embeddings)
- R4 Assembly Manipulation (ordering + truncation bias)
- R5 Replay / Cache Confusion (stale/wrong session state)
- R6 Tool Hijack (tool selection + argument injection)
- R7 Stream/Log Exfiltration (prompts/outputs/user data)
- R8 Profile/Preference Escalation (bio/prefs become control-plane)
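For review work, the checkpoints can be carried as a small checklist mapped onto the components above. The mapping below is one plausible reading of the diagram, and the example R6 check (schema-style argument validation) is an illustration, not a described control.

```python
# Checkpoint -> (component it most plausibly attaches to, question to ask in review)
RISK_CHECKPOINTS = {
    "R1": ("Request Assembly / Context Selector", "Can untrusted text steer the prompt?"),
    "R2": ("Long Term Memory", "Who can write memories, and are writes reviewed?"),
    "R3": ("RAG / Vector Store", "Can the corpus or embeddings be poisoned?"),
    "R4": ("Request Assembly / Context Selector", "Does ordering/truncation bias the answer?"),
    "R5": ("Cache / Replay Store", "Can stale or cross-session state be replayed?"),
    "R6": ("Tool Router / Function Calling", "Are tool choice and arguments validated?"),
    "R7": ("Stream Events / Stream Logs", "What prompts, outputs, or user data reach sinks?"),
    "R8": ("Bio / Profile Attributes", "Can profile/preferences act as a control plane?"),
}

def validate_tool_call(tool: str, args: dict, allowlist: set, schema: dict) -> None:
    """Example R6 check: reject hijacked tool selections or injected arguments."""
    if tool not in allowlist:
        raise PermissionError(f"tool '{tool}' is not allowlisted")
    unexpected = set(args) - set(schema)
    if unexpected:
        raise ValueError(f"unexpected arguments: {sorted(unexpected)}")
    for name, expected_type in schema.items():
        if name in args and not isinstance(args[name], expected_type):
            raise TypeError(f"argument '{name}' must be {expected_type.__name__}")

if __name__ == "__main__":
    for rid, (component, question) in RISK_CHECKPOINTS.items():
        print(f"{rid}: {component} -- {question}")
    validate_tool_call("add", {"a": 2, "b": 3}, allowlist={"add"}, schema={"a": int, "b": int})
```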
Intended use
Use this page as a reference map for:
- analyzing why answers change (context selection, memory inputs, retrieval inputs, tool execution),
- structuring security review around the R1–R8 checkpoints,
- communicating system boundaries and review scope to practitioners.
References (feature-level; not internal placement)
- Memory (saved memories / chat history): https://help.openai.com/en/articles/8983136-what-is-memory
- Memory FAQ: https://help.openai.com/en/articles/8590148-memory-faq
- Function / tool calling: https://developers.openai.com/api/docs/guides/function-calling/
- Tools overview: https://developers.openai.com/api/docs/guides/tools/
- Connectors / apps in ChatGPT: https://help.openai.com/en/articles/11487775-connectors-in-chatgpt
- OWASP Top 10 for LLM Applications: https://owasp.org/www-project-top-10-for-large-language-model-applications/