LLM Memory Boundary Model: Context Construction (Eligibility, Selection, Persistence) and Why Answers Change

By Tamar Peretz Published

This article is an explanation (conceptual model + design implications).
Procedure/checklist: Manage: LLM memory boundaries

Executive summary

In many production incidents, “memory problems” in LLM systems are not about storage. They are about context construction—how the system builds the model input for a given request:

When these steps are implicit, you tend to get: answer drift, non-reproducible behavior, and expanded security exposure (prompt injection + uncontrolled persistence/write-back).

Evidence boundary: vendor-specific statements in this page are limited to what OpenAI documents (see References). Everything else is a vendor-agnostic system model for engineering and auditing.

Scope and terminology

What “context construction” means (terminology used here)

Context construction is the process of constructing the input context for a single model invocation from multiple sources (user input, retrieved text, tool outputs, system/developer instructions, and optional memory/history features).
(In this repo you may also see “request assembly” used as an equivalent term; this page uses “context construction” as the primary term.)

What “memory” means operationally (terminology used here)

In production discussions, “memory” often conflates three different mechanisms:

1) Persistent memory (explicit saved items intended to carry across sessions)
2) History reference / recall (signals pulled from prior chats; not guaranteed to be complete)
3) Current-request state (the current prompt + active settings/instructions + tool/retrieval inputs)

This article keeps them separate because they fail differently and require different controls.

What OpenAI documents about ChatGPT “Memory”

OpenAI documents two independent controls that shape what can be referenced between chats:

OpenAI also documents Temporary Chat as a mode that does not reference memories and does not create new memories.

(See References.)

System model: context construction as a pipeline

This model separates inputs (candidate sources) from controls (policy + enforcement).
Operationally: sources determine what could influence the response; controls determine what may influence it (and what can persist).

1) Candidate sources (what could influence the response)

Typical candidates include:

2) Controls (what must be enforced outside the prompt)

Controls that determine eligibility/selection/persistence commonly include:

Engineering rule: prompts express intent; enforcement must be implemented as control logic (policy decision + policy enforcement), not as advisory text.

Why answers change (even with the same model)

Answer variance often comes from changes in context construction, for example:

If you cannot explain a behavior change via one of the above, treat it as an observability gap and instrument the controls (see Observability below).

Security implication: context construction is a prompt-injection surface

OpenAI safety guidance describes prompt injection as untrusted text entering a system and attempting to override intended instructions.

Operationally: any retrieved document, ticket, webpage, email, or tool output that is fed into context is untrusted input. If you also allow persistence/write-back, you additionally create a path for untrusted input to become durable state.

OWASP guidance for agentic security highlights prompt injection defenses, tool security/least-privilege, and memory/context security as core concerns.

Design anchor: decision vs enforcement (ZTA-aligned terminology)

NIST SP 800-207 (Zero Trust Architecture) describes policy decision and policy enforcement components (PDP/PEP) and uses a control-plane vs data-plane framing.

Use here (by extension):

(See References.)

Controls you can implement (vendor-agnostic)

1) Typed, scoped memory (avoid free-form blobs)

2) Persistence/write-back as a privileged operation

3) Treat retrieved text and tool outputs as untrusted inputs

4) Observability for debugging and audit

Key takeaways

Suggested reading

References

OpenAI:

OWASP:

NIST: