
Abstract

Highly fluent output is not evidence of factual correctness. Modern large language models (LLMs) can produce coherent, well-structured text while still generating unsupported or incorrect claims. This note explains why that happens, using established terminology and canonical references.

Key terms (use consistently)

- Hallucination: generated content that is not supported by the input or by reliable evidence.
- Next-token prediction: the maximum-likelihood objective used to train autoregressive LLMs.
- RLHF: reinforcement learning from human feedback; fine-tuning from human demonstrations and preference rankings.
- RAG: retrieval-augmented generation; conditioning generation on retrieved evidence.
- Provenance: where a piece of evidence comes from and how current it is.
- Evidence contract: output requirements that tie each material claim to cited evidence.

1) Training objective: next-token prediction does not enforce truth

Many widely used LLMs are autoregressive language models trained to predict the next token given prior context (maximum likelihood / next-token prediction). This objective is optimized for likelihood of text, not for a formal notion of truthfulness.
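Concretely, the objective is the negative log-likelihood of each token given the preceding ones; nothing in this quantity distinguishes a true continuation from a merely plausible one:

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\!\left(x_t \mid x_{<t}\right)
```

A fluent but false continuation can score as well as a true one, provided it is statistically likely given the context.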

Implication: A response can be linguistically high-quality while still being incorrect, because “well-formed text” and “factually correct text” are different properties.

2) Hallucination is a documented failure mode in NLG (including LLMs)

Survey literature on natural language generation documents hallucination as a recurring problem across tasks (e.g., summarization, dialogue, QA), and discusses how it is measured and mitigated.

Operational framing (per-claim): a generated claim counts as hallucinated when it is not supported by the provided source or retrieved evidence, or when it contradicts that evidence. Support is judged claim by claim, not answer by answer.
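A minimal sketch of that per-claim framing, assuming claims have already been extracted from the answer; the names and the token-overlap test are illustrative placeholders for a real entailment (NLI) check:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClaimJudgement:
    claim: str
    supported: bool
    evidence: Optional[str]  # passage that supports the claim, if any


def judge_claim(claim: str, passages: list[str], min_overlap: float = 0.6) -> ClaimJudgement:
    """Per-claim check: a claim counts as supported only if some allowed
    passage covers most of its words. The token-overlap test stands in
    for a proper entailment model."""
    claim_tokens = set(claim.lower().split())
    for passage in passages:
        overlap = len(claim_tokens & set(passage.lower().split())) / max(len(claim_tokens), 1)
        if overlap >= min_overlap:
            return ClaimJudgement(claim, True, passage)
    return ClaimJudgement(claim, False, None)
```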

3) RLHF improves instruction-following; it does not equal verification

RLHF is a fine-tuning approach that uses human demonstrations and human preference rankings to train a reward model, then optimizes the policy against that reward model. It is used to improve instruction-following and preference-aligned behavior.
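As a sketch of what is being optimized (assuming the common Bradley-Terry style formulation), the reward model is fit to pairwise human preferences between a chosen response y_w and a rejected response y_l; the target is which answer labelers preferred, not which answer is verified against evidence:

```latex
\mathcal{L}_{\mathrm{RM}}(\phi) =
  -\,\mathbb{E}_{(x,\, y_w,\, y_l)}
  \left[\log \sigma\!\bigl(r_\phi(x, y_w) - r_\phi(x, y_l)\bigr)\right]
```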

Constraint: RLHF does not implement claim verification. If a claim requires evidence, the pipeline must require evidence.

4) Retrieval grounding (RAG) adds evidence; provenance remains a separate requirement

Retrieval-augmented generation (RAG) combines:

- a retriever that selects relevant passages from an external (non-parametric) corpus or index, and
- a generator that conditions on both the prompt and the retrieved passages when producing the answer.

RAG-style systems are motivated by limitations of purely parametric knowledge access. Evidence can be attached to outputs via retrieved passages, while provenance and update semantics remain separate system requirements.
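A minimal retrieve-then-prompt sketch under toy assumptions: retrieve, build_grounded_prompt, the lexical scoring, and the prompt layout are all illustrative, and provenance tracking is deliberately left out to underline that it is a separate requirement:

```python
def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Toy lexical retriever: rank passages by shared words with the query.
    A production system would use a BM25 or dense-vector index instead."""
    query_terms = set(query.lower().split())
    ranked = sorted(corpus, key=lambda p: len(query_terms & set(p.lower().split())), reverse=True)
    return ranked[:k]


def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Number the retrieved passages so the generator can cite them as [1], [2], ...
    Provenance (source IDs, timestamps, versions) must be tracked separately."""
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using ONLY the passages below and cite each claim as [n].\n"
        f"Passages:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )
```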

5) From “good answer” to “verifiable answer”: operational safeguards

If accuracy matters, treat the model as a generator and enforce an explicit evidence contract.

Minimal evidence contract (drop-in)

Requirements:
1) Cite evidence for each material claim (inline citations to provided docs or retrieved passages).
2) If evidence is missing, output: INSUFFICIENT_EVIDENCE.
3) Separate claims from evidence (do not blend).
4) Keep scope and definitions explicit.
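A minimal enforcement sketch, assuming the model emits one material claim per line with bracketed citation markers like [1]; the regex and sentinel handling are illustrative, not a prescribed format:

```python
import re

CITATION = re.compile(r"\[\d+\]")  # inline citation marker such as [1]


def enforce_evidence_contract(answer: str) -> str:
    """Reject output that violates the contract: every non-empty claim line
    must carry at least one inline citation; otherwise the caller receives
    the INSUFFICIENT_EVIDENCE sentinel instead of the unverified answer."""
    text = answer.strip()
    if text == "INSUFFICIENT_EVIDENCE":
        return text
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or any(not CITATION.search(line) for line in lines):
        return "INSUFFICIENT_EVIDENCE"
    return answer
```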

Prompt template (evidence-locked)

```text
Task: <the task or question>
Allowed evidence: <documents/URLs/snippets you provide or retrieval outputs>

Output format:
- One material claim per line, each with an inline citation to the allowed evidence.
- If the allowed evidence does not support a claim, output: INSUFFICIENT_EVIDENCE
```