Add an evidence-based confidence score — procedure

Procedure for adding an evidence-based confidence score to non-sentinel answers while preserving fail-closed evidence behavior.

Purpose

This guide explains how to add an evidence-based confidence score to responses. The score is a reporting and verification layer: it communicates evidential support after the answer is checked. It is not a calibrated probability, not a model log-probability, and not a substitute for evidence. The 0–100 score is a site-level reporting convention. If an active evidence policy requires a sentinel-only fail-closed answer, the sentinel remains the entire answer and no confidence score is appended.

When to use this

Use this section to decide whether this workflow is the right fit before you configure prompts, policies, or reference material.

Use case
The final answer needs explicit confidence reporting
Use this when every non-sentinel answer must end with a standardized confidence line tied to evidence quality and uncertainty.
Use case
The workflow already uses evidence boundaries
Use this with files-only, authoritative-source, academic, or verification workflows when confidence must reflect source support rather than tone or persuasion.
Use case
The output will be reviewed, published, or used downstream
Use this when users need to understand whether the answer is strongly supported, partially supported, or limited by missing evidence.

Decision guide

Choose how to enforce confidence reporting

Use the prompt page for a reusable instruction-layer rule, the policy when adding the scoring contract to another workflow, or the Fact-Checking Kit when confidence belongs inside a broader verification workflow.

Option 1 · Reusable prompt
Use the Confidence Score prompt
Use this when the confidence line should be enforced as a reusable prompt or instruction-layer rule.
Best for: Project instructions, custom GPT/Gem/Project setup, reusable workflow configuration, or repeated verification work.
Use when: The same confidence rule should apply across repeated tasks.
Option 2 · Policy contract
Use the Confidence Score policy
Use this when the confidence rule must be merged into an existing workflow, policy stack, or response contract.
Best for: Manual policy stacks, publication workflows, or custom evidence workflows that already have their own prompts.
Use when: The workflow already exists and only needs the confidence-scoring contract added.
Option 3 · Full verification workflow
Use the Fact-Checking Kit
Use this when confidence scoring is part of a broader claim-checking and verification workflow.
Best for: High-impact factual answers, source-backed review, and outputs that need a final verification pass before acceptance.
Use when: The task requires claim verification, source checks, and confidence reporting together.

Workflow assets

Required workflow assets

Open the prompts, policies, and reference pages needed to run this workflow correctly.

Required prompt
Confidence Score prompt
Adds the reusable instruction that every non-sentinel answer ends with an evidence-based confidence score.
Type: Verification prompt
Belongs in: Instruction or verification layer
Use when: The workflow needs confidence reporting as a repeatable rule.
Required policy
Add an evidence-based confidence score
Defines the meaning, output format, thresholds, fail-closed compatibility, and evidence basis for confidence scoring.
Type: Confidence reporting policy
Controls: Confidence semantics, score format, uncertainty disclosure, thresholds, and sentinel compatibility.
Required reference
Prompt layers and policy mapping
Explains where instruction rules, source material, runtime prompts, and verification gates belong.
Type: Configuration reference
Use when: Configuring confidence scoring in ChatGPT, Claude, Gemini, or an internal AI system.

Implementation procedure

Step-by-step implementation procedure

Follow the workflow in order. Each step gives one action to take and one verification check to pass before continuing.

  1. Instruction layer

    Configure the confidence rule

    Add the Confidence Score prompt when confidence reporting should be reused across tasks.

    Action
    Place the stable rule in the instruction layer of the tool you use, or add the policy contract to an existing workflow.
    Verify
    The rule requires a confidence line only for non-sentinel answers.
  2. Reference layer

    Keep the evidence basis explicit

    Identify the sources, artifacts, citations, policies, or verification results that support the answer.

    Action
    Tie the score to evidence quality, source coverage, conflicts, and missing information.
    Verify
    The confidence score can be explained from the evidence basis.
  3. Runtime prompt layer

    Answer the current task under the active evidence boundary

    Answer from the current task, source boundary, and output contract first, then assign a confidence score.

    Action
    Do not let the confidence rule expand the allowed sources or override fail-closed policies.
    Verify
    The answer follows the active source boundary.
  4. Verification layer

    Check fail-closed compatibility

    If an active policy requires a sentinel-only answer, output the sentinel and stop.

    Action
    Do not append a confidence line after a sentinel-only fail-closed answer.
    Verify
    Sentinel-only answers remain exact and unchanged.
  5. Verification layer

    Add the confidence line

    For non-sentinel answers, end with the standardized confidence line.

    Action
    Use the required format and lower the score when evidence is incomplete, conflicting, outdated, or partially inspected.
    Verify
    The final line uses the configured confidence format and reflects evidence support.
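
The steps above can be sketched in code. This is a minimal illustration, not the site's implementation: the sentinel text, the `Confidence: N/100` line format, the `EvidenceBasis` fields, and the 15-point penalty are all hypothetical placeholders, because this guide leaves the exact sentinel and format to the active policy.

```python
from dataclasses import dataclass

# Hypothetical values: the guide does not specify the sentinel text or line format.
SENTINEL = "INSUFFICIENT EVIDENCE"
CONFIDENCE_TEMPLATE = "Confidence: {score}/100"


@dataclass
class EvidenceBasis:
    """Illustrative summary of the evidence behind an answer (step 2)."""
    supported_claims: int            # claims backed by an in-boundary source
    total_claims: int                # all claims made in the answer
    has_conflicts: bool = False      # sources disagree with each other
    has_outdated_sources: bool = False
    partially_inspected: bool = False


def score_from_evidence(basis: EvidenceBasis) -> int:
    """Map evidence coverage to a 0-100 score and lower it for known limits.

    The weights are arbitrary placeholders, not a calibrated probability.
    """
    if basis.total_claims <= 0:
        return 0
    score = round(100 * basis.supported_claims / basis.total_claims)
    limits = (basis.has_conflicts, basis.has_outdated_sources, basis.partially_inspected)
    score -= 15 * sum(limits)        # assumed penalty per evidence limit
    return max(0, min(100, score))   # clamp to the 0-100 reporting range


def finalize_answer(answer: str, basis: EvidenceBasis, sentinel_required: bool) -> str:
    """Apply steps 4 and 5: sentinel-only answers stay exact; others get the line."""
    if sentinel_required:
        return SENTINEL              # output the sentinel and stop, no confidence line
    line = CONFIDENCE_TEMPLATE.format(score=score_from_evidence(basis))
    return f"{answer.rstrip()}\n\n{line}"
```

For example, `finalize_answer("The upload limit is 10 MB.", EvidenceBasis(4, 5, has_conflicts=True), sentinel_required=False)` ends with `Confidence: 65/100` under these placeholder weights, while the same call with `sentinel_required=True` returns only the sentinel.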

Verification checklist

Use this checklist before accepting the output, publishing it, or using it as evidence for a downstream workflow.

Format
The final answer has the required confidence line
Every non-sentinel answer ends with the configured numeric confidence format.
Meaning
The score reflects evidence support
The score reflects how well the evidence supports the answer's correctness: evidence quality, coverage, conflicts, uncertainty, and missing information.
Boundary
The score did not expand the evidence boundary
Confidence reporting does not allow unsupported claims, unstated sources, or fabricated evidence.
Fail-closed behavior
Sentinel-only outputs stay sentinel-only
When a policy requires an exact sentinel response, no confidence line is appended.
Uncertainty
Lower confidence is explained by evidence limits
When confidence is limited, the answer identifies the missing or weak evidence that prevents a higher score.
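
The Format and Fail-closed behavior items of the checklist above lend themselves to an automated check. The sketch below assumes a hypothetical sentinel string and a hypothetical `Confidence: N/100` final line; substitute whatever your policy actually configures. The Meaning, Boundary, and Uncertainty items still need a reviewer, since they depend on reading the evidence.

```python
import re

# Hypothetical values; substitute the sentinel and format your policy defines.
SENTINEL = "INSUFFICIENT EVIDENCE"
CONFIDENCE_LINE = re.compile(r"Confidence: (\d{1,3})/100")


def check_output(output: str, sentinel_required: bool) -> list[str]:
    """Return checklist failures; an empty list means the mechanical checks passed."""
    problems = []
    if sentinel_required:
        # Fail-closed: the sentinel must be the entire answer, exact and unchanged.
        if output != SENTINEL:
            problems.append("fail-closed: output is not the exact sentinel")
        return problems
    lines = output.rstrip("\n").splitlines()
    last_line = lines[-1] if lines else ""
    match = CONFIDENCE_LINE.fullmatch(last_line)
    if match is None:
        problems.append("format: final line is not the configured confidence line")
    elif not 0 <= int(match.group(1)) <= 100:
        problems.append("format: score is outside 0-100")
    return problems
```

A caller might run `check_output(response, sentinel_required=policy_is_fail_closed)` before accepting a response and reject it when the returned list is non-empty; `response` and `policy_is_fail_closed` are illustrative names.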

Next step