- Instruction / policy layer
- Stable behavior rules: role, workflow scope, output contract, tool-use policy, refusal/fail-closed rules, and verification requirements.
- Runtime request layer
- The current task: user request, selected document, file ID, record ID, code diff, task variables, constraints, and requested output format.
- Conversation state / session layer
- Conversation continuity, previous response references, session IDs, workflow status, and application-owned state.
- File / document layer
- Uploaded files, reusable files, document IDs, generated files, transcripts, source bundles, and file-derived context.
- Retrieval / RAG / grounding layer
- Vector stores, file search, enterprise search, URL context, search grounding, source retrieval, citations, and source provenance.
- Tool / function-calling layer
- Function schemas, client tools, server tools, built-in tools, MCP tools, API adapters, database lookups, issue trackers, search tools, and code execution.
- Computer / browser / action layer
- Browser or computer-use workflows where the model proposes UI actions and the application executes them in a sandboxed environment.
- Orchestration / handoff layer
- Agent routing, multi-step plans, handoffs, retries, delegation, specialist agents, workflow graphs, and state transitions.
- Human review / approval layer
- Approval gates before sensitive writes, payments, production changes, account changes, messages, deletion, browsing actions, or irreversible operations.
- Structured output / schema layer
- JSON/schema contracts for extraction, UI rendering, automation, tool arguments, validation, or downstream processing.
- Validation / eval layer
- Schema checks, source-alignment checks, policy checks, tool argument validation, regression tests, eval datasets, graders, and release gates.
- Observability / tracing / logging layer
- Trace IDs, model versions, prompt versions, retrieved source IDs, tool calls, tool results, approvals, guardrail decisions, eval results, and final outputs.
- Batch / async layer
- Large-scale, non-urgent processing such as dataset enrichment, evaluation runs, document processing, and offline analysis.
- Caching / latency / cost layer
- Prompt caching, context caching, repeated prefix optimization, and reuse of stable context for cost or latency reduction.
- Realtime / voice / live session layer
- Low-latency voice, live translation, transcription, interactive tools, live session controls, and realtime model connections.
- Embeddings / semantic search layer
- Semantic search, clustering, classification, recommendations, anomaly detection, and retrieval infrastructure.
- Model optimization / fine-tuning layer
- Behavior optimization after instructions, retrieval, tools, structured outputs, and evals have been tested.
- Security / privacy / governance layer
- Secret handling, data retention, schema privacy, IAM/authorization, logs, audit, sandboxing, prompt-injection handling, and policy enforcement.
API / internal agent systems workflow placement
Map AI workflow layers to API-based systems, internal agents, backend automation, retrieval/RAG, files, tools, function calling, orchestration, approvals, structured outputs, evals, tracing, batch, caching, realtime, embeddings, security, and governance.
API / internal systems configuration map
Start with the workflow architecture, then choose the API surface
Use this reference when an AI workflow runs inside an application, backend service, internal agent, CI pipeline, retrieval system, evaluation system, tool-calling workflow, or production product surface.
API setup decision
Choose the right API implementation surface first
API workflows are not chat workflows. They require explicit placement for instructions, runtime input, state, retrieval, tools, approvals, validation, observability, and security boundaries.
Use OpenAI Responses / Agents SDK for agentic apps
Use this path when the workflow needs model instructions, multi-turn state, built-in tools, file search, function calling, guardrails, tracing, handoffs, or structured outputs.
Go to instruction and state layers →
Use Claude API when tool execution stays explicit
Use Claude API when the application must distinguish client tools, server tools, files, citations, prompt caching, batch processing, computer use, and app-owned execution.
Go to tools and execution →
Use Gemini API / Vertex AI for multimodal, RAG, and Cloud workflows
Use Gemini API or Vertex AI when the workflow needs system instructions, multimodal input, Files API, File Search, function calling, grounding, structured output, batch, or caching.
Go to files, RAG, and grounding →
Use internal agent architecture for production control
Use an internal agent system when the workflow needs application-owned auth, database state, policy engines, approvals, logs, eval gates, CI integration, or production release controls.
Go to agents and approvals →
Primary API surfaces
API surfaces you must place explicitly
A production AI workflow needs more than a system prompt. Use this inventory before designing the workflow.
Instructions, runtime, and state
Separate stable rules, current input, and application authority
Most API failures begin when permanent rules, current task input, retrieved source material, and application state are mixed together.
| Layer | Put here | Do not put here | Implementation rule |
|---|---|---|---|
| Instruction / policy | Stable role, behavior, output contract, tool policy, evidence rules, failure behavior. | Secrets, API keys, billing state, auth state, user identity, mutable workflow state, one-off input. | Version instructions like product configuration. Review before release. |
| Runtime request | Current user request, task variables, selected file, selected record, current constraints. | Permanent policy, reusable source documents, credentials, or long-lived business state. | Validate and normalize runtime input before sending it to the model. |
| Application state | User identity, permissions, billing, workflow status, DB records, audit state, production state. | Do not ask the model to decide the source of truth for identity, access, billing, or irreversible state. | Keep authority in application code, database, auth provider, billing provider, or workflow engine. |
| Conversation state | Thread/session continuity, previous responses, summaries, selected context, agent state. | Do not treat chat continuity as authorization or evidence. | Store state intentionally. Prune, summarize, or retrieve context rather than growing prompts blindly. |
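The layer separation in the table above can be sketched as a request builder that keeps stable rules, current input, and application authority apart. This is an illustrative sketch, not any provider SDK: `build_request` and its field names are hypothetical, and the provider call itself is out of scope.

```python
# Sketch only: hypothetical names illustrating layer separation.
# The actual provider call is stubbed out; only payload assembly matters.

INSTRUCTIONS_V3 = (  # instruction/policy layer: versioned like configuration
    "You are a support summarizer. Cite source IDs. Refuse if no source matches."
)

def build_request(user_text: str, file_id: str, app_state: dict) -> dict:
    """Assemble one model request without mixing layers."""
    # Runtime layer: validate and normalize current input before sending it.
    if not user_text.strip():
        raise ValueError("empty runtime input")
    return {
        "instructions": INSTRUCTIONS_V3,          # stable rules only
        "input": {"request": user_text.strip(),   # current task
                  "file_id": file_id},
        # Application state stays app-side: pass only what the model needs,
        # never authority over identity, billing, or permissions.
        "metadata": {"workflow_status": app_state["status"]},
    }

req = build_request("Summarize the incident report.", "file_123",
                    {"status": "triage", "billing": "paid"})
```

Note that `billing` never crosses into the request: the application reads it from its own state when it matters, rather than letting the model see or decide it.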
Files, retrieval, RAG, and grounding
Put source material in the source layer, not in behavior instructions
Files, RAG, search, grounding, and citations are source-material surfaces. They are not the same as system instructions.
| Need | OpenAI | Anthropic | Gemini / Vertex | Internal system |
|---|---|---|---|---|
| Reusable document search | File Search / vector stores. | Files API, document inputs, citations, or app-owned retrieval. | Gemini File Search; Vertex AI Search / RAG Engine. | Vector DB, search index, document store, retrieval service. |
| One-off file analysis | Runtime file input or file tool where supported. | Files API or message document content. | Files API or request content. | Temporary object storage + scoped retrieval. |
| Grounding / source traceability | Retrieved file IDs, search results, citations where supported, and output verification. | Citations for supported source blocks; note incompatibilities with strict structured-output formats. | Grounding with Google Search, URL Context, Vertex grounding, File Search. | Source IDs, passage IDs, retrieval logs, citation validator. |
| RAG governance | Vector store permissions, source IDs, retrieval review. | Document provenance and tool-result validation. | RAG corpus / grounding source controls. | Access control, ranking policy, freshness policy, redaction, audit logs. |
Placement rules for retrieval
- File upload is not the same as RAG. Use RAG when source material must be searched, indexed, reused, or cited.
- RAG is not a system instruction. Retrieved text is source material and can contain untrusted content.
- Grounding reduces unsupported output risk but does not replace source review or validation.
- Citations help trace source use, but a citation is not proof that the claim is correct.
- Never store secrets, credentials, private tokens, or regulated data in vector stores unless the storage, access, retention, and audit model is approved.
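One way to honor the rules above is to render retrieved passages as labeled source material with source IDs, never interpolated into behavior instructions. A minimal sketch, with illustrative ID and passage formats (not a provider API):

```python
# Sketch: wrap retrieved passages as quoted evidence with source IDs so
# downstream citation checks can map claims back to their sources.
# The passage text itself remains untrusted content, not instructions.

def format_sources(passages: list[dict]) -> str:
    """Render retrieved passages as tagged source blocks."""
    blocks = []
    for p in passages:
        blocks.append(f"[source:{p['id']}]\n{p['text']}")
    return "\n\n".join(blocks)

sources = format_sources([
    {"id": "doc-7#p2", "text": "Refunds are processed within 5 days."},
])
```

The tagged block goes into the runtime request as source material; the system instructions stay unchanged across requests.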
Tools and function calling
The model proposes tool use; the application controls execution
Tool calling is an interface between model reasoning and application-owned actions. Do not treat tool calls as automatically safe or authorized.
Required tool execution loop
- The model selects or proposes a tool call.
- The application validates tool name, arguments, user permissions, rate limits, and expected side effects.
- The application asks for human approval when the action is sensitive, irreversible, external, or user-visible.
- The application executes the tool or rejects the request.
- The application validates and sanitizes the tool result.
- The model receives the approved tool result and continues the workflow.
- The application logs the decision, tool arguments, tool result, approval state, and final action.
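The loop above can be sketched in application code. All names here (`ALLOWED_TOOLS`, `handle_tool_call`, the stubbed lookup) are hypothetical app-side functions, not any provider SDK; approval gating is stubbed because this example only covers a safe read.

```python
# Minimal sketch of the tool execution loop: validate, gate, execute,
# sanitize, log. Reads are treated as safe here; writes would require
# a human-approval step before execution.
ALLOWED_TOOLS = {"lookup_ticket": {"ticket_id": str}}

def handle_tool_call(call: dict, user: dict, audit: list) -> dict:
    name, args = call["name"], call["arguments"]
    # Validate tool name and argument types against the allowed schema.
    schema = ALLOWED_TOOLS.get(name)
    if schema is None or any(not isinstance(args.get(k), t)
                             for k, t in schema.items()):
        audit.append({"call": call, "decision": "rejected"})
        return {"error": "invalid tool call"}
    # Execute app-side; the result is a stand-in for the real lookup.
    result = {"status": "open"}
    # Log decision, arguments, and result for the trace.
    audit.append({"call": call, "decision": "executed", "result": result})
    return result

audit_log = []
out = handle_tool_call(
    {"name": "lookup_ticket", "arguments": {"ticket_id": "T-1"}},
    {"role": "agent"}, audit_log)
```

Only the approved, sanitized result is returned to the model; a rejected call produces an error result and an audit entry, never a silent retry.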
| Tool type | Provider examples | Use for | Control requirement |
|---|---|---|---|
| Function calling | OpenAI function tools; Gemini function calling; Anthropic client tools. | Application functions, service adapters, DB lookup, ticket lookup, CRM actions, calculations. | Validate schema, arguments, permissions, and side effects before execution. |
| Built-in tools | Search, file search, code execution, URL context, Google Search grounding, Maps grounding, server tools. | Search, retrieval, code execution, document lookup, browser/data context, controlled tool augmentation. | Verify result provenance and do not trust external content as instructions. |
| MCP / external tool servers | OpenAI remote MCP; Anthropic MCP / remote MCP; app-owned MCP servers. | External systems such as docs, issue trackers, design tools, monitoring, internal APIs. | Treat MCP servers as trust boundaries. Restrict tools to minimum required scope. |
| Computer / browser actions | OpenAI Computer Use; Anthropic computer use; Gemini Computer Use. | Browser UI, form workflows, screenshots, web actions, virtual computer tasks. | Use sandboxing, approval gates, untrusted-content handling, and audit logs. |
Agents, orchestration, and approvals
Do not call a single prompt an agentic workflow
An agentic workflow combines model calls, tools, state, routing, approvals, validation, and observability.
| Agent layer | What belongs here | What must stay outside the model |
|---|---|---|
| Routing | Task classification, specialist selection, handoffs, workflow branches. | Authorization, billing logic, production state, and irreversible decisions. |
| Planning | Proposed task plan, decomposition, tool sequence, uncertainty flags. | Execution approval for sensitive actions. |
| Handoffs | Passing control between specialized agents or workflow stages. | Audit trail and permission boundaries. |
| Human approval | Approval request, explanation, proposed action, expected side effects. | Approval state must be stored in application/workflow state, not inferred from text alone. |
| Guardrails | Input checks, output checks, tool checks, policy checks, risk classification. | Do not rely only on model self-review for high-risk actions. |
| Tracing | Model calls, tool calls, handoffs, guardrail decisions, custom spans, final output. | Do not hide trace-critical data in unstructured chat text only. |
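The human-approval row above warns against inferring approval from text. A sketch of the alternative, with illustrative names and an in-memory dict standing in for application or workflow-engine storage:

```python
# Sketch: approval state lives in application storage keyed by a
# proposal ID, and is checked before execution. It is never inferred
# from chat text or model output.
import uuid

APPROVALS: dict[str, str] = {}  # proposal_id -> "pending"/"approved"/"denied"

def propose_action(description: str) -> str:
    proposal_id = str(uuid.uuid4())
    APPROVALS[proposal_id] = "pending"
    return proposal_id

def record_decision(proposal_id: str, approved: bool) -> None:
    APPROVALS[proposal_id] = "approved" if approved else "denied"

def may_execute(proposal_id: str) -> bool:
    # Unknown or pending proposals fail closed.
    return APPROVALS.get(proposal_id) == "approved"

pid = propose_action("Send refund email to customer")
record_decision(pid, approved=True)
```

In production the dict would be a database table or workflow-engine record, so the approval survives restarts and is auditable.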
Structured outputs and evals
Production output needs contracts, checks, and regression tests
A valid-looking answer is not enough for product, security, compliance, research, code, or user-visible workflows.
| Validation need | Correct layer | Use for | Do not use as |
|---|---|---|---|
| Structured output | Schema / response format configuration. | Extraction, automation, UI rendering, downstream processing, typed contracts. | A truth guarantee or source-alignment guarantee. |
| Tool argument validation | Application-side validator before tool execution. | Prevent malformed, unauthorized, unsafe, or unexpected tool calls. | A prompt-only policy. |
| Source alignment | Retrieval/citation checker or post-generation validator. | Claims, citations, factual outputs, policy references, research summaries. | Informal model self-confirmation. |
| Agent evals | Eval datasets, graders, trace grading, regression runs, release gates. | Detect prompt, model, tool, routing, and workflow regressions. | One-off manual testing only. |
| Business-rule validation | Application service layer. | Billing, account status, permissions, eligibility, legal/compliance policy. | Model-generated text. |
Structured output rule
- Use structured output when the application must parse the response.
- Use citations or source IDs when evidence traceability matters.
- Do not assume every provider supports strict schemas and citations in the same response shape.
- Validate schema success, missing fields, invalid values, and unsafe actions before continuing the workflow.
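The validation rule above can be sketched as an application-side check that runs before the workflow continues. The field names and ranges are examples, not a provider contract:

```python
# Sketch: validate a structured-output response for required fields,
# types, and value ranges before any downstream step consumes it.
import json

REQUIRED = {"claim": str, "source_id": str, "confidence": float}

def validate_extraction(raw: str):
    """Return (ok, parsed) after checking fields, types, and ranges."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False, None
    for field, typ in REQUIRED.items():
        if not isinstance(data.get(field), typ):
            return False, None  # missing field or wrong type
    if not 0.0 <= data["confidence"] <= 1.0:
        return False, None  # valid shape, invalid value
    return True, data

ok, parsed = validate_extraction(
    '{"claim": "Refunds take 5 days", "source_id": "doc-7#p2",'
    ' "confidence": 0.9}'
)
```

A schema-valid response still needs the source-alignment check: `source_id` here names a passage, but the application must separately verify the claim against it.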
Scale, caching, realtime, and embeddings
Use the right API surface for scale and latency
Interactive workflows, batch jobs, cached prompts, realtime sessions, and semantic search need different architecture.
| Need | Use | Do not confuse with |
|---|---|---|
| Interactive response | Normal API request, streaming, or agent run. | Batch jobs for non-urgent processing. |
| Large-scale non-urgent work | Batch APIs for offline processing, evaluations, dataset work, or document processing. | Realtime or interactive UX. |
| Repeated stable context | Prompt caching / context caching where supported. | RAG, source governance, or conversation memory. |
| Live audio / realtime UI | Realtime / Live API surfaces and session lifecycle controls. | Standard text-completion request. |
| Semantic search | Embeddings + vector DB / search service. | Prompt instructions or fine-tuning. |
| Model behavior optimization | Fine-tuning only after instructions, retrieval, tools, structured outputs, and evals are tested. | A fix for workflow placement problems or missing validation. |
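The "repeated stable context" row hinges on keeping the cached prefix identical across requests. Providers expose caching differently; this stdlib sketch only shows the concept of keying on stable content and keeping per-request input out of it:

```python
# Sketch: a cache key derived from the stable prefix (instructions plus
# reusable context) so only the variable suffix changes per request.
import hashlib

def cache_key(instructions: str, stable_context: str) -> str:
    # The key covers only content that is byte-identical across requests;
    # per-request user input must stay out of the cached prefix.
    payload = (instructions + "\n" + stable_context).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

k1 = cache_key("v3 rules", "product manual text")
k2 = cache_key("v3 rules", "product manual text")
k3 = cache_key("v3 rules", "different manual")
```

Any edit to the instructions or the reusable context produces a new key, which is exactly the cache-invalidation behavior prefix caching relies on.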
Security, privacy, and governance
Keep authority, secrets, and irreversible actions outside prompts
Production AI systems must treat prompts, retrieved content, tool results, web pages, and user files as untrusted until validated.
Never put secrets in these surfaces
- Prompts, system instructions, developer instructions, or model messages.
- Uploaded files, vector stores, retrieval indexes, or RAG corpora unless explicitly approved for that data class.
- Tool descriptions, function schemas, enum values, property names, regex patterns, or structured-output schemas.
- Logs, traces, eval datasets, screenshots, browser sessions, or generated artifacts.
- Memory, summaries, conversation state, or hidden prompt templates.
Governance checklist
- Use application-side authorization for identity, access, billing, role, and permission decisions.
- Use approval gates before external writes, production changes, account changes, payments, messages, or deletion.
- Validate tool arguments before execution and tool results before using them as evidence.
- Treat website content, retrieved text, uploaded files, and tool results as untrusted input.
- Log model version, instruction version, retrieved source IDs, tool calls, approvals, guardrails, eval results, and final actions.
- Use sandboxed environments for computer/browser tools and code execution.
- Apply data retention, redaction, access-control, and audit requirements before sending data to any model provider.
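The logging item in the checklist above can be sketched as one structured trace record per model interaction. Field names are illustrative; a real system would add trace IDs, guardrail decisions, and eval results per its own schema:

```python
# Sketch: emit one JSON line per model interaction covering model
# version, prompt version, retrieval provenance, tool calls, approval
# state, and the final output.
import json, time

def trace_record(model: str, prompt_version: str, source_ids: list,
                 tool_calls: list, approved: bool, output: str) -> str:
    record = {
        "ts": int(time.time()),
        "model": model,                 # model version used
        "prompt_version": prompt_version,
        "source_ids": source_ids,       # retrieval provenance
        "tool_calls": tool_calls,       # arguments and results
        "approved": approved,           # approval state, app-owned
        "output": output,
    }
    return json.dumps(record)  # one line per event for log ingestion

line = trace_record("model-2025-01", "instr-v3", ["doc-7#p2"], [],
                    True, "ok")
```

Keeping the record machine-parseable makes trace grading and regression evals possible later without re-instrumenting the workflow.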
Provider placement matrix
Layer → provider surface mapping
Use this matrix after classifying the workflow layer. It maps architecture concepts to provider-specific API surfaces.
| Layer | OpenAI | Anthropic | Gemini | Vertex / internal |
|---|---|---|---|---|
| Instruction | Responses instructions / Agent instructions. | System prompt / Messages API configuration. | system_instruction / API config. | Vertex system instructions / policy config. |
| Runtime | Responses input. | Messages API user content. | GenerateContent / Interactions input. | Request payload. |
| State | Previous response / agent session strategy. | Application-managed conversation state. | Application/session-managed state. | DB/session/workflow engine. |
| Files | File Search / vector stores / files. | Files API / document inputs. | Files API. | GCS / object storage / document store. |
| Retrieval | File Search. | Citations, document inputs, custom retrieval. | File Search / URL Context. | Vertex RAG Engine / Vertex AI Search. |
| Tools | Built-in tools, function calling, remote MCP. | Client tools, server tools. | Function calling, built-in tools. | Service adapters, internal APIs, MCP/tools. |
| Computer/browser | Computer Use. | Computer Use. | Computer Use. | Sandbox, VM, browser harness. |
| Orchestration | Agents SDK, handoffs, guardrails, tracing. | Application-managed agent loop. | Application-managed workflow / Interactions where appropriate. | Workflow engine / agent graph. |
| Approval | Guardrails and human review pattern. | Application approval layer. | Application approval layer. | Policy engine / human review queue. |
| Structured output | Structured Outputs. | Structured outputs; verify compatibility with citations. | Structured output. | Schema validation. |
| Evals | Evals / trace grading. | Console evaluation / app evals. | Application evals. | Vertex Gen AI Evaluation / internal evals. |
| Observability | Tracing / logs. | Application logs / tool traces. | Application logs. | Cloud logs / audit logs / traces. |
| Caching | Prompt caching. | Prompt caching. | Explicit context caching. | Cache layer. |
| Batch | Batch API. | Message Batches API. | Batch API. | Async jobs / queues. |
| Embeddings | Embeddings. | Embeddings. | Embeddings. | Vector DB / semantic search. |
| Security | Guardrails, approvals, sandboxing, provider controls. | Tool boundaries, data retention controls, app-side governance. | Safety filters, system instructions, app-side controls. | IAM, VPC-SC, CMEK, logging, audit, DLP, policy engine. |
Misplacement guardrails
What not to put only in API prompts
Prompt-only control is not enough for production AI systems.
- Do not put secrets, API keys, tokens, passwords, or privileged credentials in prompts, instructions, files, schemas, logs, or eval datasets.
- Do not let the model be the authority for permissions, identity, billing, subscription state, production state, or irreversible actions.
- Do not rely on instructions alone to enforce security boundaries. Use application-side authorization, validation, logging, and approval controls.
- Do not treat retrieved content, website content, uploaded files, or tool results as trusted instructions.
- Do not execute tool calls without validating arguments, permissions, rate limits, expected side effects, and user approval requirements.
- Do not confuse structured output with factual correctness or source alignment.
- Do not confuse caching with memory, retrieval, state, or source governance.
- Do not fine-tune before testing whether better instructions, retrieval, tools, structured outputs, or evals solve the problem.
- Do not use batch APIs for interactive user workflows that require immediate feedback.
- Do not use computer/browser tools without sandboxing, high-impact action approvals, and audit logging.
Official source check
Official API references used for this mapping
Use these references to verify terminology and feature boundaries before updating this page again.
OpenAI
- OpenAI API: Agents SDK
- OpenAI API: Using tools
- OpenAI API: Function calling
- OpenAI API: File search
- OpenAI API: Structured outputs
- OpenAI API: Agent evals
- OpenAI API: Batch API
- OpenAI API: Realtime API
- OpenAI API: Computer use
- OpenAI API: Embeddings
Anthropic
- Anthropic API: Tool use
- Anthropic API: Files API
- Anthropic API: Citations
- Anthropic API: Computer use
- Anthropic API: Prompt caching
- Anthropic API: Batch processing
- Anthropic Console: Evaluation tool
- Anthropic API: Embeddings
Google Gemini / Vertex AI
- Gemini API: Text generation and system instructions
- Gemini API: Function calling
- Gemini API: Files API
- Gemini API: File Search
- Gemini API: Tools
- Gemini API: Structured output
- Gemini API: Batch API
- Gemini API: Context caching
- Gemini API: Embeddings
- Vertex AI: System instructions
- Vertex AI: Grounding overview
- Vertex AI: RAG Engine
- Vertex AI: Gen AI evaluation