API / internal agent systems workflow placement

Map AI workflow layers to API-based systems, internal agents, backend automation, retrieval/RAG, files, tools, function calling, orchestration, approvals, structured outputs, evals, tracing, batch, caching, realtime, embeddings, security, and governance.

API / internal systems configuration map

Start with the workflow architecture, then choose the API surface

Use this reference when an AI workflow runs inside an application, backend service, internal agent, CI pipeline, retrieval system, evaluation system, tool-calling workflow, or production product surface.

API setup decision

Choose the right API implementation surface first

API workflows are not chat workflows. They require explicit placement for instructions, runtime input, state, retrieval, tools, approvals, validation, observability, and security boundaries.

Primary API surfaces

API surfaces you must place explicitly

A production AI workflow needs more than a system prompt. Use this inventory before designing the workflow.

Instruction / policy layer
Stable behavior rules: role, workflow scope, output contract, tool-use policy, refusal/fail-closed rules, and verification requirements.
Runtime request layer
The current task: user request, selected document, file ID, record ID, code diff, task variables, constraints, and requested output format.
Conversation state / session layer
Conversation continuity, previous response references, session IDs, workflow status, and application-owned state.
File / document layer
Uploaded files, reusable files, document IDs, generated files, transcripts, source bundles, and file-derived context.
Retrieval / RAG / grounding layer
Vector stores, file search, enterprise search, URL context, search grounding, source retrieval, citations, and source provenance.
Tool / function-calling layer
Function schemas, client tools, server tools, built-in tools, MCP tools, API adapters, database lookups, issue trackers, search tools, and code execution.
Computer / browser / action layer
Browser or computer-use workflows where the model proposes UI actions and the application executes them in a sandboxed environment.
Orchestration / handoff layer
Agent routing, multi-step plans, handoffs, retries, delegation, specialist agents, workflow graphs, and state transitions.
Human review / approval layer
Approval gates before sensitive writes, payments, production changes, account changes, messages, deletion, browsing actions, or irreversible operations.
Structured output / schema layer
JSON/schema contracts for extraction, UI rendering, automation, tool arguments, validation, or downstream processing.
Validation / eval layer
Schema checks, source-alignment checks, policy checks, tool argument validation, regression tests, eval datasets, graders, and release gates.
Observability / tracing / logging layer
Trace IDs, model versions, prompt versions, retrieved source IDs, tool calls, tool results, approvals, guardrail decisions, eval results, and final outputs.
Batch / async layer
Large-scale, non-urgent processing such as dataset enrichment, evaluation runs, document processing, and offline analysis.
Caching / latency / cost layer
Prompt caching, context caching, repeated prefix optimization, and reuse of stable context for cost or latency reduction.
Realtime / voice / live session layer
Low-latency voice, live translation, transcription, interactive tools, live session controls, and realtime model connections.
Embeddings / semantic search layer
Semantic search, clustering, classification, recommendations, anomaly detection, and retrieval infrastructure.
Model optimization / fine-tuning layer
Behavior optimization after instructions, retrieval, tools, structured outputs, and evals have been tested.
Security / privacy / governance layer
Secret handling, data retention, schema privacy, IAM/authorization, logs, audit, sandboxing, prompt-injection handling, and policy enforcement.

Instructions, runtime, and state

Separate stable rules, current input, and application authority

Most API failures begin when permanent rules, current task input, retrieved source material, and application state are mixed together.

| Layer | Put here | Do not put here | Implementation rule |
| --- | --- | --- | --- |
| Instruction / policy | Stable role, behavior, output contract, tool policy, evidence rules, failure behavior. | Secrets, API keys, billing state, auth state, user identity, mutable workflow state, one-off input. | Version instructions like product configuration. Review before release. |
| Runtime request | Current user request, task variables, selected file, selected record, current constraints. | Permanent policy, reusable source documents, credentials, or long-lived business state. | Validate and normalize runtime input before sending it to the model. |
| Application state | User identity, permissions, billing, workflow status, DB records, audit state, production state. | Do not ask the model to decide the source of truth for identity, access, billing, or irreversible state. | Keep authority in application code, database, auth provider, billing provider, or workflow engine. |
| Conversation state | Thread/session continuity, previous responses, summaries, selected context, agent state. | Do not treat chat continuity as authorization or evidence. | Store state intentionally. Prune, summarize, or retrieve context rather than growing prompts blindly. |
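
The separation above can be sketched in a few lines. This is a minimal illustration, not any provider's SDK: `build_request`, `SYSTEM_INSTRUCTIONS`, and the `entitled_users` shape are all hypothetical names chosen for the example.

```python
# Versioned like product configuration, reviewed before release.
INSTRUCTION_VERSION = "support-triage/v3"

SYSTEM_INSTRUCTIONS = (
    "You are a support-triage assistant. "
    "Answer only from the provided ticket text. "
    "If the ticket is out of scope, reply with REFUSE."
)

def build_request(ticket_text: str, user_id: str, app_state: dict) -> dict:
    """Assemble one model request; authority stays in app_state, never in the prompt."""
    # Application state decides access -- the model is never the source of truth.
    if not app_state.get("entitled_users", {}).get(user_id, False):
        raise PermissionError("user not entitled to triage")
    return {
        "instruction_version": INSTRUCTION_VERSION,  # logged for observability
        "system": SYSTEM_INSTRUCTIONS,               # stable policy layer
        "input": ticket_text.strip()[:8000],         # normalized runtime layer
    }
```

The key property: billing, identity, and entitlement never enter the prompt; they gate the request in application code before the model is called.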

Files, retrieval, RAG, and grounding

Put source material in the source layer, not in behavior instructions

Files, RAG, search, grounding, and citations are source-material surfaces. They are not the same as system instructions.

| Need | OpenAI | Anthropic | Gemini / Vertex | Internal system |
| --- | --- | --- | --- | --- |
| Reusable document search | File Search / vector stores. | Files API, document inputs, citations, or app-owned retrieval. | Gemini File Search; Vertex AI Search / RAG Engine. | Vector DB, search index, document store, retrieval service. |
| One-off file analysis | Runtime file input or file tool where supported. | Files API or message document content. | Files API or request content. | Temporary object storage + scoped retrieval. |
| Grounding / source traceability | Retrieved file IDs, search results, citations where supported, and output verification. | Citations for supported source blocks; note incompatibilities with strict structured-output formats. | Grounding with Google Search, URL Context, Vertex grounding, File Search. | Source IDs, passage IDs, retrieval logs, citation validator. |
| RAG governance | Vector store permissions, source IDs, retrieval review. | Document provenance and tool-result validation. | RAG corpus / grounding source controls. | Access control, ranking policy, freshness policy, redaction, audit logs. |

Placement rules for retrieval

  • File upload is not the same as RAG. Use RAG when source material must be searched, indexed, reused, or cited.
  • RAG is not a system instruction. Retrieved text is source material and can contain untrusted content.
  • Grounding reduces unsupported output risk but does not replace source review or validation.
  • Citations help trace source use, but a citation is not proof that the claim is correct.
  • Never store secrets, credentials, private tokens, or regulated data in vector stores unless the storage, access, retention, and audit model is approved.
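
Two of the rules above can be made concrete: retrieved text is quoted source material with provenance, and a citation only counts if the cited ID was actually retrieved. The `<source>` wrapper and function names below are illustrative assumptions, not a provider convention.

```python
def render_sources(chunks: list[dict]) -> str:
    """Retrieved text goes into the source layer with provenance, never into instructions."""
    blocks = []
    for chunk in chunks:
        # Quoted, untrusted material: any instructions embedded in it carry no authority.
        blocks.append(f"<source id={chunk['source_id']!r}>\n{chunk['text']}\n</source>")
    return "\n\n".join(blocks)

def phantom_citations(cited: list[str], chunks: list[dict]) -> set[str]:
    """Return cited IDs that were never retrieved -- a non-empty set means
    the output cites a source the workflow cannot trace."""
    retrieved = {chunk["source_id"] for chunk in chunks}
    return set(cited) - retrieved
```

Even when `phantom_citations` is empty, the citation traces source use only; the claim itself still needs review or validation.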

Tools and function calling

The model proposes tool use; the application controls execution

Tool calling is an interface between model reasoning and application-owned actions. Do not treat tool calls as automatically safe or authorized.

Required tool execution loop

  1. The model selects or proposes a tool call.
  2. The application validates tool name, arguments, user permissions, rate limits, and expected side effects.
  3. The application asks for human approval when the action is sensitive, irreversible, external, or user-visible.
  4. The application executes the tool or rejects the request.
  5. The application validates and sanitizes the tool result.
  6. The model receives the approved tool result and continues the workflow.
  7. The application logs the decision, tool arguments, tool result, approval state, and final action.

| Tool type | Provider examples | Use for | Control requirement |
| --- | --- | --- | --- |
| Function calling | OpenAI function tools; Gemini function calling; Anthropic client tools. | Application functions, service adapters, DB lookup, ticket lookup, CRM actions, calculations. | Validate schema, arguments, permissions, and side effects before execution. |
| Built-in tools | Search, file search, code execution, URL context, Google Search grounding, Maps grounding, server tools. | Search, retrieval, code execution, document lookup, browser/data context, controlled tool augmentation. | Verify result provenance and do not trust external content as instructions. |
| MCP / external tool servers | OpenAI remote MCP; Anthropic MCP / remote MCP; app-owned MCP servers. | External systems such as docs, issue trackers, design tools, monitoring, internal APIs. | Treat MCP servers as trust boundaries. Restrict tools to minimum required scope. |
| Computer / browser actions | OpenAI Computer Use; Anthropic computer use; Gemini Computer Use. | Browser UI, form workflows, screenshots, web actions, virtual computer tasks. | Use sandboxing, approval gates, untrusted-content handling, and audit logs. |

Agents, orchestration, and approvals

Do not call a single prompt an agentic workflow

An agentic workflow combines model calls, tools, state, routing, approvals, validation, and observability.

| Agent layer | What belongs here | What must stay outside the model |
| --- | --- | --- |
| Routing | Task classification, specialist selection, handoffs, workflow branches. | Authorization, billing logic, production state, and irreversible decisions. |
| Planning | Proposed task plan, decomposition, tool sequence, uncertainty flags. | Execution approval for sensitive actions. |
| Handoffs | Passing control between specialized agents or workflow stages. | Audit trail and permission boundaries. |
| Human approval | Approval request, explanation, proposed action, expected side effects. | Approval state must be stored in application/workflow state, not inferred from text alone. |
| Guardrails | Input checks, output checks, tool checks, policy checks, risk classification. | Do not rely only on model self-review for high-risk actions. |
| Tracing | Model calls, tool calls, handoffs, guardrail decisions, custom spans, final output. | Do not hide trace-critical data in unstructured chat text only. |
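
The rule that approval state lives in application storage, not in model text, can be sketched as a small queue. `ApprovalQueue` and its in-memory dict are stand-ins for a real workflow engine or database table.

```python
import uuid

class ApprovalQueue:
    """Approval state lives in application storage, never inferred from chat text."""

    def __init__(self):
        self._pending = {}  # approval_id -> record; a real system persists this

    def request(self, action: str, side_effects: str) -> str:
        """Record a proposed sensitive action and return its approval ID."""
        approval_id = str(uuid.uuid4())
        self._pending[approval_id] = {"action": action,
                                      "side_effects": side_effects,
                                      "approved": False}
        return approval_id

    def approve(self, approval_id: str, reviewer: str) -> None:
        """Only an explicit reviewer decision flips the flag; the audit trail keeps who."""
        record = self._pending[approval_id]
        record["approved"] = True
        record["reviewer"] = reviewer

    def is_approved(self, approval_id: str) -> bool:
        return self._pending.get(approval_id, {}).get("approved", False)
```

The agent loop checks `is_approved` before executing; a model message claiming "the user approved this" changes nothing in the queue.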

Structured outputs and evals

Production output needs contracts, checks, and regression tests

A valid-looking answer is not enough for product, security, compliance, research, code, or user-visible workflows.

| Validation need | Correct layer | Use for | Do not use as |
| --- | --- | --- | --- |
| Structured output | Schema / response format configuration. | Extraction, automation, UI rendering, downstream processing, typed contracts. | A truth guarantee or source-alignment guarantee. |
| Tool argument validation | Application-side validator before tool execution. | Prevent malformed, unauthorized, unsafe, or unexpected tool calls. | A prompt-only policy. |
| Source alignment | Retrieval/citation checker or post-generation validator. | Claims, citations, factual outputs, policy references, research summaries. | Informal model self-confirmation. |
| Agent evals | Eval datasets, graders, trace grading, regression runs, release gates. | Detect prompt, model, tool, routing, and workflow regressions. | One-off manual testing only. |
| Business-rule validation | Application service layer. | Billing, account status, permissions, eligibility, legal/compliance policy. | Model-generated text. |

Structured output rule

  • Use structured output when the application must parse the response.
  • Use citations or source IDs when evidence traceability matters.
  • Do not assume every provider supports strict schemas and citations in the same response shape.
  • Validate schema success, missing fields, invalid values, and unsafe actions before continuing the workflow.
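
A minimal application-side check for the last rule, assuming a hypothetical invoice-extraction contract (`REQUIRED` and its fields are invented for the example). Provider-enforced schemas reduce malformed output but do not remove the need for this layer.

```python
import json

# Hypothetical contract for an invoice-extraction workflow.
REQUIRED = {"invoice_id": str, "total": (int, float), "currency": str}

def parse_invoice(raw: str) -> dict:
    """Check schema success, missing fields, and invalid values before continuing."""
    data = json.loads(raw)                    # schema failure surfaces here
    for field, types in REQUIRED.items():
        if field not in data:
            raise ValueError(f"missing field {field!r}")
        if not isinstance(data[field], types):
            raise ValueError(f"invalid value for {field!r}")
    if data["total"] < 0:
        raise ValueError("invalid total")     # business rule stays application-side
    return data
```

Note that a payload can pass every check here and still be factually wrong; source alignment is a separate validation layer.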

Scale, caching, realtime, and embeddings

Use the right API surface for scale and latency

Interactive workflows, batch jobs, cached prompts, realtime sessions, and semantic search need different architecture.

| Need | Use | Do not confuse with |
| --- | --- | --- |
| Interactive response | Normal API request, streaming, or agent run. | Batch jobs for non-urgent processing. |
| Large-scale non-urgent work | Batch APIs for offline processing, evaluations, dataset work, or document processing. | Realtime or interactive UX. |
| Repeated stable context | Prompt caching / context caching where supported. | RAG, source governance, or conversation memory. |
| Live audio / realtime UI | Realtime / Live API surfaces and session lifecycle controls. | Standard text-completion request. |
| Semantic search | Embeddings + vector DB / search service. | Prompt instructions or fine-tuning. |
| Model behavior optimization | Fine-tuning only after instructions, retrieval, tools, structured outputs, and evals are tested. | Workflow placement or missing validation. |
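
The semantic-search row reduces to ranking stored embeddings by similarity. The in-memory `index` dict below is a stand-in for a real vector DB or search service; the embeddings themselves would come from an embeddings API.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec: list[float], index: dict, k: int = 3) -> list[str]:
    """Rank stored vectors by similarity; `index` maps doc_id -> embedding."""
    ranked = sorted(index, key=lambda doc_id: cosine(query_vec, index[doc_id]),
                    reverse=True)
    return ranked[:k]
```

This is retrieval infrastructure, not prompting: no instruction change or fine-tune substitutes for having the right documents ranked and retrieved.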

Security, privacy, and governance

Keep authority, secrets, and irreversible actions outside prompts

Production AI systems must treat prompts, retrieved content, tool results, web pages, and user files as untrusted until validated.

Never put secrets in these surfaces

  • Prompts, system instructions, developer instructions, or model messages.
  • Uploaded files, vector stores, retrieval indexes, or RAG corpora unless explicitly approved for that data class.
  • Tool descriptions, function schemas, enum values, property names, regex patterns, or structured-output schemas.
  • Logs, traces, eval datasets, screenshots, browser sessions, or generated artifacts.
  • Memory, summaries, conversation state, or hidden prompt templates.

Governance checklist

  • Use application-side authorization for identity, access, billing, role, and permission decisions.
  • Use approval gates before external writes, production changes, account changes, payments, messages, or deletion.
  • Validate tool arguments before execution and tool results before using them as evidence.
  • Treat website content, retrieved text, uploaded files, and tool results as untrusted input.
  • Log model version, instruction version, retrieved source IDs, tool calls, approvals, guardrails, eval results, and final actions.
  • Use sandboxed environments for computer/browser tools and code execution.
  • Apply data retention, redaction, access-control, and audit requirements before sending data to any model provider.
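
The logging and redaction items in the checklist combine naturally: redact before anything reaches a trace. The `SECRET_KEYS` set and the `trace_record` fields are illustrative; a production system would follow its own trace schema and retention policy.

```python
import json

# Illustrative key names; a real deployment maintains this list deliberately.
SECRET_KEYS = {"api_key", "token", "password", "authorization"}

def redact(obj):
    """Recursively drop secret-looking keys before anything reaches a log or trace."""
    if isinstance(obj, dict):
        return {k: redact(v) for k, v in obj.items() if k.lower() not in SECRET_KEYS}
    if isinstance(obj, list):
        return [redact(v) for v in obj]
    return obj

def trace_record(model_version, instruction_version, source_ids,
                 tool_calls, final_action) -> str:
    """One structured, redacted trace entry per workflow step."""
    entry = {
        "model_version": model_version,
        "instruction_version": instruction_version,
        "retrieved_source_ids": source_ids,
        "tool_calls": redact(tool_calls),
        "final_action": final_action,
    }
    return json.dumps(entry, sort_keys=True)
```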

Provider placement matrix

Layer → provider surface mapping

Use this matrix after classifying the workflow layer. It maps architecture concepts to provider-specific API surfaces.

| Layer | OpenAI | Anthropic | Gemini | Vertex / internal |
| --- | --- | --- | --- | --- |
| Instruction | Responses instructions / Agent instructions. | System prompt / Messages API configuration. | system_instruction / API config. | Vertex system instructions / policy config. |
| Runtime | Responses input. | Messages API user content. | GenerateContent / Interactions input. | Request payload. |
| State | Previous response / agent session strategy. | Application-managed conversation state. | Application/session-managed state. | DB/session/workflow engine. |
| Files | File Search / vector stores / files. | Files API / document inputs. | Files API. | GCS / object storage / document store. |
| Retrieval | File Search. | Citations, document inputs, custom retrieval. | File Search / URL Context. | Vertex RAG Engine / Vertex AI Search. |
| Tools | Built-in tools, function calling, remote MCP. | Client tools, server tools. | Function calling, built-in tools. | Service adapters, internal APIs, MCP/tools. |
| Computer/browser | Computer Use. | Computer Use. | Computer Use. | Sandbox, VM, browser harness. |
| Orchestration | Agents SDK, handoffs, guardrails, tracing. | Application-managed agent loop. | Application-managed workflow / Interactions where appropriate. | Workflow engine / agent graph. |
| Approval | Guardrails and human review pattern. | Application approval layer. | Application approval layer. | Policy engine / human review queue. |
| Structured output | Structured Outputs. | Structured outputs; verify compatibility with citations. | Structured output. | Schema validation. |
| Evals | Evals / trace grading. | Console evaluation / app evals. | Application evals. | Vertex Gen AI Evaluation / internal evals. |
| Observability | Tracing / logs. | Application logs / tool traces. | Application logs. | Cloud logs / audit logs / traces. |
| Caching | Prompt caching. | Prompt caching. | Explicit context caching. | Cache layer. |
| Batch | Batch API. | Message Batches API. | Batch API. | Async jobs / queues. |
| Embeddings | Embeddings. | Embeddings. | Embeddings. | Vector DB / semantic search. |
| Security | Guardrails, approvals, sandboxing, provider controls. | Tool boundaries, data retention controls, app-side governance. | Safety filters, system instructions, app-side controls. | IAM, VPC-SC, CMEK, logging, audit, DLP, policy engine. |

Misplacement guardrails

What not to put only in API prompts

Prompt-only control is not enough for production AI systems.

  • Do not put secrets, API keys, tokens, passwords, or privileged credentials in prompts, instructions, files, schemas, logs, or eval datasets.
  • Do not let the model be the authority for permissions, identity, billing, subscription state, production state, or irreversible actions.
  • Do not rely on instructions alone to enforce security boundaries. Use application-side authorization, validation, logging, and approval controls.
  • Do not treat retrieved content, website content, uploaded files, or tool results as trusted instructions.
  • Do not execute tool calls without validating arguments, permissions, rate limits, expected side effects, and user approval requirements.
  • Do not confuse structured output with factual correctness or source alignment.
  • Do not confuse caching with memory, retrieval, state, or source governance.
  • Do not fine-tune before testing whether better instructions, retrieval, tools, structured outputs, or evals solve the problem.
  • Do not use batch APIs for interactive user workflows that require immediate feedback.
  • Do not use computer/browser tools without sandboxing, high-impact action approvals, and audit logging.
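
The first rule above is worth enforcing mechanically. A minimal fail-closed guard, with the caveat that these regex patterns are illustrative assumptions only; a real deployment needs a maintained secret scanner, not this list.

```python
import re

# Illustrative shapes only -- not a complete or authoritative secret inventory.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                # provider-style key shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"), # PEM private key header
    re.compile(r"(?i)password\s*[:=]\s*\S+"),          # inline credential
]

def assert_no_secrets(payload: str) -> str:
    """Fail closed before a prompt, file, schema, or eval record leaves the application."""
    for pattern in SECRET_PATTERNS:
        if pattern.search(payload):
            raise ValueError("payload appears to contain a secret; refusing to send")
    return payload
```

Run the same check on anything model-bound: prompts, tool schemas, uploaded files, and eval datasets, not just user messages.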

Official source check

Official API references used for this mapping

Use these references to verify terminology and feature boundaries before updating this page again.