Workflows vs. agents: a pragmatic decision framework
A reliability-first approach to choosing between predictable workflows and flexible agents, with four reusable patterns and concrete selection heuristics.
The Claude API course takes a notably practical, anti-hype stance on agents. The core message: your primary goal is to solve problems reliably. Users do not care that you built a fancy agent.
The core heuristic
Workflows are predictable, testable, and debuggable — every step follows a known path. Agents combine tools flexibly and decide their own next action at each step. Use workflows when you can articulate the ideal sequence; reach for an agent only when the path genuinely depends on what the model discovers at runtime.
Four reusable workflow patterns
The course presents four patterns extracted from real production systems.
Parallelization
Split a complex multi-criteria decision into independent evaluations, run them simultaneously, then aggregate.
Example: A material designer application where users upload images of parts and receive material recommendations. Instead of cramming criteria for metal, polymer, ceramic, composite, elastomer, and wood into one massive prompt, send six parallel requests — each with specialized criteria for one material — then feed all results into a final comparison step.
Benefits: Claude can concentrate on one evaluation at a time rather than juggling competing considerations. Individual prompts can be optimized independently. Adding new materials means adding a new parallel request without touching existing prompts.
When to use: Complex decisions factorable into independent sub-evaluations, where each sub-evaluation benefits from specialized criteria or tools.
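A minimal sketch of the fan-out/fan-in shape this describes, using the Python SDK and a thread pool. The model ID and prompts are placeholders, and the part is described as text here for brevity (the course's app works from uploaded images):

```python
import anthropic
from concurrent.futures import ThreadPoolExecutor

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # placeholder; use whichever model you deploy

MATERIALS = ["metal", "polymer", "ceramic", "composite", "elastomer", "wood"]

def evaluate_material(material: str, part_description: str) -> str:
    """One focused evaluation: a single material, a single specialized prompt."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=f"You are a materials engineer. Evaluate only {material} for this part.",
        messages=[{"role": "user", "content": part_description}],
    )
    return f"--- {material} ---\n{response.content[0].text}"

def recommend_material(part_description: str) -> str:
    # Fan out: six independent evaluations run concurrently.
    with ThreadPoolExecutor(max_workers=len(MATERIALS)) as pool:
        reports = list(pool.map(lambda m: evaluate_material(m, part_description), MATERIALS))

    # Fan in: one final call compares the specialized reports.
    final = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system="Compare the material evaluations and recommend one, with reasoning.",
        messages=[{"role": "user", "content": "\n\n".join(reports)}],
    )
    return final.content[0].text
```

Adding a seventh material is a one-line change to the `MATERIALS` list; none of the existing prompts have to be touched.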
Chaining
Break a task into sequential steps where each builds on the previous output.
The chaining trick for constraint-heavy prompts: When a prompt has many constraints (length limits, tone requirements, format specifications, content rules), use two steps. First call generates a draft. Second call revises the draft against the constraints. Claude can focus on one type of task at a time — generation and revision use different mental models — and the results are consistently better than a single call trying to satisfy everything at once.
When to use: Tasks with many constraints, multi-stage transformations (extract then format then validate), or any process where intermediate outputs enable quality checks before proceeding.
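A sketch of the two-call chain under the same assumptions (placeholder model ID, illustrative constraint list):

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # placeholder

CONSTRAINTS = """- Under 120 words
- Friendly but formal tone
- Ends with a call to action
- No exclamation marks"""

def draft_then_revise(task: str) -> str:
    # Call 1: generation. Claude focuses on producing good content.
    draft = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=[{"role": "user", "content": task}],
    ).content[0].text

    # Call 2: revision. Claude focuses only on satisfying the constraints.
    revised = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Revise this draft so it satisfies every constraint.\n\n"
                       f"Constraints:\n{CONSTRAINTS}\n\nDraft:\n{draft}",
        }],
    )
    return revised.content[0].text
```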
Routing
Classify the input, then dispatch to a specialized handler.
Example: A customer support system that categorizes incoming questions (billing, technical, account, general) and routes each to a prompt optimized for that category. Each handler can have domain-specific examples, tools, and tone guidance.
When to use: Heterogeneous input where the ideal response strategy depends on the input category.
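A sketch of classify-then-dispatch; the categories match the example above, while the handler prompts and model ID are placeholders:

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # placeholder

# Each handler is a specialized system prompt; real handlers would also carry
# category-specific examples, tools, and tone guidance.
HANDLERS = {
    "billing":   "You are a billing specialist. Be precise about amounts and dates.",
    "technical": "You are a support engineer. Ask for reproduction steps when unclear.",
    "account":   "You handle account access. Never reveal credentials or personal data.",
    "general":   "You are a friendly generalist for everything else.",
}

def route(question: str) -> str:
    # Step 1: classify the incoming question into one of the known categories.
    category = client.messages.create(
        model=MODEL,
        max_tokens=10,
        system="Classify the question as exactly one of: billing, technical, account, general. "
               "Reply with the single word only.",
        messages=[{"role": "user", "content": question}],
    ).content[0].text.strip().lower()

    # Step 2: dispatch to the specialized handler (fall back to general).
    answer = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=HANDLERS.get(category, HANDLERS["general"]),
        messages=[{"role": "user", "content": question}],
    )
    return answer.content[0].text
```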
Evaluator-optimizer
A feedback loop: generate, evaluate against a rubric, identify gaps, improve, repeat.
Example: A writing assistant that produces a draft, evaluates it against criteria (clarity, completeness, tone, accuracy), and automatically revises until the evaluation scores pass a threshold.
The evaluator step should use a separate Claude call with a separate prompt — the model grading quality should not be the same model instance that produced the content.
When to use: Quality-critical outputs where “good enough” is definable and automated evaluation is feasible.
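A sketch of the loop with a separate evaluator call and a naive numeric-score parse; the rubric, threshold, and model ID are placeholders:

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # placeholder

RUBRIC = "clarity, completeness, tone, accuracy"

def generate_with_feedback(task: str, threshold: int = 80, max_rounds: int = 3) -> str:
    draft = client.messages.create(
        model=MODEL, max_tokens=1024,
        messages=[{"role": "user", "content": task}],
    ).content[0].text

    for _ in range(max_rounds):
        # Evaluator: a separate call with its own prompt; the grader is not the writer.
        evaluation = client.messages.create(
            model=MODEL, max_tokens=512,
            system=f"Score the text 0-100 against: {RUBRIC}. "
                   "First line: the score as an integer. Then list concrete gaps.",
            messages=[{"role": "user", "content": draft}],
        ).content[0].text
        score = int(evaluation.splitlines()[0].strip())
        if score >= threshold:
            break

        # Optimizer: revise the draft using the evaluator's feedback.
        draft = client.messages.create(
            model=MODEL, max_tokens=1024,
            messages=[{
                "role": "user",
                "content": f"Task: {task}\n\nDraft:\n{draft}\n\n"
                           f"Reviewer feedback:\n{evaluation}\n\nProduce an improved draft.",
            }],
        ).content[0].text

    return draft
```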
Tool choice: a spectrum between workflow and agent
The `tool_choice` parameter from the Claude API provides a concrete mechanism that spans the workflow-to-agent spectrum:
| `tool_choice` | Behavior | Where it sits |
|---|---|---|
| `{"type": "tool", "name": "..."}` | Claude MUST call the named tool | Pure workflow — deterministic step |
| `{"type": "any"}` | Claude MUST call SOME tool | Guided agent — tool use required but choice is free |
| `{"type": "auto"}` | Claude decides whether to call a tool | Full agent — model determines its own path |
Forcing `tool_choice: "tool"` guarantees a specific code path, useful for structured extraction where the tool's `input_schema` defines your output shape. `"auto"` with a system prompt that says “only use tools when you need information you weren’t trained on” creates a lightweight agent that decides when external data is necessary. `"any"` is useful for SMS bots and other interfaces where every response must route through a tool.
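The three settings expressed as SDK calls; the `get_weather` tool and model ID here are stand-ins rather than the course's example:

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # placeholder

tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

common = dict(model=MODEL, max_tokens=1024, tools=tools,
              messages=[{"role": "user", "content": "Plan my weekend in Lisbon."}])

# Pure workflow: Claude MUST call get_weather, so the code path is deterministic.
forced = client.messages.create(tool_choice={"type": "tool", "name": "get_weather"}, **common)

# Guided agent: some tool will be called, but Claude picks which one.
any_tool = client.messages.create(tool_choice={"type": "any"}, **common)

# Full agent: Claude decides whether a tool is needed at all.
auto = client.messages.create(tool_choice={"type": "auto"}, **common)
```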
The complete tool use workflow
The four-step tool use pattern from the API course instantiates a guided workflow:
- Client provides Claude with tools + user prompt
- Claude responds with `stop_reason: "tool_use"`
- Client executes the tool, returns `tool_result` with matching `tool_use_id`
- Claude uses the result to formulate the final answer

This loops until `stop_reason` is `"end_turn"`. A while loop wrapping the API call handles any number of tool invocations — the model decides when it has enough information to answer.
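A sketch of that loop; `execute_tool` is a hypothetical dispatcher you supply, mapping a tool name and input to a result string:

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # placeholder

def run_with_tools(user_prompt: str, tools: list, execute_tool) -> str:
    """Loop until Claude stops asking for tools; execute_tool(name, input) -> str."""
    messages = [{"role": "user", "content": user_prompt}]

    while True:
        response = client.messages.create(
            model=MODEL, max_tokens=1024, tools=tools, messages=messages,
        )
        if response.stop_reason != "tool_use":
            break  # "end_turn": Claude has enough information to answer

        # Echo the assistant turn, then return one tool_result per tool_use block,
        # matched by tool_use_id.
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": execute_tool(block.name, block.input),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})

    # Concatenate the final text blocks into the answer.
    return "".join(b.text for b in response.content if b.type == "text")
```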
When agents are the right answer
Agents become necessary when the model needs to decide for itself what to do next. The course’s guiding principle: provide abstract, combinable tools rather than hyper-specialized ones.
Good agent tools: `bash`, `read_file`, `write_file`, `web_search`, `run_query` — general capabilities that chain together.
Poor agent tools: `refactor_code`, `fix_bug`, `optimize_performance` — hyper-specialized tools that constrain flexibility and prevent creative problem-solving.
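What the abstract end of that spectrum looks like as tool definitions; the descriptions and schemas below are illustrative, not the course's exact ones:

```python
# General capabilities that chain together: the agent composes them into
# refactoring, bug fixing, or optimization on its own.
AGENT_TOOLS = [
    {
        "name": "bash",
        "description": "Run a shell command and return stdout, stderr, and exit code.",
        "input_schema": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
    {
        "name": "read_file",
        "description": "Return the contents of the file at the given path.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Write content to a file, creating it if needed.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}, "content": {"type": "string"}},
            "required": ["path", "content"],
        },
    },
]
```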
Environment inspection
The course emphasizes an overlooked design principle: Claude operates blindly. Always provide tools that let it observe results of its actions:
- Read a file before editing it
- Take a screenshot after a UI interaction
- Check an API response before proceeding
- List a directory before assuming file paths
Without observation capability, agents make decisions on stale or incorrect assumptions. Every action tool should have a corresponding observation tool.
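One way the pairing can show up in a tool list, assuming a hypothetical browser-automation setting (the tool names are illustrative):

```python
# Every action tool gets an observation counterpart, so the agent can verify
# what its last action actually did before deciding the next one.
UI_TOOLS = [
    {
        "name": "click",  # action
        "description": "Click the element matching the CSS selector.",
        "input_schema": {
            "type": "object",
            "properties": {"selector": {"type": "string"}},
            "required": ["selector"],
        },
    },
    {
        "name": "screenshot",  # observation
        "description": "Capture the current page so the result of the last action is visible.",
        "input_schema": {"type": "object", "properties": {}},
    },
]
```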
Parallel agent orchestration: the code-review pipeline
A concrete production pattern that spans the workflow-agent spectrum: the multi-agent code review pipeline. Instead of one reviewer checking everything, three specialized agents run in parallel — each with a different architectural lens — then a fourth agent aggregates their findings.
The pipeline has eight stages:
1. Detect changes — `git diff` against the base branch to identify what changed
2. Route to specialists — Classify changes (security-sensitive paths, performance-critical code, UI components) and dispatch to the right agents
3. Security review (parallel) — One agent checks for OWASP top 10, secret exposure, input validation, and auth bypass risks
4. Performance review (parallel) — A second agent checks for N+1 queries, unbounded allocations, missing caching, and algorithmic complexity regressions
5. Pattern review (parallel) — A third agent checks for project conventions, naming consistency, error handling patterns, and architectural alignment
6. Aggregate — A fourth agent reads all three reports, deduplicates findings, resolves conflicts (two reviewers disagreeing on severity), and produces a unified report
7. Confidence-score — Every finding gets a 0/25/50/75/100 score (the same scale used for subagent reviews), forcing the aggregator to calibrate severity rather than listing everything at equal weight
8. Gate — If any finding scores 75+, the pipeline blocks merge with a specific explanation of what must change
The pipeline is a hybrid: stages 1-2 and 6-8 are workflows (the sequence is fixed and predictable), while stages 3-5 are agent-like (each reviewer decides for itself which files to deep-read, which patterns to flag, and how to substantiate its findings). This is the pragmatic reality: production systems rarely fit cleanly into “workflow” or “agent” — they combine both, applying each where it fits.
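A compressed skeleton of that hybrid; the lens prompts and model ID are placeholders, the routing stage is elided, and each reviewer is reduced to a single call where a production version would be a full agent with file-reading tools:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # placeholder

# Stages 3-5: the same review step run through three different architectural lenses.
LENSES = {
    "security":    "Review for OWASP top 10, secret exposure, input validation, auth bypass.",
    "performance": "Review for N+1 queries, unbounded allocations, missing caching, complexity regressions.",
    "patterns":    "Review for project conventions, naming, error handling, architectural alignment.",
}

def specialist_review(lens: str, diff: str) -> str:
    return client.messages.create(
        model=MODEL, max_tokens=2048,
        system=LENSES[lens],
        messages=[{"role": "user", "content": diff}],
    ).content[0].text

def review_pipeline(base_branch: str = "main") -> str:
    # Stage 1 (workflow): detect changes.
    diff = subprocess.run(
        ["git", "diff", base_branch], capture_output=True, text=True, check=True
    ).stdout

    # Stages 3-5 (agent-like): run the three reviewers in parallel.
    with ThreadPoolExecutor(max_workers=3) as pool:
        reports = list(pool.map(lambda lens: specialist_review(lens, diff), LENSES))

    # Stages 6-8 (workflow): aggregate, confidence-score, and gate.
    return client.messages.create(
        model=MODEL, max_tokens=2048,
        system="Merge these review reports: deduplicate, resolve severity conflicts, score each "
               "finding 0/25/50/75/100, and state whether any 75+ finding should block the merge.",
        messages=[{"role": "user", "content": "\n\n".join(reports)}],
    ).content[0].text
```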
| Factor | Prefer workflow | Prefer agent |
|---|---|---|
| Predictability | Steps are known | Path depends on discoveries |
| Reliability needs | Must not deviate | Flexibility is more valuable |
| Debugging | Need clear failure points | Can tolerate exploration |
| Cost | Fixed number of calls | Variable number of calls |
| Latency | Parallelizable or fixed | Unbounded loop |
Takeaways
Known steps favor workflows
Use workflows when the path is predictable and agents only when runtime discoveries genuinely determine the next action.
Four workflow patterns cover most cases
Parallelization, chaining, routing, and evaluator-optimizer loops provide reusable structure without requiring open-ended agency.
Constraint-heavy prompts chain well
A generation call followed by a revision call lets the model focus on one mental task at a time.
Agent tools should be abstract
General tools like bash, read, write, and search compose better than narrow tools like `fix_bug` or `optimize_performance`.
Observation tools are required
Agents need ways to inspect files, screenshots, API responses, and directories so decisions are based on current state.