
Workflows vs. agents: a pragmatic decision framework

A reliability-first approach to choosing between predictable workflows and flexible agents, with four reusable patterns and concrete selection heuristics.

Tags: agents · architecture · claude · llm · patterns · workflows

The Claude API course takes a notably practical, anti-hype stance on agents. The core message: your primary goal is to solve problems reliably. Users do not care that you built a fancy agent.

The core heuristic

Workflows are predictable, testable, and debuggable — every step follows a known path. Agents combine tools flexibly and determine their own next action. Use workflows when you can articulate the ideal sequence; reach for an agent only when the path genuinely depends on what the model discovers at runtime.

Four reusable workflow patterns

The course presents four patterns extracted from real production systems.

Parallelization

Split a complex multi-criteria decision into independent evaluations, run them simultaneously, then aggregate.


Example: A material designer application where users upload images of parts and receive material recommendations. Instead of cramming criteria for metal, polymer, ceramic, composite, elastomer, and wood into one massive prompt, send six parallel requests — each with specialized criteria for one material — then feed all results into a final comparison step.
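
A minimal sketch of the pattern, assuming the async Anthropic client; the material criteria are illustrative, and a text description of the part stands in for the uploaded image:

```python
import asyncio

import anthropic

client = anthropic.AsyncAnthropic()
MODEL = "claude-sonnet-4-5"  # placeholder; substitute any current Claude model ID

# Hypothetical per-material criteria: each prompt stays small and focused.
MATERIAL_CRITERIA = {
    "metal": "Evaluate metals only: strength, corrosion resistance, machining cost.",
    "polymer": "Evaluate polymers only: flexibility, UV resistance, molding cost.",
    "ceramic": "Evaluate ceramics only: hardness, brittleness, thermal limits.",
}

async def evaluate(material: str, criteria: str, part: str) -> str:
    response = await client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=criteria,
        messages=[{"role": "user", "content": part}],
    )
    return f"## {material}\n{response.content[0].text}"

async def recommend(part: str) -> str:
    # Run one specialized evaluation per material at the same time...
    evaluations = await asyncio.gather(
        *(evaluate(m, c, part) for m, c in MATERIAL_CRITERIA.items())
    )
    # ...then aggregate everything in a final comparison step.
    final = await client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": "Compare these evaluations and recommend one material:\n\n"
            + "\n\n".join(evaluations),
        }],
    )
    return final.content[0].text

# asyncio.run(recommend("Outdoor bracket holding a 5 kg load, exposed to salt spray"))
```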

Benefits: Claude can concentrate on one evaluation at a time rather than juggling competing considerations. Individual prompts can be optimized independently. Adding new materials means adding a new parallel request without touching existing prompts.

When to use: Complex decisions factorable into independent sub-evaluations, where each sub-evaluation benefits from specialized criteria or tools.

Chaining

Break a task into sequential steps where each builds on the previous output.


The chaining trick for constraint-heavy prompts: When a prompt has many constraints (length limits, tone requirements, format specifications, content rules), use two steps. First call generates a draft. Second call revises the draft against the constraints. Claude can focus on one type of task at a time — generation and revision use different mental models — and the results are consistently better than a single call trying to satisfy everything at once.
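
A minimal sketch of the two-call version; the constraint list is illustrative:

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"  # placeholder; substitute any current Claude model ID

# Illustrative constraint list: the kind of thing that crowds a single prompt.
CONSTRAINTS = """\
- Under 120 words
- Friendly but formal tone
- Ends with exactly one call to action
- Never mentions pricing"""

def generate_then_revise(brief: str) -> str:
    # Call 1: draft freely, without juggling every rule at once.
    draft = client.messages.create(
        model=MODEL,
        max_tokens=512,
        messages=[{"role": "user", "content": f"Write an announcement email about: {brief}"}],
    ).content[0].text

    # Call 2: revise the draft strictly against the constraint list.
    revised = client.messages.create(
        model=MODEL,
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": (
                "Revise this draft so it satisfies every constraint.\n\n"
                f"Constraints:\n{CONSTRAINTS}\n\nDraft:\n{draft}"
            ),
        }],
    )
    return revised.content[0].text
```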

When to use: Tasks with many constraints, multi-stage transformations (extract then format then validate), or any process where intermediate outputs enable quality checks before proceeding.

Routing

Classify the input, then dispatch to a specialized handler.


Example: A customer support system that categorizes incoming questions (billing, technical, account, general) and routes each to a prompt optimized for that category. Each handler can have domain-specific examples, tools, and tone guidance.
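
A sketch of the routing step; the category prompts are illustrative, and the classifier is itself a small, tightly constrained Claude call:

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"  # placeholder; substitute any current Claude model ID

# Hypothetical specialized handlers: one system prompt per category.
HANDLERS = {
    "billing": "You are a billing specialist. Reference invoice fields precisely.",
    "technical": "You are a support engineer. Ask for logs before guessing.",
    "account": "You handle account access. Never reveal personal data.",
    "general": "You answer general product questions in a friendly tone.",
}

def answer_ticket(question: str) -> str:
    # Step 1: classify the question into exactly one known category.
    label = client.messages.create(
        model=MODEL,
        max_tokens=10,
        system="Reply with exactly one word: billing, technical, account, or general.",
        messages=[{"role": "user", "content": question}],
    ).content[0].text.strip().lower()
    category = label if label in HANDLERS else "general"

    # Step 2: dispatch to the handler prompt optimized for that category.
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=HANDLERS[category],
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text
```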

When to use: Heterogeneous input where the ideal response strategy depends on the input category.

Evaluator-optimizer

A feedback loop: generate, evaluate against a rubric, identify gaps, improve, repeat.


Example: A writing assistant that produces a draft, evaluates it against criteria (clarity, completeness, tone, accuracy), and automatically revises until the evaluation scores pass a threshold.

The evaluator step should use a separate Claude call with a separate prompt — the model grading quality should not be the same model instance that produced the content.
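
A sketch of the loop with exactly that separation, where the evaluator is its own call and prompt; the rubric, threshold, and round limit are illustrative:

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"  # placeholder; substitute any current Claude model ID

RUBRIC = (
    "Score this draft 0-100 for clarity, completeness, tone, and accuracy. "
    "Reply with the overall number only."
)

def evaluate(draft: str) -> int:
    # Separate call, separate prompt: the grader is not the instance that wrote the draft.
    reply = client.messages.create(
        model=MODEL,
        max_tokens=10,
        system=RUBRIC,
        messages=[{"role": "user", "content": draft}],
    ).content[0].text
    digits = "".join(ch for ch in reply if ch.isdigit())
    return int(digits) if digits else 0

def write_until_good(task: str, threshold: int = 80, max_rounds: int = 3) -> str:
    draft = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=[{"role": "user", "content": task}],
    ).content[0].text

    for _ in range(max_rounds):
        if evaluate(draft) >= threshold:
            break
        # Revise against the rubric and loop until the score passes or rounds run out.
        draft = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": f"Improve this draft against this rubric: {RUBRIC}\n\nDraft:\n{draft}",
            }],
        ).content[0].text
    return draft
```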

When to use: Quality-critical outputs where “good enough” is definable and automated evaluation is feasible.

Tool choice: a spectrum between workflow and agent

The tool_choice parameter from the Claude API provides a concrete mechanism that spans the workflow-to-agent spectrum:

Tool choice modes mapped to the workflow-agent spectrum
| `tool_choice` | Behavior | Where it sits |
| --- | --- | --- |
| `{"type": "tool", "name": "..."}` | Claude MUST call the named tool | Pure workflow — deterministic step |
| `{"type": "any"}` | Claude MUST call SOME tool | Guided agent — tool use required, but the choice of tool is free |
| `{"type": "auto"}` | Claude decides whether to call a tool | Full agent — the model determines its own path |

Forcing tool_choice: "tool" guarantees a specific code path, useful for structured extraction where the tool’s input_schema defines your output shape. "auto" with a system prompt that says “only use tools when you need information you weren’t trained on” creates a lightweight agent that decides when external data is necessary. "any" is useful for SMS bots and other interfaces where every response must route through a tool.
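
As a sketch of the forced-tool case, a hypothetical record_invoice tool turns extraction into a deterministic workflow step whose output shape is the tool's input_schema:

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"  # placeholder; substitute any current Claude model ID

# Hypothetical extraction tool: it is never executed; its input_schema IS the output shape.
record_invoice = {
    "name": "record_invoice",
    "description": "Record structured fields extracted from an invoice.",
    "input_schema": {
        "type": "object",
        "properties": {
            "vendor": {"type": "string"},
            "total": {"type": "number"},
            "due_date": {"type": "string"},
        },
        "required": ["vendor", "total", "due_date"],
    },
}

response = client.messages.create(
    model=MODEL,
    max_tokens=512,
    tools=[record_invoice],
    tool_choice={"type": "tool", "name": "record_invoice"},  # pure workflow: the path is fixed
    messages=[{"role": "user", "content": "Invoice from Acme Corp, total $1,240.50, due March 3."}],
)

# The response is guaranteed to contain a tool_use block whose input matches the schema,
# so there is no free-text parsing step.
tool_use = next(block for block in response.content if block.type == "tool_use")
print(tool_use.input)  # e.g. {"vendor": "Acme Corp", "total": 1240.5, "due_date": "March 3"}
```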

The complete tool use workflow

The four-step tool use pattern from the API course instantiates a guided workflow:

  1. Client provides Claude with tools + user prompt
  2. Claude responds with stop_reason: "tool_use"
  3. Client executes the tool, returns tool_result with matching tool_use_id
  4. Claude uses the result to formulate the final answer

This loops until stop_reason is "end_turn". A while loop wrapping the API call handles any number of tool invocations — the model decides when it has enough information to answer.
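
A sketch of that loop, assuming the tools are implemented locally and keyed by name in a tool_fns dict:

```python
import json

import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"  # placeholder; substitute any current Claude model ID

def run_tool_loop(tools: list[dict], tool_fns: dict, user_prompt: str) -> str:
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        response = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        # Keep Claude's turn in the transcript, including any tool_use blocks.
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            # "end_turn": the model decided it has enough information to answer.
            return next(block.text for block in response.content if block.type == "text")

        # Execute each requested tool and return results with the matching tool_use_id.
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = tool_fns[block.name](**block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(output),
                })
        messages.append({"role": "user", "content": results})
```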

When agents are the right answer

Agents become necessary when the model needs to decide for itself what to do next. The course’s guiding principle: provide abstract, combinable tools rather than hyper-specialized ones.


Good agent tools: bash, read_file, write_file, web_search, run_query — general capabilities that chain together.

Poor agent tools: refactor_code, fix_bug, optimize_performance — hyper-specialized tools that constrain flexibility and prevent creative problem-solving.
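
As a sketch of what "abstract" means in practice, the tool definitions can stay this small (the schemas are illustrative):

```python
# General, combinable capabilities: each does one primitive thing.
# Note that read_file also serves as the observation counterpart to write_file
# (see "Environment inspection" below).
AGENT_TOOLS = [
    {
        "name": "bash",
        "description": "Run a shell command and return its stdout and stderr.",
        "input_schema": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
    {
        "name": "read_file",
        "description": "Return the contents of the file at the given path.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Write content to the file at the given path.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "content": {"type": "string"},
            },
            "required": ["path", "content"],
        },
    },
]
# A refactor_code tool would bake one fixed strategy into the tool itself;
# with bash + read_file + write_file, the model composes its own approach.
```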

Environment inspection

The course emphasizes an overlooked design principle: Claude operates blindly. Always provide tools that let it observe results of its actions:

  • Read a file before editing it
  • Take a screenshot after a UI interaction
  • Check an API response before proceeding
  • List a directory before assuming file paths

Without observation capability, agents make decisions on stale or incorrect assumptions. Every action tool should have a corresponding observation tool.

Parallel agent orchestration: the code-review pipeline

A concrete production pattern that spans the workflow-agent spectrum: the multi-agent code review pipeline. Instead of one reviewer checking everything, three specialized agents run in parallel — each with a different architectural lens — then a fourth agent aggregates their findings.


The pipeline has eight stages:

  1. Detect changes — run `git diff` against the base branch to identify what changed
  2. Route to specialists — Classify changes (security-sensitive paths, performance-critical code, UI components) and dispatch to the right agents
  3. Security review (parallel) — One agent checks for OWASP top 10, secret exposure, input validation, and auth bypass risks
  4. Performance review (parallel) — A second agent checks for N+1 queries, unbounded allocations, missing caching, and algorithmic complexity regressions
  5. Pattern review (parallel) — A third agent checks for project conventions, naming consistency, error handling patterns, and architectural alignment
  6. Aggregate — A fourth agent reads all three reports, deduplicates findings, resolves conflicts (two reviewers disagreeing on severity), and produces a unified report
  7. Confidence-score — Every finding gets a 0/25/50/75/100 score (the same scale used for subagent reviews), forcing the aggregator to calibrate severity rather than listing everything at equal weight
  8. Gate — If any finding scores 75+, the pipeline blocks merge with a specific explanation of what must change

The pipeline is a hybrid: stages 1-2 and 6-8 are workflows (the sequence is fixed and predictable), while stages 3-5 are agent-like (each reviewer decides for itself which files to deep-read, which patterns to flag, and how to substantiate its findings). This is the pragmatic reality: production systems rarely fit cleanly into “workflow” or “agent” — they combine both, applying each where it fits.
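
A compressed skeleton of that hybrid, with illustrative reviewer prompts and each reviewer reduced to a single call for brevity; the gate mirrors the 75+ rule:

```python
import asyncio
import subprocess

import anthropic

client = anthropic.AsyncAnthropic()
MODEL = "claude-sonnet-4-5"  # placeholder; substitute any current Claude model ID

# Hypothetical reviewer prompts, one architectural lens each. In the real pipeline
# each reviewer runs its own tool-use loop to deep-read files; a single call stands in here.
REVIEWERS = {
    "security": "Review this diff for OWASP top 10 issues, secret exposure, and auth bypass.",
    "performance": "Review this diff for N+1 queries, unbounded allocations, and missing caching.",
    "patterns": "Review this diff for naming, error handling, and architectural alignment.",
}

async def review(lens: str, prompt: str, diff: str) -> str:
    response = await client.messages.create(
        model=MODEL, max_tokens=2048, system=prompt,
        messages=[{"role": "user", "content": diff}],
    )
    return f"## {lens}\n{response.content[0].text}"

async def review_pipeline(base_branch: str = "main") -> None:
    # Stages 1-2 (workflow): detect changes, then fan out to the specialists.
    diff = subprocess.run(
        ["git", "diff", base_branch], capture_output=True, text=True, check=True
    ).stdout

    # Stages 3-5 (agent-like): the three reviewers run in parallel.
    reports = await asyncio.gather(
        *(review(lens, prompt, diff) for lens, prompt in REVIEWERS.items())
    )

    # Stages 6-7 (workflow): aggregate, deduplicate, and confidence-score the findings.
    summary = await client.messages.create(
        model=MODEL, max_tokens=2048,
        system=(
            "Merge these reports, deduplicate findings, resolve severity conflicts, and give "
            "each finding a confidence score of 0, 25, 50, 75, or 100. End with MAX_SCORE: <n>."
        ),
        messages=[{"role": "user", "content": "\n\n".join(reports)}],
    )
    report = summary.content[0].text

    # Stage 8 (workflow): gate the merge on any high-confidence finding.
    tail = report.rsplit("MAX_SCORE:", 1)[-1].strip().split()
    max_score = int(tail[0]) if tail and tail[0].isdigit() else 0
    if max_score >= 75:
        raise SystemExit(f"Merge blocked:\n{report}")
    print(report)
```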

Workflow vs. agent selection

| Factor | Prefer workflow | Prefer agent |
| --- | --- | --- |
| Predictability | Steps are known | Path depends on discoveries |
| Reliability needs | Must not deviate | Flexibility is more valuable |
| Debugging | Need clear failure points | Can tolerate exploration |
| Cost | Fixed number of calls | Variable number of calls |
| Latency | Parallelizable or fixed | Unbounded loop |

Takeaways

Known steps favor workflows

Use workflows when the path is predictable and agents only when runtime discoveries genuinely determine the next action.

Four workflow patterns cover most cases

Parallelization, chaining, routing, and evaluator-optimizer loops provide reusable structure without requiring open-ended agency.

Constraint-heavy prompts chain well

A generation call followed by a revision call lets the model focus on one mental task at a time.

Agent tools should be abstract

General tools like bash, read, write, and search compose better than narrow tools like `fix_bug` or `optimize_performance`.

Observation tools are required

Agents need ways to inspect files, screenshots, API responses, and directories so decisions are based on current state.