Prompt Chaining in Claude Agents

In short: Prompt chaining is a decomposition strategy that splits a complex task into a fixed, predetermined sequence of steps, feeding each step's output into the next. Because the path is decided in advance, it is consistent and easy to debug, but it cannot adapt when a step uncovers something unexpected.

What prompt chaining actually is

Prompt chaining is the simplest task-decomposition strategy on the Claude Certified Architect exam, and it is the baseline every other strategy is measured against. The idea is to take one complex task, break it into a fixed sequence of smaller subtasks, and run each subtask as its own Claude call. The output of step one becomes part of the input to step two, step two feeds step three, and so on down a path you laid out in advance.

The defining word is predetermined. You, the architect, decide the steps before any model call runs. Claude does not choose what comes next; it just executes the current step well. That single property is the source of everything good and everything limiting about the pattern. Anthropic's own guidance describes this workflow as breaking a task into "fixed subtasks" along "predefined code paths," in contrast to agents that dynamically direct their own process.

Prompt chaining (fixed sequential pipeline): A decomposition strategy that divides a task into a predetermined ordered sequence of steps. Each step is an independent model call whose output is passed to the next step, with optional programmatic checks between them. The path is fixed by the developer in advance and does not change at runtime.

Why decomposing into steps beats one giant prompt

It is tempting to hand Claude the whole task in a single prompt and let it work everything out at once. For predictable work, chaining usually wins, and the reasons map directly to what the exam tests.

First, accuracy. Each step asks the model to do one well-scoped thing, so there is less to get wrong per call. A summary step that only summarises is more reliable than a single prompt asked to summarise, translate, and format simultaneously.

Second, traceability. Because every step is a discrete call with its own input and output, you can inspect exactly where a result went wrong. If the final answer is bad, you can walk back through the chain and find the step that produced the bad intermediate, instead of guessing inside one opaque generation.

Third, control between steps. The gaps between calls are yours. You can add a programmatic checkpoint (a schema check, a length gate, a regex) that verifies an intermediate result before the chain continues. Anthropic explicitly highlights these "programmatic checks" as a benefit of the chaining workflow.

Fixed: the step order is decided in advance.

1 step = 1 call: each subtask is its own Claude call.

Checkpoints: validate intermediates between steps.

How a fixed pipeline runs

Mechanically, a chain is a loop controlled by your code, not by the model. Your application calls the Messages API for step one, reads the response, optionally validates it, then constructs the prompt for step two using that output, and repeats. The model never sees the whole pipeline; it only ever sees the prompt for the step it is currently running.
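The loop described above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: `run_chain`, the `ModelCall` type, and the `{input}` placeholder convention are all assumptions made for this example. The model call is passed in as a plain function so the sketch stays independent of any SDK; in a real application that function would wrap a call to the Messages API.

```python
from typing import Callable, Sequence

# Hypothetical type: any function that sends one prompt to a model and
# returns its text reply. A real one would wrap a Messages API call.
ModelCall = Callable[[str], str]


def run_chain(
    task_input: str,
    step_templates: Sequence[str],
    call_model: ModelCall,
) -> list[str]:
    """Run a fixed sequence of steps, feeding each output into the next prompt.

    Each template should contain an "{input}" placeholder. Returns every
    intermediate output in order; the last element is the final result.
    """
    outputs: list[str] = []
    current = task_input
    for template in step_templates:
        # The model only ever sees the prompt for the current step.
        prompt = template.replace("{input}", current)
        current = call_model(prompt)  # one step = one model call
        outputs.append(current)
    return outputs
```

Because the step list is ordinary data fixed before the loop starts, the number of calls is known up front: it is simply the length of `step_templates`.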

This is why fixed pipelines are so consistent. The same input flows through the same steps in the same order every time, so the same kind of output comes out. For a structured, repeatable task (processing a batch of documents, running a standard review, transforming records), that determinism in the path is exactly what you want. You can predict the cost, the number of calls, and the shape of the result before you run it.

That predictability is not merely convenient; in regulated or audited settings it is a requirement, because you must be able to show that the same input always travels the same path. A fixed pipeline gives you a reproducible trace (the input, each intermediate, each checkpoint result, and the final output) that you can store, replay, and explain after the fact. A single sprawling prompt offers no such record; its reasoning happens invisibly inside one generation. That auditability is one more reason chaining is the default for structured work that has to stand up to scrutiny.
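One way to capture such a trace is to record each step as it runs. The sketch below is only an illustration under assumed names: `StepRecord`, `ChainTrace`, and their fields are hypothetical, chosen to mirror the items listed above (input, intermediates, checkpoint results).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class StepRecord:
    step: str           # name of the step
    prompt: str         # exact prompt sent for this step
    output: str         # intermediate the step produced
    check_passed: bool  # result of the checkpoint after this step


@dataclass
class ChainTrace:
    task_input: str
    records: list[StepRecord] = field(default_factory=list)

    def add(self, step: str, prompt: str, output: str, check_passed: bool) -> None:
        self.records.append(StepRecord(step, prompt, output, check_passed))

    def first_failure(self) -> Optional[str]:
        """Return the name of the earliest step whose checkpoint failed, or None."""
        for record in self.records:
            if not record.check_passed:
                return record.step
        return None

    def to_json(self) -> str:
        """Serialise the whole trace so it can be stored and inspected later."""
        return json.dumps(asdict(self), indent=2)
```

Storing the output of `to_json()` alongside each run gives a record that can be walked step by step when a result needs explaining.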

Figure: a three-stage fixed sequential pipeline. The path is laid out in advance; checkpoints between steps validate each intermediate before the chain continues.

Designing the checkpoints between steps

The space between two steps is where a fixed pipeline earns its reliability, and it is the part beginners skip. Because each step is a discrete call, your code controls what happens before the next step starts. A checkpoint is any deterministic test you run on a step's output: does it parse as the expected shape, is it within a length bound, does it contain the fields the next step depends on? When the check passes, the chain continues; when it fails, you decide the recovery: retry the step with a clarifying instruction, fall back to a default, or halt and surface the problem rather than feeding a malformed intermediate downstream.
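A checkpoint with one recovery policy might look like the following. It is a sketch under stated assumptions: the step is assumed to return a JSON object, and `check_record`, `run_step_with_gate`, `CheckpointFailed`, and the retry wording are hypothetical names and choices for illustration. It shows two of the recoveries named above, retrying with a clarifying instruction and then halting.

```python
import json
from typing import Callable, Optional

ModelCall = Callable[[str], str]  # hypothetical: prompt in, reply text out


class CheckpointFailed(Exception):
    """Raised when a step's output still fails its check after all retries."""


def check_record(raw: str, required_fields: set[str], max_chars: int) -> Optional[str]:
    """Deterministic gate: return None if the output passes, else the reason."""
    if len(raw) > max_chars:
        return f"output exceeds {max_chars} characters"
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "output is not valid JSON"
    if not isinstance(data, dict):
        return "output is not a JSON object"
    missing = required_fields - set(data)
    if missing:
        return f"missing fields: {sorted(missing)}"
    return None


def run_step_with_gate(
    prompt: str,
    call_model: ModelCall,
    required_fields: set[str],
    max_chars: int = 2000,
    max_retries: int = 1,
) -> dict:
    """Run one step, retrying with a clarifying instruction before halting."""
    attempt_prompt = prompt
    problem: Optional[str] = None
    for _ in range(max_retries + 1):
        raw = call_model(attempt_prompt)
        problem = check_record(raw, required_fields, max_chars)
        if problem is None:
            return json.loads(raw)  # verified input for the next step
        # Recovery: retry, telling the model exactly what was wrong.
        attempt_prompt = (
            f"{prompt}\n\nYour previous reply was rejected: {problem}. "
            "Return only a JSON object."
        )
    # Recovery of last resort: halt instead of passing bad data downstream.
    raise CheckpointFailed(problem)
```

Falling back to a default would be a third branch in the same place; which recovery fits depends on whether a degraded intermediate is acceptable to the steps that follow.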

This is the practical difference between a chain and a single prompt. A single prompt gives you one chance to validate one final output; a chain gives you a validation point at every seam. On a document-processing flow you can confirm the extraction step produced a well-formed record before you spend a second call transforming it, so a failure is caught early and cheaply instead of corrupting everything after it. Anthropic's guidance calls these programmatic checks, and they are what let a chain stay accurate over many steps: each step starts from a verified input rather than an assumed-good one.

Checkpoints also keep the pipeline honest about cost. Because you can short-circuit the chain the moment an intermediate fails a gate, you avoid paying for the remaining steps on an input that was never going to succeed. That early-exit behaviour is only possible because the steps are separated; a single monolithic prompt has no seam at which to stop. The discipline to design the gates, not just the steps, is what separates a robust fixed pipeline from a brittle string of calls, and it is why architects describe chaining as trading a little extra orchestration code for a lot of reliability.

Treat each step's prompt as a contract

A checkpoint can only validate an intermediate if you decided in advance what that intermediate should look like, which is why the most reliable chains define each step as a contract before any code runs. The contract for a step states three things: the exact output format the next step expects, the edge cases the step must handle, and any constraints it must respect. Anthropic's guidance on chaining stresses this same discipline of keeping each step narrow and explicit, so the model returns something the next prompt and your orchestration code can consume without guesswork.

Writing the contract first changes how you build the pipeline. Instead of wiring steps together and hoping their outputs line up, you specify the shape of every seam, then make each step responsible for honouring its half of the bargain. The checkpoint becomes a literal test of that contract: does this output parse into the promised format, does it carry the declared fields, does it stay inside the stated bounds. When a step's contract is vague, its checkpoint has nothing precise to verify and malformed data slips through; when the contract is exact, the gate is trivial to write and the chain stays accurate across many steps.
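To make the idea concrete, a contract can be written down as data and used twice: once to tell the model what to return, and once to test what came back. The sketch below assumes JSON output; `StepContract` and its fields are hypothetical names for this illustration, covering the three parts described above (format, edge cases, constraints).

```python
import json
from dataclasses import dataclass


@dataclass
class StepContract:
    """Hypothetical description of what one step promises to return."""
    name: str
    required_fields: dict[str, type]  # output format: field name -> type
    edge_cases: tuple[str, ...] = ()  # edge cases the step must handle
    max_items: int = 50               # constraint: cap on list-valued fields

    def instructions(self) -> str:
        """Render the contract as text to append to the step's prompt."""
        fields = ", ".join(
            f"{name} ({kind.__name__})" for name, kind in self.required_fields.items()
        )
        lines = [f"Return only a JSON object with fields: {fields}."]
        lines += [f"Edge case: {case}" for case in self.edge_cases]
        lines.append(f"No list may contain more than {self.max_items} items.")
        return "\n".join(lines)

    def violations(self, raw_output: str) -> list[str]:
        """Checkpoint derived from the contract; an empty list means it passed."""
        try:
            data = json.loads(raw_output)
        except json.JSONDecodeError:
            return ["output is not valid JSON"]
        if not isinstance(data, dict):
            return ["output is not a JSON object"]
        problems = []
        for name, kind in self.required_fields.items():
            if name not in data:
                problems.append(f"missing field '{name}'")
            elif not isinstance(data[name], kind):
                problems.append(f"field '{name}' is not {kind.__name__}")
            elif isinstance(data[name], list) and len(data[name]) > self.max_items:
                problems.append(f"field '{name}' has more than {self.max_items} items")
        return problems
```

Keeping the prompt text and the gate in one object means they cannot drift apart: changing a field in the contract changes both what the step is asked for and what the checkpoint verifies.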

Where it shines and where it breaks

A fixed pipeline is the right tool when the work is predictable and structured. Its advantages (consistency, reliability, and inspectability) are real and they compound on repetitive tasks. Code reviews with a known set of checks, document-processing flows with defined stages, and data transformations with ordered steps all fit the mould.

The limitation is the mirror image of the strength. Because the steps are predetermined, a fixed pipeline cannot adapt to unexpected findings. If step one of an investigation discovers that the real problem lives somewhere the pipeline never planned to look, there is no branch to follow it. The chain simply continues down its preset path and produces a confident but incomplete result. That is the single most common trap the exam sets here: reaching for a fixed pipeline on an open-ended task where adaptability is the whole point. When that adaptability matters, you want dynamic adaptive decomposition instead.

Worked example

A team runs a nightly job that reviews every merged pull request for the same set of policy violations and writes a structured report.

The task is a perfect fit for a fixed pipeline because the checks never change from night to night. The architect lays out three predetermined steps.

Step one takes the raw diff and extracts a normalised list of changed files and hunks. A programmatic checkpoint confirms the output parses as the expected structure before anything else runs.

Step two takes that normalised structure and asks Claude to flag the specific, known policy violations: missing tests, hard-coded secrets, disallowed dependencies. Because the step is narrow, the model is not also trying to summarise or prioritise; it just classifies against the fixed rule set.

Step three takes the flagged items and formats them into the report template the team expects, with severities and file references filled in.

Every pull request flows through the same three steps in the same order, so two similar diffs produce two similar reports. When a report once came out malformed, the team did not have to re-read an opaque generation. They looked at the step-two output, saw the classification was fine, and found the bug in the step-three formatter. That is traceability paying for itself.

The checkpoints proved their worth too. On one run a corrupted diff produced a step-one extraction that failed the structure check, so the pipeline halted at the very first seam and flagged the input instead of feeding garbage into the classification step. Because the steps are independent, the team fixed the upstream diff and re-ran only the affected job. There was no tangled single-prompt retry to unpick. Notice, finally, what the pipeline could not do: if a diff contained a brand-new class of problem the rules never described, the chain would sail past it, because adapting to the unexpected is not what a fixed pipeline is for. That boundary is not a defect; it is the deliberate price of a predictable, auditable design, and recognising it is what lets you choose the pattern deliberately rather than by habit.
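The three steps of this example could be wired together roughly as follows. This is an illustrative sketch, not the team's actual code: the prompts, the JSON field names (`files`, `violations`), and the names `gate`, `review_pull_request`, and `PipelineHalted` are all assumptions, and the model call is injected as a plain function rather than tied to a specific SDK.

```python
import json
from typing import Callable

ModelCall = Callable[[str], str]  # hypothetical: prompt in, reply text out

# The fixed rule set from the example; it does not change between runs.
RULES = ["missing tests", "hard-coded secrets", "disallowed dependencies"]


class PipelineHalted(Exception):
    """Raised when an intermediate fails its checkpoint."""


def gate(raw: str, required: set[str], step: str) -> dict:
    """Checkpoint: the output must be a JSON object with the required fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise PipelineHalted(f"{step}: output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise PipelineHalted(f"{step}: output is not a JSON object")
    missing = required - set(data)
    if missing:
        raise PipelineHalted(f"{step}: missing fields {sorted(missing)}")
    return data


def review_pull_request(raw_diff: str, call_model: ModelCall) -> str:
    # Step one: normalise the diff, then gate on its structure.
    extracted = gate(
        call_model(
            "STEP 1. Extract the changed files and hunks as a JSON object "
            f"with a 'files' list.\n\n{raw_diff}"
        ),
        {"files"},
        "extract",
    )
    # Step two: classify against the fixed rule set and nothing else.
    flagged = gate(
        call_model(
            f"STEP 2. Flag only these violations: {RULES}. Return a JSON object "
            f"with a 'violations' list.\n\n{json.dumps(extracted)}"
        ),
        {"violations"},
        "classify",
    )
    # Step three: format the flagged items into the report template.
    return call_model(
        f"STEP 3. Format these findings as the team's report.\n\n{json.dumps(flagged)}"
    )
```

If the first gate raises, the classification and formatting calls are never made, which is the early halt described above. Nothing in the function can add a fourth step at runtime, so a problem outside `RULES` passes through unflagged.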

Common misconceptions

Misconception: A fixed pipeline lets Claude decide what step to run next based on what it finds.

What's actually true: No. In prompt chaining the sequence of steps is predetermined by the developer. The model executes the current step but never chooses the path. Runtime, model-driven step selection is the defining feature of dynamic adaptive decomposition, which is a different strategy.

Misconception: Prompt chaining is a good default for open-ended investigation because you can just add more steps.

What's actually true: Fixed pipelines suit predictable, structured tasks. For open-ended investigation where scope is unknown until you start looking, a predetermined chain cannot branch toward what it discovers. Adding more fixed steps does not create adaptability. It just makes a longer rigid path. Use dynamic decomposition for that work.

How this shows up on the exam

Domain 1 (Agentic Architecture and Orchestration) is the most heavily weighted domain at 27%, and task statement 1.6 asks you to design decomposition strategies for complex workflows. Questions about prompt chaining rarely ask for a definition. Instead they describe a task and ask which decomposition strategy fits, and the discriminator is almost always predictability. If the scenario is a repeatable, well-structured process with stable stages, prompt chaining is the answer precisely because of its consistency. If the scenario stresses unknown scope, surprising findings, or the need to adapt, a fixed pipeline is the wrong choice and a distractor. Knowing the strength (consistency, traceability, checkpointed control) and the limitation (no adaptability) in one breath is what lets you choose correctly under exam pressure, and it prepares you for the broader skill of choosing a decomposition strategy deliberately.

Check your understanding

An architect must automate a recurring compliance check: every incoming contract is parsed, then screened against the same fixed list of clauses, then written into a standard summary. Output must be consistent run to run and easy to audit. Which decomposition strategy best fits, and why?
