Dynamic Subagent Selection Explained

In short: Dynamic subagent selection is a coordinator pattern where the lead agent analyses each incoming query and invokes only the subagents that query actually needs, rather than running every request through a fixed pipeline. Anthropic scales this from a single subagent for simple fact-finding to ten or more for complex research, matching effort to query complexity.

What dynamic subagent selection actually means

Dynamic subagent selection is the decision a coordinator makes before it does any real work: it reads the incoming query, reasons about what that particular request needs, and then invokes only the subagents required to satisfy it. The word that matters is dynamic. The set of subagents is not fixed in advance by the developer; it is chosen at runtime, per query, by the coordinating model itself. A request for a single fact might trigger one subagent or none, while a sprawling research question might fan out to many.

Contrast this with the design it replaces. In a fixed pipeline, the developer wires the same chain of subagents to run on every request (search, then summarise, then verify, then cite) regardless of whether a given query needs all four stages. That feels safe because no step is ever skipped, but it is exactly the pattern this knowledge point teaches you to avoid. The coordinator that selects dynamically does less work on easy queries and more on hard ones, and that asymmetry is where the efficiency comes from.

Dynamic subagent selection: A coordinator pattern in which the lead agent analyses each query and invokes only the subagents that query requires, rather than passing every request through a fixed sequence of agents.

Why a fixed full pipeline is the wrong default

The exam trap for this knowledge point is blunt: always invoking every subagent regardless of query requirements. It is worth being precise about why that is a trap, because on the exam you will be asked to diagnose a slow or expensive system and identify the design flaw underneath it.

Two costs stack up. The first is latency. Multi-agent pipelines are often sequential, where one agent's output feeds the next, so the wall-clock time is the sum of every stage. Force a five-stage pipeline onto a query that needed one stage and you have multiplied the response time for no benefit. The second is token cost. Every subagent carries its own context and system prompt, and the coordinator pays to pass work in and read results back out. A multi-agent run can consume many times the tokens of a single call, so spending that budget on subagents a query never needed is pure waste.
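The two costs can be made concrete with a little arithmetic. The sketch below assumes a sequential four-stage pipeline; the stage names echo the earlier example, and every latency and token figure is invented purely for illustration.

```python
# Hypothetical per-stage costs; every number here is made up for illustration.
STAGES = {
    "search":    {"seconds": 4.0, "tokens": 6000},
    "summarise": {"seconds": 3.0, "tokens": 4000},
    "verify":    {"seconds": 5.0, "tokens": 5000},
    "cite":      {"seconds": 2.0, "tokens": 3000},
}

def sequential_cost(stage_names):
    """Sum latency and tokens for stages that run one after another."""
    seconds = sum(STAGES[name]["seconds"] for name in stage_names)
    tokens = sum(STAGES[name]["tokens"] for name in stage_names)
    return seconds, tokens

fixed = sequential_cost(STAGES)          # every stage, on every request
selected = sequential_cost(["search"])   # only what a simple fact needs

print(fixed)     # (14.0, 18000)
print(selected)  # (4.0, 6000)
```

With these assumed figures, the fixed pipeline costs three and a half times the latency and three times the tokens of the single stage the query actually needed.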

Anthropic hit this directly when building its own research system. Early versions, as their engineering team documented, "made errors like spawning 50 subagents for simple queries, scouring the web endlessly for nonexistent sources." The fix was not a clever model change; it was teaching the coordinator to size its effort to the request. That lesson is the heart of this KP.

1 agent: simple fact-finding
2-4 subagents: direct comparisons
10+ subagents: complex research only

How dynamic subagent selection works inside a coordinator

The mechanism has a clean shape. When a query arrives, the coordinator does not immediately delegate. It first analyses the request to extract its requirements: what kind of information is needed, how many distinct threads of work it implies, and which of the available specialists can supply each piece. Only then does it choose a subset of subagents to spawn, hand each one a focused brief, collect their results, and synthesise a final answer.
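The shape of that flow can be sketched in a few lines. Everything named here is hypothetical: the subagents are plain functions, and the keyword rules in `analyse` are only a runnable stand-in for the reasoning a coordinating model would do.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical subagents: each is just a function from a brief to a result.
SUBAGENTS: dict[str, Callable[[str], str]] = {
    "web-search": lambda brief: f"[web] {brief}",
    "data-lookup": lambda brief: f"[data] {brief}",
    "citation-check": lambda brief: f"[cite] {brief}",
}

@dataclass
class Requirement:
    subagent: str  # which specialist can supply this piece
    brief: str     # the focused brief handed to it

def analyse(query: str) -> list[Requirement]:
    """Stand-in for the coordinator model's analysis step.

    A real coordinator would reason about the query with an LLM; these
    keyword rules exist only so the sketch runs.
    """
    q = query.lower()
    needs: list[Requirement] = []
    if "our " in q or "internal" in q:
        needs.append(Requirement("data-lookup", f"internal figures for: {query}"))
    if "published" in q or "released" in q:
        needs.append(Requirement("web-search", f"public sources for: {query}"))
    if "sourced" in q:
        needs.append(Requirement("citation-check", f"verify sources for: {query}"))
    return needs

def coordinate(query: str) -> str:
    requirements = analyse(query)                        # analyse first
    results = [SUBAGENTS[r.subagent](r.brief)            # spawn only the chosen subset
               for r in requirements]
    return " | ".join(results) or "answered directly"    # synthesise

print(coordinate("What is 2 + 2?"))
print(coordinate("How does our latency compare to the published benchmark?"))
```

The point of the sketch is the ordering: nothing is delegated until the analysis has produced a list of requirements, and a query with no requirements never touches a subagent.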

Anthropic's own description of the orchestrator-workers workflow captures this precisely: "a central LLM dynamically breaks down tasks, delegates them to worker LLMs, and synthesizes their results." The detail that distinguishes it from a simple parallel fan-out is flexibility: the subtasks "aren't pre-defined, but determined by the orchestrator based on the specific input." That single sentence is the difference between a fixed pipeline and dynamic selection, and it is the idea the exam is testing.

The subagent description is the routing signal

How does the coordinator actually pick? It matches the query against what each subagent advertises about itself. In Claude Code, the documentation is explicit that "Claude uses each subagent's description to decide when to delegate tasks," and it instructs authors to "write a clear description so Claude knows when to use it." The description field is therefore not documentation for humans; it is the selection input the coordinator reads at runtime. A subagent whose description is vague or overlaps with another's will be chosen unreliably, which is a failure that traces straight back to the coordinator's configuration rather than to the subagent's own logic. This is why dynamic selection depends on the discipline you learn in coordinator responsibilities: the coordinator owns both the analysis and the routing.
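One way to catch overlapping descriptions before they cause unreliable routing is a crude lint over the description text. The sketch below is an assumption-laden illustration, not how Claude selects subagents: the descriptions, the stopword list, and the 0.5 threshold are all hypothetical, and plain keyword overlap (Jaccard similarity) stands in for any real semantic comparison.

```python
import re
from itertools import combinations

# Hypothetical descriptions; the second and third deliberately overlap.
DESCRIPTIONS = {
    "web-search": "Searches the public web for published sources and news.",
    "code-search": "Searches the repository for functions, classes and usages.",
    "repo-helper": "Searches the repository for functions and classes.",
}

STOPWORDS = {"the", "for", "and", "a", "of", "to"}

def keywords(text: str) -> set[str]:
    """Lowercased content words of a description."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def overlap(a: str, b: str) -> float:
    """Jaccard similarity between two descriptions' keyword sets."""
    ka, kb = keywords(a), keywords(b)
    return len(ka & kb) / len(ka | kb)

def ambiguous_pairs(descriptions: dict[str, str], threshold: float = 0.5):
    """Pairs of subagents whose descriptions share too many keywords."""
    return [
        (x, y)
        for x, y in combinations(descriptions, 2)
        if overlap(descriptions[x], descriptions[y]) >= threshold
    ]

print(ambiguous_pairs(DESCRIPTIONS))  # [('code-search', 'repo-helper')]
```

A flagged pair is a prompt to rewrite one description so that each subagent advertises a clearly distinct job.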

Because each subagent runs in its own context window, selection is also a context-management decision. Spawning a subagent the query did not need does not just cost tokens; it adds a separate context that must be briefed, run, and folded back in. Choosing the right small set keeps the main conversation clean, which is the whole reason hub-and-spoke architecture puts a single coordinator at the centre.

Scaling effort to query complexity

The most testable concrete guidance comes from Anthropic's research system, which ties subagent count to query complexity with explicit numbers: "Simple fact-finding requires just 1 agent with 3-10 tool calls, direct comparisons might need 2-4 subagents with 10-15 calls each, and complex research might use more than 10 subagents with clearly divided responsibilities." Those bands exist to stop the lead agent from over-investing in easy work, which the team called "a common failure mode in our early versions."

The principle generalises beyond those exact figures. Under-spawning starves a genuinely complex query of the parallel exploration it needs, so a hard research question handled by one subagent returns thin, shallow results. Over-spawning, the more common mistake, throws subagents at a question that never needed them and pays the latency and token bill anyway. Dynamic subagent selection is the practice of landing between those failure modes by reading the query before committing resources. It is also why over-decomposing a task into too many narrow subagents is its own hazard, covered in narrow decomposition failure.
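The quoted bands and the two failure modes can be expressed as a small lookup. This is a minimal sketch: the tier names and the `check_plan` helper are hypothetical, and the numbers are taken from the Anthropic quote above ("more than 10" is encoded as a minimum of 11, with no tool-call figure because the quote gives none for that tier).

```python
# tier: (min_subagents, max_subagents, tool calls per agent)
# None means the quoted guidance states no figure.
BANDS = {
    "simple_fact": (1, 1, (3, 10)),
    "comparison": (2, 4, (10, 15)),
    "complex_research": (11, None, None),  # "more than 10 subagents"
}

def check_plan(tier: str, subagents: int) -> str:
    """Classify a planned subagent count against the band for its tier."""
    low, high, _calls = BANDS[tier]
    if subagents < low:
        return "under-spawned"
    if high is not None and subagents > high:
        return "over-spawned"
    return "ok"

print(check_plan("simple_fact", 50))       # over-spawned
print(check_plan("complex_research", 1))   # under-spawned
print(check_plan("comparison", 3))         # ok
```

In a real system the tier would come from the coordinator's own analysis of the query; a check like this would only be a guard rail around that judgement.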

How a coordinator selects subagents per query

The coordinator branches on what each query needs, so a simple fact skips the subagents entirely while a richer query fans out to only the relevant specialists.

A worked example: a research coordinator triaging three queries

A research assistant is built as a coordinator with four specialised subagents: web-search, code-search, data-lookup, and citation-check. Three different user queries arrive in the same session.

The first query is a plain fact: "What year was the Messages API released?" The coordinator analyses it and concludes it needs at most one web search and no code, no internal data, and no citation pass. So it spawns the web-search subagent alone, takes the single result, and answers. One subagent, a few tool calls, a fast and cheap response. A fixed pipeline would have run all four stages and made the user wait through a code search and a citation check that had nothing to contribute.

The second query is a comparison: "How does our retrieval service's latency compare to the figures in the vendor's published benchmark?" Now the requirements are richer. The coordinator sees two distinct threads, internal numbers and an external published source, and selects two subagents: data-lookup for the service metrics and web-search for the vendor benchmark. It runs them in parallel, then synthesises the comparison itself. Two subagents, chosen because the query genuinely contained two requirements, not because a pipeline mandated them.

The third query is open-ended: "Produce a sourced overview of how multi-agent systems handle error recovery, with examples from at least three frameworks." This is the case where fanning out pays off. The coordinator decomposes it into several investigation threads and spawns multiple web-search subagents on different framework families, plus the citation-check subagent at the end to verify sources. Here a larger subagent count is justified by the complexity of the request, exactly as the scaling bands predict.

The lesson lives in the contrast. The same coordinator, with the same four subagents available, did wildly different amounts of work on the three queries (one subagent, then two, then many) because it read each query first. That is dynamic subagent selection in practice: the architecture is fixed, but the routing is decided fresh every time. When the synthesis step still needs to recover from a subagent that came back empty, that recovery is handled by multi-agent error handling and routing.
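The second query's parallel step can be sketched with `asyncio.gather`. The `run_subagent` coroutine is a hypothetical placeholder in which a short sleep stands in for real subagent work; only the two subagents the comparison needs are started.

```python
import asyncio

async def run_subagent(name: str, brief: str, seconds: float) -> str:
    """Hypothetical subagent call; the sleep stands in for real work."""
    await asyncio.sleep(seconds)
    return f"{name}: done ({brief})"

async def answer_comparison() -> list[str]:
    # Only the two subagents the comparison query needs are spawned,
    # and they run concurrently rather than one after the other.
    return await asyncio.gather(
        run_subagent("data-lookup", "retrieval service latency", 0.02),
        run_subagent("web-search", "vendor published benchmark", 0.01),
    )

results = asyncio.run(answer_comparison())
for line in results:
    print(line)
```

`asyncio.gather` returns results in the order the coroutines were passed, so the coordinator can synthesise from a predictable list even though the faster subagent finishes first.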

Common misconceptions

Misconception: A coordinator should pass every query through all of its subagents so that no useful step is ever skipped.

What's actually true: That is a fixed pipeline, not dynamic selection, and it is the anti-pattern this KP targets. Running every subagent on every request multiplies latency and token cost while adding nothing when the query needed only one or two. The coordinator should analyse the query and invoke only the subagents it requires.

Misconception: Dynamic selection means spawning as many subagents as possible so the answer is more thorough.

What's actually true: More subagents is not better. Anthropic found early systems spawning fifty subagents for simple queries and wasting resources. Subagent count should scale to query complexity: one for simple fact-finding, two to four for comparisons, and ten or more only for genuinely complex research.

Misconception: The coordinator chooses subagents in a fixed order the developer hard-codes, so the order is what determines selection.

What's actually true: Selection is determined by query analysis matched against each subagent description, not by a hard-wired order. The description field is the signal the coordinator reads to decide which subagent fits the task, so the choice is driven by the input. A hard-coded order is just a pipeline by another name.

How this is tested on the exam

This knowledge point sits in Domain 1, the most heavily weighted domain at 27% of the exam, under task statement 1.2 on orchestrating multi-agent systems with coordinator-subagent patterns. It is an apply-level KP, so questions will not ask you to recite a definition. They will hand you a scenario, most often the Multi-Agent Research System or the Developer Productivity scenario, and describe a system that is slow, expensive, or thin, then ask you to choose the design change that fixes it.

The reliable tell is a coordinator running a full pipeline on every request. When the symptom is wasted latency and cost on simple queries, the answer is to make the coordinator analyse the query and select only the subagents that query needs. A closely related exam principle is that coverage and routing failures trace back to the coordinator, not the individual subagents. If the wrong specialist keeps getting picked, the fix is in the coordinator's analysis and the subagent descriptions it routes on, which connects this KP to iterative refinement in multi-agent systems. Hold onto the single idea that selection is decided per query from the input, and the distractors that propose tuning unrelated knobs fall away.

Check your understanding

A research coordinator is built with four specialised subagents: web-search, code-search, data-lookup, and citation-check. A user asks a simple factual question that only needs one web search, but the implementation routes every query through all four subagents in sequence. Users complain the assistant is slow and expensive. What is the most appropriate fix?

Key takeaways

Dynamic subagent selection is the discipline of choosing, per query, which subagents to invoke instead of running a fixed pipeline every time. The coordinator analyses the request, matches its needs against each subagent description, and spawns only the relevant specialists, scaling from one subagent for a simple fact to ten or more for genuinely complex research. Get this right and your system is fast and cheap on easy work and thorough on hard work; get it wrong and you pay the full pipeline's latency and token bill on every request.

