In short
Subagent delegation spawns a separate Claude instance with its own context window to run a noisy investigation, so the verbose file reads and tool output stay inside the subagent and only a condensed summary returns to the main agent. The main agent keeps a clean window for high-level coordination.
What subagent delegation does for context
Subagent delegation is the context-management move you reach for when the threat is not merely a long session but a noisy one. Some investigations are inherently verbose: reading dozens of files to understand a module, grepping a large repository, trawling logs. If that work runs in the main agent, every byte of it lands in the main window and accelerates the degradation you learned to spot in the prerequisite knowledge point. Delegation removes the noise at the source by running the investigation somewhere else entirely.
The mechanism is isolation. A subagent is, in Anthropic's words, a worker that "runs in its own context window with a custom system prompt, specific tool access, and independent permissions." When the main agent delegates, the subagent does the verbose work and "returns only its final output to the parent. All the intermediate noise, the file reads, the search results, the exploratory tool calls, stays inside the subagent's context and never touches the main conversation." The main agent receives a clean summary and keeps its window free for the coordination that only it can do.
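The isolation boundary can be pictured with a toy model. The sketch below is not Anthropic's implementation or any SDK API: `ContextWindow` and `run_subagent` are hypothetical names, and character counts stand in for tokens. It only illustrates that the verbose reads land in a private window and a single summary string crosses back.

```python
from dataclasses import dataclass, field


@dataclass
class ContextWindow:
    """Toy stand-in for an agent's context: a list of text entries."""
    entries: list[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.entries.append(text)

    def size(self) -> int:
        # Crude size proxy: total characters held in the window.
        return sum(len(entry) for entry in self.entries)


def run_subagent(brief: str, files: dict[str, str]) -> str:
    """Do the noisy work in a private window and return only a summary."""
    private = ContextWindow()  # isolated: never shared with the caller
    private.add(brief)
    for path, content in files.items():
        private.add(f"{path}\n{content}")  # verbose reads land here only
    # Hypothetical condensation step; a real subagent writes its own summary.
    return f"Read {len(files)} files for: {brief}"


main = ContextWindow()
main.add("Goal: add a feature touching auth")
files = {f"src/auth/file_{i}.py": "x" * 2000 for i in range(20)}
summary = run_subagent("map how the auth module validates tokens", files)
main.add(summary)  # only the final output crosses the boundary
```

After the call, `main` holds two short entries, while the roughly forty thousand characters of file content existed only inside `private`, which is discarded when the function returns.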
Context isolation: running a task in a separate context window so its intermediate output cannot pollute the caller's window. The subagent absorbs the noise; the main agent receives only the distilled result, preserving its own context for high-level reasoning.
Why isolation beats simply being careful
You could try to manage a noisy investigation in the main agent by trimming output and writing notes, and sometimes that is enough. But isolation is structurally stronger because it makes the noise impossible to leak rather than merely discouraged. Anthropic's documentation describes the canonical case directly: "when understanding how something works is a prerequisite to changing it, a subagent can explore the codebase and return a summary rather than dumping dozens of files into the conversation. The main conversation stays clean, and synthesized findings arrive instead of raw content."
There is a second benefit the exam cares about: parallelism. Because each subagent has its own window, independent investigations can run at the same time without interfering. Anthropic's guidance on context engineering notes that "specialized sub-agents can handle focused tasks with clean context windows" while returning condensed summaries of one to two thousand tokens, achieving a "clear separation of concerns" where the detailed search context stays isolated. Three subagents can map the auth, database, and API layers concurrently, and the main agent synthesises three tidy summaries instead of drowning in three verbose explorations.
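The parallel fan-out described above can be sketched with Python's standard `asyncio` library. This is an illustration under assumptions: `investigate` is a hypothetical placeholder for a real subagent call, and the short sleep stands in for the verbose exploration.

```python
import asyncio


async def investigate(layer: str) -> str:
    """Hypothetical subagent call; the sleep stands in for verbose exploration."""
    await asyncio.sleep(0.01)
    return f"{layer}: summary of findings"


async def main_agent() -> list[str]:
    layers = ["auth", "database", "api"]
    # gather runs the three coroutines concurrently and preserves input order.
    return await asyncio.gather(*(investigate(layer) for layer in layers))


summaries = asyncio.run(main_agent())
```

Because the investigations share no state, the main agent only has to wait for all three and then synthesise the returned strings.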
How this composes with hub-and-spoke
Subagent delegation is the context-management face of the hub-and-spoke architecture you meet in Domain 1, which is why that knowledge point is a soft prerequisite here. The hub is the main agent, holding the goal and the clean coordinating context; the spokes are subagents, each given a focused brief and an isolated window. Domain 1 teaches the orchestration shape; Task 5.4 teaches why that shape is also a context-management technique. The same structure that lets you parallelise work also quarantines noise, and on the exam the context angle is the one Domain 5 questions test.
Delegation is not free, though, and the apply-level skill includes knowing its cost. Spawning a subagent has overhead, and because the subagent starts fresh it does not see the main conversation; the main agent must compose a clear delegation brief describing the task. That is the right trade when the investigation is large and noisy. It is the wrong trade for a quick lookup whose output is small, where the overhead of delegation exceeds the context it would have saved. Matching the tool to the size and noise of the task is exactly the judgement this knowledge point assesses.
Worked example
A main agent must add a feature that touches authentication, the data layer, and the public API, and first needs to understand all three without flooding its window.
Running all three investigations inline would be ruinous. Reading the auth module alone is twenty files; the data layer is a schema plus a dozen repositories; the API surface is forty route handlers. Done in the main agent, that is tens of thousands of tokens of file content sitting in the window for the rest of the session, crowding out the design work the main agent actually needs to do.
Instead the main agent delegates three investigations in parallel. The auth subagent reads its twenty files and returns: "Auth uses JWTs validated in TokenGuard; refresh tokens rotate; one gap, TokenGuard does not check audience claim." The data subagent returns a four-line summary of the repository pattern and the one table without a foreign-key constraint. The API subagent returns the routing convention and the two endpoints lacking auth middleware. The main agent now holds three crisp summaries, perhaps fifteen hundred tokens total, instead of tens of thousands of tokens of raw files, and it designs the feature against a clean window.
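A rough back-of-envelope check makes the saving concrete. The file counts come from the example; the average of 500 tokens per file and the choice to treat each route handler as one file are assumptions made purely for illustration.

```python
# File counts come from the worked example; TOKENS_PER_FILE is an assumed average.
TOKENS_PER_FILE = 500

files_read = {
    "auth": 20,
    "data": 1 + 12,  # one schema plus a dozen repositories
    "api": 40,       # assumes one file per route handler
}

inline_tokens = sum(files_read.values()) * TOKENS_PER_FILE
delegated_tokens = 1_500  # "perhaps fifteen hundred tokens total"

saved = inline_tokens - delegated_tokens
```

Under these assumptions the inline approach costs 36,500 tokens against 1,500 for the three summaries, which is consistent with the "tens of thousands" figure in the example.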
The exam trap is the inverse: running all three investigations in the main agent's context, exhausting it, and then trying to design the feature from a degraded window. The correct move is to push the verbose discovery into subagents and keep coordination central. Note too that delegation pairs naturally with scratchpad files: a subagent can persist its detailed findings to disk while still returning only a summary, so the depth is recoverable without ever entering the main window.
When delegation is the wrong call
Because delegation is powerful, the apply-level skill includes knowing when not to reach for it. A subagent is not free. Spawning one carries setup overhead, and because it starts with a fresh window it sees none of the main conversation, so the main agent must spend effort describing the task before any work happens. For a small, low-noise lookup (checking one constant, reading a single short file), the output that would have landed in the main window is tiny, and the overhead of delegation exceeds the context it saves. Here the right move is simply to do the work inline.
There is also a structural caution. Subagents do not nest arbitrarily well: a subagent that itself spawns subagents adds coordination cost and makes the final synthesis harder to trace. Anthropic positions subagents as the tool for side tasks that would otherwise flood the main window with output you will not reference again, which is a precise test you can apply. If the task's output is large and disposable, delegate it. If it is small, or if its detail needs to stay visible to the main reasoning, keep it inline. Delegating everything is as much a mis-selection as delegating nothing: the cost of the tool has to be justified by the noise it removes.
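The "large and disposable" test can be written down as a small decision rule. This is a sketch of the judgement, not a documented algorithm: `should_delegate` is a hypothetical helper, and the default `overhead_tokens` value is an assumed cost for writing the brief and receiving the summary.

```python
def should_delegate(
    output_tokens: int,
    referenced_later: bool,
    overhead_tokens: int = 1_000,
) -> bool:
    """Delegate only when the task's output is large and disposable.

    overhead_tokens is an assumed cost for the brief plus the returned summary.
    """
    if referenced_later:
        return False  # detail must stay visible to the main reasoning
    return output_tokens > overhead_tokens


noisy_exploration = should_delegate(30_000, referenced_later=False)
quick_lookup = should_delegate(200, referenced_later=False)
needed_detail = should_delegate(30_000, referenced_later=True)
```

Only the first case qualifies: the quick lookup costs less than the delegation overhead, and the third task's detail has to stay in the main window.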
Writing an effective delegation brief
Because the subagent cannot see the main conversation, the quality of delegation rests on the brief the main agent writes. A weak brief produces a subagent that explores the wrong thing or returns the wrong shape of answer, wasting the very context the pattern was meant to save. A strong brief carries four things: the specific task ("map how the auth module validates tokens"), the scope boundary ("only src/auth, do not follow into shared utilities"), what to return ("the validation entry points and any gaps"), and the format ("a short summary plus a list, not raw file dumps").
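The four elements of a strong brief map naturally onto a small data structure. The sketch below uses a hypothetical `DelegationBrief` class and an assumed plain-text rendering; the field values are the examples quoted in the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegationBrief:
    """Hypothetical container for the four parts of a delegation brief."""
    task: str
    scope: str
    returns: str
    output_format: str

    def render(self) -> str:
        # Assumed layout: one labelled line per element.
        return "\n".join([
            f"Task: {self.task}",
            f"Scope: {self.scope}",
            f"Return: {self.returns}",
            f"Format: {self.output_format}",
        ])


brief = DelegationBrief(
    task="map how the auth module validates tokens",
    scope="only src/auth, do not follow into shared utilities",
    returns="the validation entry points and any gaps",
    output_format="a short summary plus a list, not raw file dumps",
)
prompt = brief.render()
```

Making every field required means a brief cannot be constructed with the return shape or format left unspecified, which are the two elements most often omitted.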
That last element is what keeps the returned summary genuinely condensed. The whole benefit collapses if a subagent returns thousands of tokens of raw findings, so the brief should ask for a distilled result, the one-to-two-thousand-token summary Anthropic describes, and let the depth live in the subagent or in a scratchpad it writes. A good brief, in other words, specifies both the investigation and the compression. The main agent is not just offloading work; it is defining the interface across which only signal, not noise, is allowed to return.
A useful test of a brief is to ask whether the main agent could act on the expected return without ever seeing the files the subagent read. If the answer is yes, the interface is well defined and the isolation is real: the main window receives a decision-ready summary and nothing else. If the answer is no, if the main agent would need to re-read the source to make sense of the reply, then the brief under-specified what to return, and the noise it was meant to quarantine will leak back across the boundary anyway. Designing that interface deliberately is the part of delegation that separates a context win from a context wash.
Scoping a subagent's tools for least privilege
Isolation is the headline benefit of delegation, but a subagent's tool access is a second design lever the exam expects an architect to use deliberately. When you define a subagent, you can restrict it to a specific set of tools rather than handing it the main agent's full kit. Anthropic's subagent documentation frames this as both a focusing move and a safety one: a smaller toolset reduces the chance the subagent wanders into unintended actions, and it lets you apply least privilege: an investigation subagent that only needs to read and search has no business being able to write files or run destructive commands.
The mechanics fit the rest of the pattern. Subagents are defined up front and selected by their description, and they are invoked through the Agent tool, so the main agent's permissions decide whether delegation runs smoothly or stops for approval. Tool scoping is part of that same definition: you specify what the subagent is for and, separately, the narrow set of tools it is allowed to use. A clear description paired with a minimal toolset yields delegation that is predictable in both what it does and what it cannot do.
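A least-privilege check on a subagent definition might look like the following. This is plain Python, not the Claude Agent SDK's API: `SubagentSpec` and `excess_tools` are hypothetical, the tool names mirror Claude Code's built-in tools, and the read-only allowlist is an assumed classification for an investigation role.

```python
from dataclasses import dataclass

# Assumed allowlist for an investigation-only subagent.
READ_ONLY_TOOLS = frozenset({"Read", "Grep", "Glob"})


@dataclass(frozen=True)
class SubagentSpec:
    """Hypothetical definition: what the subagent is for and what it may use."""
    name: str
    description: str
    tools: frozenset[str]


def excess_tools(spec: SubagentSpec) -> set[str]:
    """Return any tools beyond what a read-only investigator needs."""
    return set(spec.tools - READ_ONLY_TOOLS)


investigator = SubagentSpec(
    name="auth-explorer",
    description="Maps how the auth module validates tokens",
    tools=frozenset({"Read", "Grep", "Glob"}),
)
overpowered = SubagentSpec(
    name="auth-explorer",
    description="Maps how the auth module validates tokens",
    tools=frozenset({"Read", "Write", "Bash"}),
)

investigator_excess = excess_tools(investigator)
overpowered_excess = excess_tools(overpowered)
```

An empty result means the definition already matches the investigation role; anything returned is a tool to justify or remove before the subagent is used.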
There is a context payoff too, which is why this belongs in a Domain 5 discussion and not only a security one. A read-only investigator cannot take noisy side actions, so its output stays closer to the clean, summarisable findings the main agent actually wants back. Scoping tools narrowly therefore reinforces the isolation benefit: the subagent absorbs the exploration, returns a tidy summary, and never had the means to generate the sprawling write-and-run noise that would have been hard to condense in the first place.
How this is tested on the exam
Task 5.4 questions present a scenario with a large or noisy investigation and ask how to keep the coordinating agent effective. When the defining feature is verbose output that would pollute the main window, subagent delegation is the answer, and the reason to give is context isolation: the noise stays in the subagent, only a summary returns. Strong answers also note that subagents enable parallel investigation and that the main agent keeps high-level coordination on a clean context.
The distractors typically tempt you to keep everything in one agent ("just trim the output," "summarise as you go") or to over-apply delegation to trivial lookups where the overhead is not worth it. Both miss the apply-level point: delegation earns its cost specifically when the investigation is large and noisy. Pair this with the knowledge that subagents do not see the main conversation, so a clear brief is required, and these questions become reliable points.
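Misconception
A subagent shares the main agent's context window, so delegating still adds the investigation's output to the main conversation.
What's actually true
A subagent runs in its own context window and returns only its final output to the parent. The file reads, search results, and exploratory tool calls stay inside the subagent's context.
Misconception
Subagents are mainly a speed feature, so delegation does nothing for context management.
What's actually true
Context isolation is the primary benefit: the noise stays in the subagent and only a condensed summary returns. Parallel investigation is a second benefit on top of that.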
An agent must implement a feature spanning the auth module, the data layer, and the public API, and first needs to understand each. Reading all three inline would add tens of thousands of tokens of file content to the main window. What is the best context-management approach?
References & primary sources
- Claude Code Docs: Create custom subagents (primary source)
- Claude Agent SDK: Subagents (definition and tool scoping) (primary source)
- Anthropic Engineering: Effective context engineering for AI agents (docs)
- Anthropic Engineering: How we built our multi-agent research system (docs)
- Anthropic Academy: Using subagents effectively (academy)