When to Use tool_choice: auto, any, forced

In short: Choosing a tool_choice mode is a per-scenario judgment: use auto when a plain-text answer is acceptable, any when a tool call is mandatory but the right tool depends on the input, and a forced named tool when one specific step must run every time.

When to use tool choice as a design decision

Deciding when to use tool choice settings is an evaluation-level skill, not a recall one. By the time you face this knowledge point you already know what the three settings do; the question now is which one a given situation calls for, and why the alternatives quietly fail. That shift from "what is auto?" to "is auto the right call here?" is exactly the cognitive jump the Claude Certified Architect exam tests at this level, and it is where careful architects separate themselves from people who memorised a parameter list.

The reason this is a judgment call is that every mode is correct somewhere and wrong somewhere else. There is no globally safe default you can apply blindly. auto looks safe because it lets the model decide, but that very freedom is a liability when your downstream system cannot tolerate a plain-text answer. The discipline this page builds is reading a scenario for its hard constraints first, then letting those constraints select the mode. If you have not yet locked in what each value does, anchor that first with tool_choice configuration options; this page assumes that vocabulary and works one rung above it.

Selecting a tool_choice mode: A per-request design decision that maps a scenario's constraints onto one of three behaviours: let the model decide (auto), require some tool call while leaving the choice open (any), or require one specific named tool (forced).

The three scenarios that decide the mode

Anthropic's tool-use guidance and the exam blueprint both reduce this to three recurring situations, and each one points cleanly at a single mode. Internalise the situations, not the values, because the exam describes situations.

The first situation is general operation, where a natural-language reply is a perfectly good outcome some of the time. A support assistant that sometimes answers from knowledge and sometimes looks something up belongs here. The model should be free to call a tool or to talk, so auto is correct; forcing a tool would make the assistant call something even when a direct answer was better.

The second situation is guaranteed output where the specific tool depends on the input. You must end the turn with a structured tool call, but which tool applies is genuinely unknown until the model reads the content. A pipeline that must always extract data from an incoming document, where the document might be an invoice, a contract, or a receipt, lives here. You need a mandatory call but cannot pin the tool, so any is correct: it removes the option of a text-only answer while still letting the model route to the right extractor.

The third situation is a mandatory first step, where exactly one action must run before anything else, every single time. Extracting metadata at the start of a conversation, or logging a structured intake record before reasoning, are typical. Here the tool is known and non-negotiable, so a forced named tool is correct. The judgment that distinguishes this from the second situation is whether the tool is fixed (force it) or input-dependent (use any).

auto

general operation, text answers acceptable

any

mandatory call, tool depends on input

forced

one fixed tool, mandatory first step

A decision framework you can apply under exam pressure

Because the three situations map to three modes, you can compress the whole decision into two questions asked in order. First: must a tool be called on this turn at all? If a text answer is acceptable, stop, the answer is auto. Only if a call is mandatory do you ask the second question: is the exact tool known in advance? If yes, force it by name; if it depends on what the model reads, use any. Running the questions in that order prevents the most common error, which is jumping straight to a mode without first checking whether a text reply was actually allowed.

Two questions that select the tool_choice mode

Loading diagram...

Ask about necessity before specificity; the order is what keeps you from defaulting to auto by reflex.

A subtle but important refinement is that tool_choice is set per request, so a single conversation can switch modes turn by turn. A common production shape forces a specific intake tool on the opening turn to guarantee the mandatory first step, then drops back to auto for the rest of the dialogue so the assistant can converse freely. Recognising that the setting is a per-turn lever, not a session-wide switch, is often the exact insight a harder scenario question is probing.

Why auto is the wrong default when output must be guaranteed

The single most punished misjudgment for this knowledge point is selecting auto in a scenario that demands a guaranteed structured result. With auto, the model is permitted to answer in plain text whenever it judges a tool unnecessary, and that judgment is probabilistic. Most of the time it may call your extractor; occasionally it will return prose, and your parser downstream will choke on input it was never designed to read. The failure is intermittent, which makes it worse, because it survives a quick test and then breaks in production on an input that looked, to the model, answerable without a tool.

The fix is to make the call mandatory rather than hoping for it. Setting tool_choice to any removes the text-only escape hatch entirely, and pairing it with strict: true on the tool constrains the model's output to your JSON schema through grammar-constrained sampling, so the inputs are both present and correctly typed. That combination, a mandatory tool call plus strict schema enforcement, is how Anthropic's documentation describes getting reliable structured output, and it is the architecturally correct response to "the result must always be parseable." Choosing auto here is not a small stylistic preference; it is a correctness bug dressed up as a sensible default.

The cost of forcing, and when it is too high

Evaluation means weighing trade-offs, and forcing a tool is not free. When tool_choice is any or a forced specific tool, the API prefills the assistant message so a tool call is guaranteed, and a direct consequence is that the model will not emit any natural-language text before the tool_use block, even if you explicitly ask it to. If your user experience depends on Claude saying "Let me look that up for you" before acting, forcing silently removes that narration. The documented workaround is to keep auto and add a plain instruction in the user message steering the model toward the tool, which preserves the narration while still nudging behaviour.

There is also a hard compatibility limit that often decides the question for you: with extended thinking enabled, only auto and none are supported, and requesting any or a forced tool returns an error because the forced prefill conflicts with the thinking block. So in any scenario that pairs deliberate reasoning with tool use, forcing is simply off the table, and the correct evaluation is auto plus instruction. Two smaller costs round out the picture: the forcing modes add a few tool-use system-prompt tokens on every request, and changing tool_choice mid-conversation invalidates cached message blocks, though your tool definitions and system prompt stay cached. None of these costs change the right answer when guaranteed output is genuinely required, but they are exactly the kind of detail a well-built distractor will hinge on.

Worked example

A document-intake service must turn every uploaded file into a structured record, and the same agent later answers follow-up questions about the file in plain language.

Start by separating the two phases, because they have different constraints and therefore different modes. Phase one is the intake: a file arrives, its type is unknown (it could be an invoice, a contract, or a receipt), and the system must end the turn with a structured extraction, never a text summary. Apply the framework. Must a tool be called? Yes, a plain-text answer is unacceptable to the downstream store. Is the exact tool known up front? No, it depends on what the document turns out to be. Two questions, one answer: tool_choice set to any, ideally with strict: true on each extractor so the captured fields conform to your schema. The model is now forced to call one of the extractors and to pick the one that matches the content.

Now contrast a tempting wrong turn. A developer worried about reliability might force a single named extractor, reasoning that forcing is "more guaranteed" than any. But forcing one extractor pins the wrong tool the moment a contract arrives at an invoice extractor, producing confidently structured nonsense. The lesson is that any and forcing are not ranked by strength; they answer different questions, and here specificity is the enemy because the right tool is input-dependent.

Phase two is the follow-up conversation. The user asks, "Was this contract signed before June?" A natural-language answer is exactly what is wanted, and a tool call may not even be needed. Here the constraint inverts: a text reply is acceptable, so the second question never gets asked and the answer is auto. Because tool_choice is per request, the same agent runs any on intake and auto on follow-ups, switching the lever as the constraints change. If instead the intake had a single mandatory logging step that must always run first, such as recording a provenance stamp, you would force that one named tool on the opening turn and then return to auto, mirroring the workflow enforcement pattern of guaranteeing a step deterministically rather than hoping a description triggers it.

Misconceptions that cost exam points

Misconception

auto is the safe default, so when in doubt choose auto.

What's actually true

auto is only safe when a plain-text answer is an acceptable outcome. In any scenario that must end in a structured tool call, auto lets the model reply in text instead, which breaks downstream parsing. When output must be guaranteed, the correct choice is any (optionally with strict mode), not auto.

Misconception

To guarantee structured output reliably, force one specific tool, because forcing is stronger than any.

What's actually true

any and forcing answer different questions, not the same one at different strengths. Force a named tool only when exactly one fixed tool must run. When the correct tool depends on the input, forcing pins the wrong tool and produces structured but incorrect data; any guarantees a call while letting the model select the right tool.

Misconception

I can force a tool and still get Claude to explain itself in natural language first.

What's actually true

No. With any or a forced tool, the API prefills the assistant turn, so no text is emitted before the tool_use block, even if you ask. To keep narration, use auto plus an explicit instruction in the user message; this is also the only option when extended thinking is enabled, since forcing is unsupported there.

How this is tested on the Claude Certified Architect exam

This knowledge point sits in Domain 2, Tool Design and MCP Integration, which carries 18% of the exam, under task statement 2.3, distributing tools across agents and configuring tool choice. Because the Bloom level is evaluate, questions will not ask you to recite what any means; they will hand you a scenario with embedded constraints and ask which configuration is appropriate, with distractors built from the misconceptions above. Expect the strongest wrong answer to be auto, chosen because it sounds flexible, in a situation that actually requires a guaranteed call.

These scenarios surface most often in the multi-agent research system and developer-productivity contexts (exam scenarios 3 and 4), where structured tool output feeds another agent or a downstream pipeline that cannot tolerate prose. The reliable way to earn the point is to run the two-question framework explicitly: decide whether a tool call is mandatory, then whether the tool is fixed or input-dependent. Pair that with the trade-off awareness, that forcing removes narration and bars extended thinking, and you can defend your choice the way the exam expects, by naming the constraint that rules every other option out. When a question instead centres on whether a tool was selected correctly at all rather than which mode to set, route your reasoning back through tool descriptions as the selection mechanism and structured error response analysis.

Check your understanding

An onboarding agent must always call extract_user_profile as its very first action on every new conversation, before any other reasoning, and then continue as a normal conversational assistant for the rest of the session. Which tool_choice strategy best fits?

Watch and learn

Official Anthropic Academy lessons first, then hand-picked walkthroughs. Videos load only when you press play.

No videos curated for this concept yet

We are still curating the best official and community videos for this topic.

References & primary sources

Adaptive study

Master this concept with Archie

Practice it inside an adaptive study session. Archie, your Socratic AI tutor, tracks your mastery with Bayesian Knowledge Tracing and schedules the perfect next review.

Start studying

When to Use tool choice: auto, any, or Forced Selection by Scenario

When to use tool choice as a design decision

The three scenarios that decide the mode

A decision framework you can apply under exam pressure

Why auto is the wrong default when output must be guaranteed

The cost of forcing, and when it is too high

Misconceptions that cost exam points

How this is tested on the Claude Certified Architect exam

People also ask

Watch and learn

References & primary sources

Master this concept with Archie