Claude Code Tool Selection

In short: Built-in tool selection is the judgement of matching a task to the right Claude Code tool or sequence. Content searches map to Grep, file discovery by name maps to Glob, a targeted change maps to Edit, and a full rewrite maps to Read plus Write when Edit cannot find a unique anchor.

What Claude Code tool selection really tests

Claude Code tool selection is the evaluative top of Task Statement 2.5: not recalling what each tool does, but judging which one a described situation calls for and defending that judgement against plausible alternatives. By the time the exam reaches this level it assumes you already know the individual tools; what it measures is whether you can read a messy, realistic requirement and route it correctly under time pressure, the way a working architect does dozens of times an hour.

The reason this deserves its own knowledge point is that real tasks rarely announce their tool. They arrive as goals ("clean up the deprecated helpers", "figure out why logging is inconsistent", "bump the timeout everywhere"), and the skill is translating a goal into the underlying questions, each of which has a tool. That translation, done quickly and correctly, is the whole competency, and it is why this sits at the evaluate level of Bloom's taxonomy rather than remember or understand.

Tool selection scenario analysis: the evaluative skill of mapping a described task to the correct Claude Code built-in tool or sequence (content search to Grep, path matching to Glob, targeted change to Edit, full rewrite to Read plus Write).

The selection rubric

Underneath the judgement is a compact rubric you can apply to any single requirement. Name what the requirement is really asking for, and the tool is determined:

Search file contents: finding an error string, a function definition, a call site, a config key. This is Grep.
Discover files by name: finding tests, configs, or files by extension or naming convention. This is Glob.
Make a targeted change with a unique anchor: flipping a value, renaming in one place, fixing a line. This is Edit.
Replace a whole file, or change text that cannot be made unique: a restructure or a new file. This is Read plus Write.

The rubric is deliberately exhaustive across the search and edit requirements this knowledge point covers, so a requirement that does not obviously fit one row usually means you have not yet decomposed it far enough. Splitting it into smaller questions resolves the ambiguity, because each atomic question lands cleanly in exactly one row.
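The rubric above can be sketched as a plain lookup table. This is an illustration only: the `Need` labels and the `route` helper are hypothetical names invented for this sketch, not part of Claude Code.

```python
from enum import Enum


class Need(Enum):
    """Hypothetical labels for the four rows of the rubric."""
    SEARCH_CONTENTS = "search file contents"
    DISCOVER_FILES = "discover files by name"
    TARGETED_CHANGE = "targeted change with a unique anchor"
    WHOLE_FILE_CHANGE = "whole-file replacement or text that cannot be anchored"


# One row per atomic requirement, mirroring the rubric in the text.
RUBRIC = {
    Need.SEARCH_CONTENTS: "Grep",
    Need.DISCOVER_FILES: "Glob",
    Need.TARGETED_CHANGE: "Edit",
    Need.WHOLE_FILE_CHANGE: "Read + Write",
}


def route(need: Need) -> str:
    """Return the tool the rubric assigns to a single atomic requirement."""
    return RUBRIC[need]


print(route(Need.SEARCH_CONTENTS))
print(route(Need.WHOLE_FILE_CHANGE))
```

The lookup takes exactly one `Need` at a time, which matches the point of the rubric: a goal that seems to need two keys has not been decomposed yet.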

Decomposing a compound scenario

Most exam scenarios are compound, and the evaluation skill is to break the goal into atomic requirements before routing. A task like "remove a deprecated function and fix its tests" is really three requirements stacked together: find the callers (a content search, so Grep), find their tests (a path question shaped by the callers, so Glob), and apply the change (targeted edits, so Edit, with Read first). Each piece routes through the rubric independently, and only then do you assemble them into a sequence.

This decompose-then-route habit is what separates a confident answer from a guess. When you try to pick one tool for the whole compound task you either choose a tool that drops part of the work or freeze because no single tool fits. Splitting first means every option you evaluate is checked against a precise sub-question, and the distractors that try to make one tool do everything fall away.
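As a rough sketch of decompose-then-route, the "remove a deprecated function and fix its tests" goal can be written out as an ordered list of sub-questions, each carrying its tool. The `Step` type and the wording of each question are assumptions made for illustration; Read is listed as its own step because the text says it comes before Edit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    question: str  # the atomic sub-question
    tool: str      # the tool the rubric assigns to it


def plan_remove_deprecated_function() -> list[Step]:
    """Decompose the compound goal first, then route each piece."""
    return [
        Step("Which files call the deprecated function?", "Grep"),
        Step("Which test files belong to those callers?", "Glob"),
        Step("What does the code around each call site look like?", "Read"),
        Step("Apply each targeted change at a unique anchor", "Edit"),
    ]


for number, step in enumerate(plan_remove_deprecated_function(), start=1):
    print(f"{number}. {step.tool}: {step.question}")
```

No single entry tries to do the whole job, so a distractor that offers one tool for the entire goal can be checked against each question in turn.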

Content: search inside files = Grep
Paths: find files by name = Glob
Anchor: targeted change = Edit

Evaluating competing tool plans

At the evaluate level the question often hands you several complete tool plans and asks which is best. The issue is not whether a tool is valid in isolation but which whole approach wins. The way through is to score each plan against two tests: does it cover every sub-requirement, and does it use the lightest correct tool for each? A plan that technically reaches the answer but over-reads (globbing everything and reading every file to compensate for a missing Grep) loses to a plan that searches precisely first. A plan that drops a sub-requirement loses outright, however elegant its remaining steps.

This is where the earlier knowledge points pay off. Knowing that Edit beats Write for small changes, that Grep searches contents while Glob searches paths, and that exploration should stay incremental gives you the criteria to rank plans rather than just validate them. Evaluation is comparative, and the comparisons are exactly the trade-offs the prior KPs taught.

Routing a requirement to a built-in tool

Decompose a compound goal into atomic requirements, then route each one through this rubric.

Applying the analysis

The exam rewards a deliberate evaluation over a snap guess, especially when several options are individually defensible. The worked example below ranks three full plans for one task.

Worked example

A scenario question asks: the team wants to find why a feature flag named ENABLE_BETA is read inconsistently across the app and then standardise every read. Three candidate plans are offered. Which is strongest?

Start by decomposing the goal into atomic requirements. There are two: first, find everywhere ENABLE_BETA is read, which is a content search; second, standardise those reads, which is a set of targeted changes. With the requirements named, you can score each plan honestly rather than reacting to whichever sounds busiest.

Plan one globs for files named after the flag, such as a pattern matching beta or config, then reads them. Score it against the first requirement and it fails: the flag is read inside arbitrary files whose names have nothing to do with it, so a path search misses most of the call sites. A plan that does not cover the core requirement loses immediately, no matter how tidy its steps look.

Plan two reads the entire source tree to be sure nothing is missed, then edits the reads it finds by eye. It covers the requirement but badly fails the lightest-correct-tool test: bulk-reading a whole app floods the context window, which degrades the model's output, and is the exact over-read the incremental pattern warns against. It reaches the answer, but expensively and unreliably.

Plan three Greps for ENABLE_BETA to find every read precisely, Reads only those files to see the surrounding code, then applies an Edit to each site with a unique anchor. It covers both requirements and uses the lightest correct tool at every step: Grep for the content search, Read only where Grep pointed, Edit for the targeted standardisation. It wins because evaluation is comparative, and against the rubric plan three covers the work with the least waste while the others either miss a requirement or pay too much for it.
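The ranking of the three plans can be sketched as a sort on the two tests, coverage first and cost second. The `Plan` type, the `over_reads` flag, and the requirement labels are assumptions made for this sketch; plan one is recorded as covering nothing because it misses most reads and so cannot standardise them.

```python
from dataclasses import dataclass

REQUIREMENTS = frozenset({"find every read", "standardise the reads"})


@dataclass(frozen=True)
class Plan:
    name: str
    covers: frozenset  # sub-requirements the plan actually satisfies
    over_reads: bool   # True if it reads far more than the task needs


def rank(plans):
    """Order plans best-first: fewest missing requirements, then lighter."""
    def key(plan):
        missing = len(REQUIREMENTS - plan.covers)
        return (missing, plan.over_reads)
    return sorted(plans, key=key)


plans = [
    Plan("plan one: Glob by file name", frozenset(), False),
    Plan("plan two: read the whole tree", REQUIREMENTS, True),
    Plan("plan three: Grep, Read, Edit", REQUIREMENTS, False),
]

for plan in rank(plans):
    print(plan.name)
```

Sorting on a tuple keeps the priority explicit: a plan with a missing requirement can never outrank a complete one, however light it is.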

Common misreadings to avoid

Both traps below come from routing a requirement to a tool whose job does not match it.

Misconception

To find where a feature flag is read across the app, Glob for files related to the flag, since Glob finds the relevant files.

What's actually true

Reads of a flag are text inside arbitrary files, which is a content search, so the tool is Grep. Glob only matches file paths by name and would miss every call site that lives in a file not named after the flag.
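The difference between matching paths and searching contents can be reproduced with Python's standard library standing in for the two tools. This is only an analogy, not how Claude Code implements Glob or Grep, and the three file names are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical files: only checkout.py actually reads the flag.
    (root / "checkout.py").write_text('if settings.get("ENABLE_BETA"):\n    pass\n')
    (root / "beta_notes.py").write_text("# rollout notes, no flag read here\n")
    (root / "utils.py").write_text("def helper():\n    return 1\n")

    # Path matching (the Glob-style question): looks only at file names.
    by_name = sorted(p.name for p in root.glob("*beta*"))

    # Content search (the Grep-style question): looks inside each file.
    by_content = sorted(
        p.name for p in root.glob("*.py") if "ENABLE_BETA" in p.read_text()
    )

print(by_name)     # ['beta_notes.py']
print(by_content)  # ['checkout.py']
```

The name match returns a file that never reads the flag and misses the one that does, which is the coverage failure described above.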

Misconception

When a scenario has several valid tools, any tool that eventually produces the answer is an acceptable choice.

What's actually true

At the evaluate level the question is which approach is best, not merely valid. A plan that over-reads or drops a sub-requirement loses to one that covers every requirement with the lightest correct tool, even if both eventually reach an answer.

A quick-reference mapping you can run in seconds

Under exam time the rubric has to be fast, so it helps to rehearse it as a set of reflexes rather than a lookup. A requirement about what is written inside files (a message, a symbol, or a call) fires Grep before you finish reading the sentence. A requirement about which files exist (by extension, directory, or naming convention) fires Glob. A small, isolatable change to a file you can anchor uniquely fires Edit. A whole-file replacement, a new file, or text too repeated to anchor fires Read plus Write. The point of rehearsing the mapping is that on the exam you will not have time to derive it; you want it to feel automatic so your attention goes to decomposing the scenario instead.

The mapping also tells you when a requirement is still too coarse to route. If a goal seems to fit two rows at once, that is the signal it contains more than one atomic requirement and needs splitting. Decomposition and routing are the same motion practised together: break the goal until each piece lands in exactly one row, then read the tool off the row.
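The line between a change you can anchor uniquely and text too repeated to anchor can be sketched as an occurrence count. The `choose_change_tool` helper and the sample strings are hypothetical; the sketch models the rubric's decision, not the internals of the Edit tool.

```python
def choose_change_tool(file_text: str, anchor: str) -> str:
    """Pick the change tool based on how often the anchor text occurs."""
    count = file_text.count(anchor)
    if count == 0:
        # Nothing to anchor on: the file needs to be read again first.
        raise ValueError("anchor not found in file")
    if count == 1:
        return "Edit"          # unique anchor, targeted change
    return "Read + Write"      # repeated text, cannot be anchored as given


config = "timeout = 30\nretries = 3\n"
print(choose_change_tool(config, "timeout = 30"))

repeated = "x = 0\nx = 0\n"
print(choose_change_tool(repeated, "x = 0"))
```

In practice a repeated anchor can sometimes be made unique by including more surrounding text, so the fallback applies only when that is not possible.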

How wrong choices are disguised

Evaluate-level distractors are rarely absurd; they are valid-looking tools applied to the wrong sub-question, and knowing their disguises is half the defence. The most common is the content-path swap, offering Glob for something that lives inside files, or Grep for a pure file-discovery task, dressed in language that sounds reasonable because both tools search. Another is the over-reader: a plan that reads an entire module or tree to be thorough, which looks careful but spends the context budget the incremental pattern protects. A third is the heavy-hammer change, reaching for Write to rewrite whole files when a targeted Edit would do, justified by an appeal to consistency.

Each disguise is defeated by the same two tests you score every plan against: does it cover every requirement, and does it use the lightest correct tool for each. The content-path swap fails coverage, because the wrong tool misses the real targets. The over-reader and the heavy hammer fail cost, because a lighter tool reaches the same result with less waste. Running both tests on every option turns a set of plausible answers into a ranked one, which is precisely what an evaluate-level question is asking you to produce.

How this shows up on the exam

This knowledge point is the capstone of Task Statement 2.5, and it appears as the hardest built-in tool questions in Scenario 4, Developer Productivity with Claude. Rather than asking what a tool does, these items give you a realistic goal and several complete approaches, and ask you to evaluate and justify the best. The official Anthropic Academy lesson on the text edit tool reinforces one half of the rubric: when a targeted edit is the right instrument versus a full rewrite. The exam expects you to apply that same comparative judgement across Grep, Glob, Edit, and Read plus Write. Decompose the goal, route each piece through the rubric, and rank the plans by coverage and cost.

Check your understanding

A scenario asks you to evaluate approaches for locating every place an error message string 'Payment declined' is produced, then changing the wording. Which approach should you select and justify?

