- In short
- A PostToolUse hook is a shell command Claude Code runs immediately after a tool finishes, before the model reads the result. It receives the tool response and can reshape it, so heterogeneous formats become one clean, consistent representation the model reasons over.
What a PostToolUse hook does
A PostToolUse hook is a shell command that Claude Code runs the instant a tool finishes, in the narrow window after the tool returns and before that result reaches the model for its next decision. Of all the lifecycle events you can attach to, this is the one built for cleaning up after tools rather than policing them. The handler is handed the tool name, the arguments that were sent (tool_input), and the value the tool produced (tool_response), and it gets to decide what representation the model finally sees.
That timing is the whole point. By the time a PostToolUse hook runs, the action is done and cannot be undone, so this event is the wrong place to enforce a rule. What it is excellent at is shaping data. When three different tools each answer a question in their own dialect, a normalisation step here collapses those dialects into one, and every downstream reasoning turn inherits the cleaner input.
- PostToolUse hook
- A lifecycle handler that executes after a tool completes successfully. It receives tool_input and tool_response and can transform the result before the model processes it, but it cannot prevent the tool from having run.
What the hook receives on standard input
When Claude Code triggers the event, it streams a small JSON payload to your command on standard input, and reading it well is most of the job. The payload names the tool that just ran, echoes the tool_input arguments it was called with, and carries the tool_response the tool produced. Having both the request and the result in hand matters, because you can normalise differently depending on what was asked: you might reformat a date column only when the query actually selected one, or annotate a result only when a particular argument was present.
Your command answers in kind. It can return structured output that hands the model a cleaned, canonical view of the result through fields such as additionalContext, and it runs once per matched call rather than once per session, so the normalisation applies uniformly to every result that tool ever returns. Because the contract is just JSON in and a decision out, the hook stays a small, ordinary program you can reason about in isolation from the agent around it. That isolation is what lets you change the normaliser confidently without worrying about side effects elsewhere in the loop.
Why heterogeneous tool output is a problem
Real agents call many tools, and those tools rarely agree on format. One service returns a Unix epoch like 1717718400; another returns 2024-06-07T00:00:00Z; a third returns the string June 7, 2024. One API encodes success as the integer 200, another as the boolean true, another as the word ok. None of this is wrong, but it is inconsistent, and inconsistency is exactly what trips up a model that has to compare results across calls.
Left untouched, that variety forces the model to do reconciliation work mid-reasoning: guessing whether 200 and ok mean the same thing, or whether two differently-formatted dates are actually the same moment. That guessing is probabilistic and occasionally wrong. Pushing the cleanup into a PostToolUse hook removes the guessing entirely, because the model never sees the messy versions in the first place.
How normalisation works in the lifecycle
Configure the hook against the tools whose output you want to standardise, and Claude Code invokes your command with the execution details on standard input. Your script parses tool_response, applies your canonical rules, and emits the cleaned representation. From the model's perspective, every one of those tools now speaks the same language: timestamps are ISO 8601, statuses are readable strings, currency is a consistent unit, and empty results have a predictable shape.
This is the same instinct as the broader tool result appending discipline, where you control precisely what goes back into the conversation. The difference is that a PostToolUse hook automates that control deterministically, so it happens identically on every call rather than depending on the model remembering to tidy up. Once you trust the hook, you can stop writing defensive prompt instructions that beg the model to handle inconsistent formats.
Formats worth standardising
A handful of format families account for most of the inconsistency a normaliser has to absorb, and naming them makes the design concrete. Temporal values are the worst offenders: epochs, locale strings, and varied ISO shapes should all collapse to a single ISO 8601 representation in UTC, so the model can order events without guessing whether two differently-written dates are the same moment. Status signals are next: integers, booleans, and bespoke words like ok, succeeded, or 2xx should map onto one small vocabulary, so success means the same token everywhere the model looks.
Beyond those, money and measurements benefit from a fixed unit and precision, so totals are directly comparable without conversion, and empty results deserve a predictable shape rather than sometimes a null, sometimes an empty list, and sometimes a missing field. Deciding these canonical rules once, and encoding them in the handler, gives the model a stable contract: whatever exotic shape a tool emits, the version it reasons over is always the same. That stability is what turns a noisy collection of independent tools into something an agent can treat as one coherent data source.
Normalisation and the wider context budget
There is a second-order benefit that connects this knowledge point to context management later in the exam. Raw tool output is often verbose, full of fields the model does not need, and a normaliser is a natural place to trim as well as reshape. By projecting each tool_response down to the canonical fields that matter, the handler reduces the tokens that re-enter the conversation on every subsequent turn, which keeps a long-running agent leaner and cheaper without the model having to manage that itself.
This is normalisation doing double duty: it makes the data consistent and it makes the data compact. Both effects push in the same direction, towards a model that spends its reasoning on the task rather than on parsing and reconciliation. Keeping that framing in mind helps you argue not just that a PostToolUse handler is correct, but that it is efficient, which is the kind of trade-off Domain 1 expects an architect to weigh.
Why you should not delegate this to the model
The most common exam trap for this knowledge point is expecting the model to normalise inconsistent formats on its own. It often can, which is precisely what makes the trap dangerous: it works in testing and fails intermittently in production when an unusual format slips through. A foundations-level architect is expected to recognise that reliability comes from moving deterministic, mechanical transformations out of the probabilistic model and into code that runs the same way every time.
There is a related cost argument. If the model has to reconcile formats itself, it spends tokens and reasoning budget on plumbing instead of on the actual task. A normalisation hook does that plumbing once, cheaply, in a shell command, and frees the model to reason about substance. This is why the official guidance frames hooks as a way to make certain behaviours happen reliably rather than leaving them to chance.
Worked example
A research agent aggregates incident data from three monitoring tools that each report timestamps and severity differently.
The agent calls pagerduty_incidents, datadog_alerts, and a legacy syslog_query tool to build one incident timeline. PagerDuty returns ISO 8601 timestamps and severity as critical; Datadog returns Unix epochs and severity as the integer 5; the legacy tool returns Jun 7 2024 14:30 and severity as P1. If the model receives all three raw, it has to infer that 5, P1, and critical describe the same urgency, and that three differently-shaped dates can be sorted on one axis.
You attach a PostToolUse hook matched to those three tools. For each tool_response, the script converts every timestamp to ISO 8601 in UTC and maps every severity onto one scale, so 5 and P1 both become critical. The model now receives a uniform stream: identical date shape, identical severity vocabulary. Sorting the timeline and grouping by severity becomes trivial and deterministic.
Notice what the hook did not do. It did not decide whether to fetch the data, and it could not have stopped any of the three calls. They had already run. Its single responsibility was to make the returned data consistent. That clean separation is what keeps the design easy to reason about: enforcement lives earlier in the lifecycle, normalisation lives here, and each tool stays simple.
There is a maintainability payoff too. When a fourth monitoring tool is added next quarter, the agent prompt does not change at all; you extend the normaliser to map the new tool's severity scale and timestamp shape onto the existing canonical ones, and every downstream behaviour keeps working untouched. The model never learns that a new data source appeared, because from its point of view nothing about the shape of the data changed. Centralising format decisions in one tested handler is exactly what makes that kind of change cheap, safe, and invisible to the rest of the agent.
Common misreadings to avoid
Misconception
If I write a detailed enough prompt, Claude will reliably convert mismatched timestamps and status codes itself.
What's actually true
Misconception
A PostToolUse hook can reject a tool result it does not like and stop the tool from taking effect.
What's actually true
How it shows up on the exam
Domain 1, Agentic Architecture and Orchestration, is the most heavily weighted domain at 27% of the exam, and Task Statement 1.5 asks you to apply Agent SDK hooks for tool call interception and data normalisation. Questions on this knowledge point typically describe an agent that behaves erratically because it is fed inconsistent tool output, then ask for the most reliable fix. The strong answer names a PostToolUse hook that normalises formats, and rejects the option that leans on a cleverer prompt.
A reliable way to choose the right hook under exam pressure is to ask one question: has the action already happened? If you are reacting to a result, you want PostToolUse. If you need to prevent or change an action, you want tool call interception with a PreToolUse hook instead. Keeping that before-versus-after distinction crisp is what separates a confident answer from a guess, and it sets up the hooks versus prompts decision framework you will apply throughout this task statement.
A multi-tool agent reads order data from four MCP servers that each format dates and currency differently. The model occasionally mis-sorts orders and miscompares totals. Which change most reliably fixes this without depending on the model behaving well?
People also ask
What is a PostToolUse hook in Claude Code?
When does a PostToolUse hook run?
Can a PostToolUse hook block a tool call?
Watch and learn
Official Anthropic Academy lessons first, then hand-picked walkthroughs. Videos load only when you press play.
Introducing hooks
Why watch: Introduces hooks intercepting tool execution, the PostToolUse mechanism this KP uses for deterministic behaviour.
More videos for this concept
References & primary sources
Master this concept with Archie
Practice it inside an adaptive study session. Archie, your Socratic AI tutor, tracks your mastery with Bayesian Knowledge Tracing and schedules the perfect next review.