In short
Error response scenario analysis is the evaluative skill of taking a described tool failure, determining its category and whether it is retryable, and selecting the correct recovery and customer communication. It is where the isError flag, the four categories, structured metadata, and the access-versus-empty distinction converge into one judgement.
What error response scenario analysis asks of you
Error response scenario analysis is the capstone skill of Task Statement 2.2: given a described failure in a realistic setting, evaluate it and choose the single correct response. It is graded at the evaluate level because there is rarely a fact to recall; instead you weigh the situation, decide what kind of failure it is, judge whether retrying could help, and select both a recovery strategy and the message the user should hear. Every earlier knowledge point in this task statement feeds into this one judgement.
The reason error response scenario analysis earns its own knowledge point is that the building blocks are easy to state and easy to misapply under pressure. Knowing the four categories does not guarantee you will spot that a "customer not found" during an outage is really an access failure, or that a refund refusal is a business error rather than something to retry. The skill is the disciplined application of the parts to a concrete case.
A decision procedure you can run every time
The most reliable way to handle these items is to run the same short procedure rather than reacting to surface wording. First, ask what actually failed: did the tool reach its data source at all? If not, you may be looking at an access failure masquerading as an empty result. Second, classify the failure into one of the four categories. Third, read off retryability from the category: transient is retryable, validation is retryable only after correcting the input, business and permission are not. Fourth, choose the recovery the category implies. Fifth, decide what the user should be told.
Running this procedure turns a tricky judgement into a checklist. The wrong answers in scenario questions are usually plausible reactions (retry, escalate, apologise) that are correct for some category but not the one in front of you. Walking the steps keeps you from grabbing the recovery that merely feels right.
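The five steps above can be sketched as a small lookup. This is a minimal illustration, not an official mapping: the `Category`, `Plan`, and `analyse` names and the recovery wording are hypothetical, chosen only to mirror the procedure as described here.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    TRANSIENT = "transient"
    VALIDATION = "validation"
    BUSINESS = "business"
    PERMISSION = "permission"


@dataclass(frozen=True)
class Plan:
    retry: str      # "yes", "after_fixing_input", or "no"
    recovery: str   # what the agent does next
    tell_user: str  # the gist of the user-facing message


# Steps 3 to 5 as a lookup. The wording is illustrative, not an official mapping.
PLANS = {
    Category.TRANSIENT: Plan("yes", "retry with backoff", "say there is a brief delay"),
    Category.VALIDATION: Plan("after_fixing_input", "correct the input, then retry", "ask for the corrected detail"),
    Category.BUSINESS: Plan("no", "route to an alternative workflow", "explain the rule and the next step"),
    Category.PERMISSION: Plan("no", "escalate for access or verification", "explain what approval is missing"),
}


def analyse(source_reached: bool, category: Category) -> tuple[bool, Plan]:
    """Return (disguised_access_failure, plan) for one described failure."""
    # Step 1: if the tool never reached its source, an "empty" result cannot be trusted.
    disguised_access_failure = not source_reached
    # Steps 2 to 5: the category fixes retryability, recovery, and message.
    return disguised_access_failure, PLANS[category]
```

Keeping the table separate from the access check reflects the order of the procedure: the question of whether the source was reached is asked before, and independently of, the category lookup.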
Communication is part of the answer
What distinguishes this knowledge point from pure classification is that the message matters as much as the mechanism. For non-retryable failures especially, the agent owes the user a clear, customer-friendly explanation. Anthropic's guidance to write instructive error text applies here at the human layer: a refund that policy forbids should produce "Refunds over $500 need a supervisor's approval, and I have flagged this for one," not a silent retry loop or an opaque "request failed."
Good communication also prevents a subtler failure: implying that retrying might help when it cannot. Telling a customer "let me try that again" on a business error sets a false expectation and erodes trust when the same refusal returns. Scenario analysis is judged partly on whether the chosen response tells the user the truth about what is and is not possible.
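One way to make this concrete is a small check that flags a draft message which hints at another attempt when the error is not retryable. This is only a sketch: the `RETRY_PHRASES` list and the `message_is_honest` helper are hypothetical, and a simple phrase match is a crude stand-in for real review of the wording.

```python
# Hypothetical phrases that suggest another attempt is coming.
RETRY_PHRASES = ("try again", "try that again", "retry", "one more time")


def message_is_honest(message: str, is_retryable: bool) -> bool:
    """Return False when a non-retryable failure is described as if retrying could help."""
    if is_retryable:
        return True  # Promising another attempt is truthful here.
    lowered = message.lower()
    return not any(phrase in lowered for phrase in RETRY_PHRASES)


draft_bad = "Let me try that again for you."
draft_good = "Refunds over $500 need a supervisor's approval, and I have flagged this for one."

print(message_is_honest(draft_bad, is_retryable=False))
print(message_is_honest(draft_good, is_retryable=False))
```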
Where the earlier knowledge points converge
This page is deliberately the meeting point of the whole task statement. The isError flag is what made the failure visible in the first place. The four categories are the vocabulary you classify with. Structured metadata (errorCategory, isRetryable, and a description) is the form that makes the classification reliable rather than guessed. The access-failure-versus-empty-result distinction is the trap you check for before you even classify. Scenario analysis is simply using all of them together, fast and correctly, on a case you have not seen before.
That convergence is also why the exam saves the hardest items for here. A single stem can require you to notice a disguised access failure, classify the underlying error, reject an inviting retry, and pick a customer message: four decisions that each map back to a different prerequisite knowledge point.
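To see how the flag and the metadata fit together, here is a minimal sketch of a tool result that carries both. It assumes the metadata travels as JSON inside a text content block alongside an isError flag; that layout and the `error_result` and `read_error` helpers are illustrative assumptions, not a required format.

```python
import json
from typing import Optional


def error_result(category: str, retryable: bool, description: str) -> dict:
    """Build a failed tool result whose text block holds the structured metadata."""
    metadata = {
        "errorCategory": category,
        "isRetryable": retryable,
        "description": description,
    }
    return {"isError": True, "content": [{"type": "text", "text": json.dumps(metadata)}]}


def read_error(result: dict) -> Optional[dict]:
    """Return the metadata for a failed result, or None when the call succeeded."""
    if not result.get("isError"):
        return None
    return json.loads(result["content"][0]["text"])


result = error_result("business", False, "Refunds above $500 require supervisor approval.")
print(read_error(result)["errorCategory"])
```

With the category and retryability stated explicitly, the reader of the result classifies from data instead of guessing from the description text.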
Worked example
A support agent is handling a customer who wants a $1,200 refund. The agent calls issue_refund, which returns a failure: errorCategory business, isRetryable false, description 'Refunds above $500 require supervisor approval.'
Walk the procedure. First, did the tool reach its data source? Yes, the refund service ran and applied a rule, so this is not a disguised access failure. Second, classify it: the metadata says business, and the description confirms a policy deliberately refused the amount. Third, retryability: isRetryable is false, and that is consistent with a business error, so retrying the identical $1,200 refund is off the table. Fourth, recovery: business errors need an alternative workflow, so the agent routes the request to a supervisor who can approve refunds over the limit. Fifth, communication: the agent tells the customer plainly, "Refunds above $500 need a supervisor's sign-off. I have sent this to one and they will follow up shortly."
Now consider the tempting wrong moves the procedure rules out. Retrying the refund would just re-trigger the same rule. Apologising and dropping it would abandon a customer who has a legitimate path forward. Escalating as if it were a permission problem would misroute it to access provisioning rather than refund approval. Only the business-error path (reroute plus an honest, specific message) survives the five questions. That is error response scenario analysis: not picking a reaction, but reasoning to the one response the situation actually warrants.
Common misreadings to avoid
Misconception
When in doubt on a scenario question, choosing 'retry the call' is the safe default.
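What's actually true
Retrying is the correct recovery only for transient failures, and for validation failures once the input has been corrected. Business and permission errors return the same refusal however many times the call is repeated, so a default retry is wrong for them.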
Misconception
Once you have chosen the right recovery action, the wording of the user-facing message is a separate, lower-stakes concern.
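What's actually true
The response is judged as a whole. A correct recovery paired with a message that misleads the user, for example by implying that another attempt might help, is still a wrong answer.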
Read the scenario for the cause, not the symptom
The most common way to miss these items is to react to the symptom in the stem rather than diagnosing the cause beneath it. "The call failed" is a symptom shared by every category; "the agent retried and escalated" is a behaviour that can be right or wrong depending on what failed. The evaluative move is to look past what happened on the surface and ask why it happened: was a service briefly down, was the input malformed, did a rule refuse the request, was access missing, or did the tool never reach its source at all?
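The last of those causes, a tool that never reached its source, is the easiest to hide behind a symptom. A minimal sketch of a lookup wrapper that keeps the two outcomes apart follows. The `lookup_customer` name and the injected `fetch` callable are hypothetical, and labelling an unreachable store as transient is an assumption made for the example.

```python
from typing import Callable


def lookup_customer(fetch: Callable[[str], list], customer_id: str) -> dict:
    """Report an unreachable source as an error, never as an empty result."""
    try:
        records = fetch(customer_id)
    except ConnectionError as exc:
        # The source was never reached: this is an access failure, not "no records".
        return {
            "isError": True,
            "errorCategory": "transient",  # assumption: an outage is treated as transient
            "isRetryable": True,
            "description": f"Customer store unreachable: {exc}",
        }
    # The source answered; an empty list here is a genuine no-match.
    return {"isError": False, "records": records}
```

A wrapper that instead caught the exception and returned an empty list would produce exactly the disguised "customer not found" the scenario questions are built around.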
Anchoring on cause also immunises you against emotionally loaded phrasing. A stem may stress that a customer is "frustrated" or that a deadline is "urgent," nudging you toward a hasty retry or escalation. Urgency does not change a failure's category. A business error is still non-retryable when the customer is impatient; an access failure still needs an honest message when the clock is ticking. Evaluate the cause coldly first, then let empathy shape the wording of the response, not the choice of recovery.
Classify by type, and capture the request ID for escalation
Two practical details separate a clean scenario answer from a sloppy one. The first is that the canonical classifier is the error type, not the raw HTTP status. Anthropic's API returns errors as JSON with an error.type field, and that string (rate_limit_error, permission_error, overloaded_error, and so on) is what you actually reason from. Several stems lean on this: a status code alone can be ambiguous, but the type tells you whether you are looking at a transient overload to back off from or a permission wall to escalate. Read the type, then decide the recovery.
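Reading the type before deciding can be sketched as follows. The body shape, with the type nested under an `error` object, matches the error.type field described above; the `CATEGORY_BY_TYPE` table and the `classify` helper are hypothetical, and the table covers only the three types named in this section.

```python
import json

# Illustrative mapping from error.type to the categories used on this page.
CATEGORY_BY_TYPE = {
    "rate_limit_error": "transient",
    "overloaded_error": "transient",
    "permission_error": "permission",
}


def classify(raw_body: str) -> str:
    """Classify from error.type in the JSON body rather than from the HTTP status."""
    body = json.loads(raw_body)
    error_type = body.get("error", {}).get("type", "")
    # Anything outside the table is left for a human or a fuller mapping to decide.
    return CATEGORY_BY_TYPE.get(error_type, "unclassified")


sample = '{"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}'
print(classify(sample))
```

Returning "unclassified" for unknown types, instead of guessing, keeps the recovery decision from resting on a category that was never established.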
The second detail belongs to the recovery half of the answer, and it matters most when escalation is the right move. When a scenario routes a failure to a human or a support queue (an internal 500 api_error, a persistent 529 overloaded_error, or a billing block the agent cannot clear), the response should capture the request identifier so the failure can be traced afterwards. Anthropic returns a request_id in the error body and a request-id header on every response; on Claude running on AWS, responses also carry an x-amzn-requestid. An escalation that names the request ID is actionable, while one that says only "it failed" forces whoever picks it up to start from nothing.
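A minimal sketch of carrying the identifier into an escalation is shown below. It assumes the error body and response headers are already available as dictionaries; the `escalation_note` helper is hypothetical, and `req_123` in the usage line is a placeholder, not a real identifier.

```python
def escalation_note(summary: str, body: dict, headers: dict) -> str:
    """Compose an escalation line that names the error type and request ID."""
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    request_id = (
        body.get("request_id")
        or lowered.get("request-id")
        or lowered.get("x-amzn-requestid")
        or "unknown"
    )
    error_type = body.get("error", {}).get("type", "unknown")
    return f"{summary} (error type: {error_type}, request ID: {request_id})"


body = {"type": "error", "error": {"type": "api_error"}, "request_id": "req_123"}
print(escalation_note("Refund lookup failed", body, {}))
```

Falling back from the body to the headers means the note still carries an identifier when the body could not be parsed or did not include one.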
On the exam this surfaces as the gap between a response that merely picks the right action and one that also closes the loop. The strongest answer classifies by type, respects retryability, routes to the correct workflow, and, when it hands the problem onward, carries the identifier that lets the next responder resume exactly where the agent left off.
The distractors are recoveries for the wrong category
A structural insight that makes these questions far easier: in a well-built scenario item, the wrong options are usually correct recoveries for the wrong category. One distractor retries (right for transient, wrong here), another escalates as permission (right for permission, wrong here), another reports an empty result (right for a true no-match, wrong here). Each is a plausible action lifted from a different branch of the decision procedure. Recognising this turns the question into a matching exercise: identify the actual category, then find the option whose recovery matches it.
This is why running the procedure beats pattern-matching on keywords. If you classify first and only then scan the options, the distractors lose their pull, because you already know which category's recovery you are looking for. Skip the classification and every option looks individually reasonable, which is exactly the confusion the item is engineered to create.
Edge cases the capstone loves
Because this is the evaluate-level apex of the task statement, it gravitates toward the trickiest combinations. Expect the disguised access failure (a tool reporting "no records" during an outage), where the correct first move is to recognise that the source was never reached, before any classification of the underlying error. Expect the exhausted-retry case, where a genuinely transient failure has already consumed its retry budget and the right answer has shifted from "retry" to "escalate," even though the category is still transient. Expect the ambiguous 429, where you must judge whether it is a momentary spike (transient) or an exhausted quota (effectively a limit that retrying will not clear).
Each of these rewards the same habit: do not stop at naming the category; also check whether retryability still applies in this state. A transient failure that has burned its budget no longer warrants another attempt; an access failure must be unmasked before it can be classified at all. The capstone is testing whether you can hold several prerequisite ideas at once and apply them to one unfamiliar case without dropping any of them.
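The exhausted-retry case can be sketched as a check on state as well as category. The `next_move` helper and the default budget of three attempts are hypothetical values chosen for illustration.

```python
def next_move(is_retryable: bool, attempts_made: int, max_attempts: int = 3) -> str:
    """Decide the next step from retryability and the remaining retry budget."""
    if not is_retryable:
        return "do_not_retry"
    if attempts_made >= max_attempts:
        # Still a retryable category, but the budget is spent: hand it onward.
        return "escalate"
    return "retry"


print(next_move(is_retryable=True, attempts_made=1))
print(next_move(is_retryable=True, attempts_made=3))
print(next_move(is_retryable=False, attempts_made=0))
```

The category never changes between the first and second call; only the attempt count does, which is why the answer shifts from retrying to escalating.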
Tying recovery to communication
Finally, remember that at this level the response is judged as a whole: recovery and message together. Choosing the correct action but pairing it with a misleading message is still a wrong answer. A non-retryable failure must be communicated as such: clearly, specifically, and without implying that another attempt might help. The best responses name what cannot be done, state what will happen instead, and do so in language a customer can act on.
That coupling is the difference between a technically correct agent and a trustworthy one. The exam, like a real support desk, rewards the response that both does the right thing and tells the user the truth about it. Treat the message as part of the answer, not a flourish added after the real decision is made.
How this is tested
These are the toughest items in Domain 2 because they combine several decisions into one. A stem will describe a failure with enough texture to tempt at least two plausible recoveries, and your task is to evaluate which is actually correct for the category in play. Expect business-error scenarios that bait a retry, access failures dressed as empty results, and permission errors that look transient. The scoring rewards the response that classifies correctly, respects retryability, routes to the right workflow, and communicates honestly, which is the same five-step procedure applied to an unfamiliar case.
A customer asks an agent to change the email on an account. The agent calls update_email, which returns: errorCategory permission, isRetryable false, description 'This action requires account-owner verification, which has not been completed.' The customer is impatient. What is the best response?
Watch and learn
Claude Agent SDK [Full Workshop]: Thariq Shihipar, Anthropic
Why watch: Demonstrates how tool errors are passed back to Claude and how to handle them differently per situation, the practical skill behind choosing the right recovery strategy for a given error scenario.