- In short
- Retry with error feedback is a self-correction pattern: when a structured extraction fails validation, you send Claude three things together, the original source document, the specific output it produced, and the exact validation error, and ask it to try again. Supplying the concrete error is what makes the retry productive, because the model can target the field that broke instead of guessing blindly.
What the retry with error feedback pattern is
Retry with error feedback is the disciplined way to recover from a structured extraction that fails validation. Instead of throwing the result away and calling the model again with the same prompt, you assemble a corrective follow-up turn that carries the evidence of what went wrong. Claude reads that evidence, locates the field that broke, and produces a corrected extraction. The pattern turns a one-shot extraction into a short self-correction conversation.
This sits inside Task Statement 4.4 of the Claude Certified Architect exam, which covers validation, retry, and feedback loops for extraction quality. It assumes you already enforce structured output with tool use and JSON schemas, because a schema is what produces the precise, machine-readable validation error that the retry depends on.
- Retry with error feedback
- A recovery pattern in which a failed structured extraction is sent back to the model alongside the original source document and the exact validation error, so the model can self-correct the specific failure on its next attempt.
Why a naive retry usually fails
The instinct when an extraction comes back malformed is to simply call the API again. The problem is that nothing has changed. The prompt is identical, the document is identical, and the model is, for practical purposes, walking into the same situation that produced the failure the first time. Re-rolling the dice occasionally lands on a valid answer, but it is unreliable and wasteful, and on a difficult field it can loop indefinitely without converging.
A retry becomes productive only when the second attempt has information the first attempt lacked. That information is the error itself. When you tell the model exactly which field violated the schema, which value was the wrong type, or which required key was missing, you hand it a precise target. The generation is no longer a fresh guess; it is a correction anchored to a concrete defect.
The three things you send back
The pattern is defined by what the corrective turn contains. Omit any of the three and the retry weakens.
The original source document
The model needs the underlying facts in context to extract them correctly. If you send only the error and the bad output, Claude has nothing authoritative to re-read, so it may invent a plausible-looking value rather than recover the real one. Keeping the source document in the retry turn grounds the correction in the actual text.
The failed extraction
Showing the model its own previous output gives it a starting point to revise rather than a blank page. It can see the structure it nearly got right and change only the part that broke. This is why a focused retry is usually faster to converge than a full regeneration: most of the work was already correct.
The specific validation error
This is the active ingredient. A vague instruction like "the output was invalid, try again" carries almost no signal. A precise message, that invoice_total was returned as a string when the schema requires a number, or that the required currency field was absent, tells the model precisely what to fix. The more exact the error, the more surgical the correction.
How the retry with error feedback loop works
In practice the pattern is a tight loop wrapped around your validation step. You extract, validate, and on failure construct a corrective message before calling the model again. Anthropic describes the general shape of this in its evaluator-optimizer workflow, where one step generates a result and a second step evaluates it and feeds the critique back for refinement. Here, your schema validator plays the evaluator role and the error message is the feedback.
The important detail is that the loop is closed by your code, not by the model. Claude does not know its output failed until you tell it. Your application owns the validator, owns the construction of the corrective message, and owns the decision to call again. The model supplies only the corrected attempt.
Carrying the error back as a tool_result
Because this pattern builds on tool use with JSON schemas, the corrective turn usually rides inside the tool-use conversation rather than as a free-floating new prompt. When Claude's extraction arrives as a tool_use block and your validator rejects it, you reply with a user message containing a tool_result block. That block references the original tool_use_id, places the failure description in its content, and sets is_error to true. Anthropic's tool-use documentation requires the tool_result block to immediately follow its matching tool_use block and to appear first in the content array, so the model reads the failure in exactly the slot where it expects a result. This is not an exotic add-on to the API; it is the documented path for signalling that an attempt did not work, and Anthropic notes that for invalid tool inputs Claude will retry two to three times with corrections before giving up. The error-feedback pattern simply makes that built-in self-correction deliberate and schema-driven instead of incidental.
Writing an error message Claude can act on
The success rate of the loop rides almost entirely on the quality of the message you put in that error field. Anthropic's guidance is explicit: write instructive error messages, not generic ones. The docs contrast a useless "failed" with a message that states what went wrong and what to try next, because the specific text is what gives the model the context to recover without guessing. Translate your validator's complaint into a precise, human-readable sentence that names the field, the constraint it violated, and the shape that would satisfy it. currency must be a three-letter ISO code; received "US dollars" is actionable; invalid output is not. The lever that moves accuracy is the precision of this message, not the number of times you are willing to retry. An architect who understands that will tune the error text first and the retry count last.
Validation retries versus transport retries
Not every retry against the Claude API is the same kind of retry, and conflating them is a common architectural mistake. The pattern on this page is a content or validation retry: the previous call returned a well-formed response that was wrong about the data, so the corrective turn must add new information, the specific validation error, for the next attempt to improve. A transport retry is a different animal. It handles infrastructure failures such as a 429 rate-limit response, an overloaded 529, or a dropped connection, where the request never produced a usable answer at all. Transport retries are resolved by waiting and reissuing the identical request under exponential backoff, and they deliberately add no error feedback, because there is nothing about the content to correct.
A third case sits between the two and deserves to be separated out. When a response is cut off because it hit max_tokens, the model was truncated rather than logically wrong, so the right move is to reissue with a higher max_tokens instead of sending a validation critique. Anthropic's stop-reason guidance is explicit that you should inspect the stop_reason before deciding how to react, because a normal stop is not an API failure and should not be retried blindly. Reading the stop reason first tells you which of the three retries you are actually in: feed back the validation error, back off and reissue unchanged, or raise the token ceiling. Only the first of those is the error-feedback pattern; mistaking a truncation or a rate limit for a content error is how engineers end up sending corrective turns that do nothing.
Where this fits on the exam
Domain 4 is weighted at 20% of the exam, and validation-and-retry questions cluster around the structured data extraction scenario. The exam rarely asks you to recite the pattern; it presents a developer whose retries keep failing and asks you to diagnose why. The answer is almost always that the corrective turn is missing the specific error, so each attempt is effectively a fresh blind guess. Recognising that the error message is the thing that makes a retry work, and that the document and prior output must travel with it, is the assessable insight here.
This knowledge point sits at Bloom's understand level, and the exam respects that boundary. You are asked to explain the mechanism, why feedback turns a futile retry into a productive one, rather than to design a complete loop or to judge when a retry is hopeless. Those are deliberately separated into the apply and analyse knowledge points layered on top: designing the validation loop is an application task, and knowing the retry effectiveness boundary is an analysis task. Keeping straight what 4.4.1 actually claims, feedback makes correction possible, stops you from over-reaching into those neighbouring questions and picking an answer that is true but out of scope.
This knowledge point is deliberately the root of its cluster. Once you understand productive retries, you can reason about when a retry can and cannot help, about cross-validation that catches errors a schema cannot, and about designing the complete loop with sane termination.
Worked example
A finance team extracts invoices with a tool_use schema. One invoice comes back with invoice_total set to the string '1,240.00' when the schema requires a number, so validation rejects it.
A blind retry would resend the same prompt and the same invoice image, and Claude would likely return the same comma-formatted string, because nothing told it the format was the problem.
Instead, you build a corrective turn. You include the original invoice so the figures are still in context, the failed extraction object so the model can see what it produced, and the exact validator message: "invoice_total must be a number; received a string with a thousands separator." Now the model has a precise target. It re-reads the invoice, strips the separator, and returns 1240.00 as a numeric value that passes the schema on the second attempt.
Notice what your code did between calls. It ran the validator, identified the offending field, wrote a human-readable explanation of the violation, and packaged all three inputs into the next message. The model never saw the failure on its own. Your loop surfaced it. That division of labour, where your application detects and explains and the model corrects, is the entire pattern.
Worked example
The same finance pipeline extracts a payment_status field that the schema constrains, via an enum, to exactly one of paid, unpaid, or overdue. For one invoice Claude returns 'Past Due', semantically right, but it fails the enum check.
A blind retry would resend the identical prompt and very likely return another near-synonym, "Late", or "Overdue Balance", none of which match the allowed set, so the loop would stall without converging.
The corrective turn instead carries the invoice, the rejected object, and a precise error: payment_status must be exactly one of paid, unpaid, or overdue; received "Past Due". With the legal values spelled out in the error, Claude maps its own interpretation onto the closest allowed token and returns "overdue", which passes on the next attempt. The correction worked because the message named the exact constraint the value had to satisfy, not merely that something was wrong.
This is a structural, encoding-level miss: the information was present in the document and the model understood it correctly, but it serialised the answer in a form the schema would not accept. Surfacing the permitted values is what closed the gap, and it illustrates why the enum constraint and the error text have to be carried together for the retry to land.
Edge cases and design details
A few practical details separate a retry loop that works from one that merely looks like it should.
- Bound the number of attempts. The loop has to terminate, so you cap the retries rather than spinning on a stubborn document forever. The full design of sane termination conditions belongs to validation loop design; for this knowledge point, the assessable fact is just that the loop is yours to close, because the API will never stop it for you.
- Treat a repeated identical error as a signal. If the same error survives a well-formed corrective turn, that is evidence the fix may be out of reach, the value could be genuinely absent from the source rather than mis-formatted. Distinguishing repairable failures from hopeless ones is the job of the retry effectiveness boundary, the analyse-level knowledge point directly above this one.
- Send the rejected object, not just the error. Giving Claude its prior attempt lets it revise a structure that was nearly correct instead of regenerating from a blank page. That converges faster and changes fewer fields, which matters when only one field out of a dozen actually broke.
- Keep the validator's message machine-precise. Schema tools such as JSON Schema and Pydantic already emit field-level errors that name the path and the violated rule. Surface those verbatim, or lightly rephrased, rather than flattening everything into a single "validation failed", the detail you discard is exactly the detail the model needs.
Common misconceptions to avoid
Misconception
If an extraction fails validation, I should just call the model again with the same prompt until it succeeds.
What's actually true
Misconception
It is enough to tell the model the output was invalid and ask it to try again.
What's actually true
Misconception
To save tokens on the retry, I can drop the original document and just resend the bad output and the error.
What's actually true
How it shows up on the exam
Expect a scenario where structured extraction is already working most of the time, but a developer is frustrated that their retry logic does not improve the failure cases. The distractors will offer tempting but wrong levers: raising the temperature, increasing max_tokens, or adding more few-shot examples. The correct answer ties the failure to the missing feedback: the retry resends nothing new, so the model cannot learn from the previous attempt. Keep your eye on what changes between the first call and the second.
A developer extracts structured data with a tool_use schema. When an extraction fails validation, their code immediately re-calls Claude with the identical prompt and document, but the same field keeps failing. What is the most effective change?
People also ask
How do you make an LLM correct its own mistakes?
What should you include in a retry prompt?
Is retrying an LLM call better than re-prompting from scratch?
Watch and learn
Official Anthropic Academy lessons first, then hand-picked walkthroughs. Videos load only when you press play.
How to Master Error Handling in Claude APIs
Why watch: Covers handling and feeding back Claude API errors, directly modelling the pattern of returning a specific validation error so the model can self-correct on retry.
More videos for this concept
References & primary sources
- Anthropic Docs: Reduce hallucinations (iterative refinement)Primary source
- Anthropic Engineering: Building effective agents (evaluator-optimizer)Docs
- Anthropic Docs: Tool use overviewDocs
- Anthropic Docs: Handle tool calls (errors with is_error)Docs
- Anthropic Docs: Handling stop reasons (max_tokens, stop_reason)Primary source
Master this concept with Archie
Practice it inside an adaptive study session. Archie, your Socratic AI tutor, tracks your mastery with Bayesian Knowledge Tracing and schedules the perfect next review.