Concept deep dive·10 min read·28 June 2026

Claude Van Damme: XML Patterns for Strict Structured Output

Master the "claude van damme" XML-tag approach to enforce strict structured output in Claude agents, separating reasoning from final answers without hallucinated keys.

By Solomon Udoh · AI Architect & Certification Lead

Claude Van Damme: XML Patterns for Strict Structured Output

The phrase "claude van damme" has become shorthand in the CCA-F community for a specific discipline: hitting your structured output target with the precision of a roundhouse kick, every single time. No hallucinated keys, no stray prose wrapping your JSON, no reasoning leaking into your answer field. This post is a practical engineering guide to that discipline, mapped directly to Domain 4 (Prompt Engineering & Structured Output, 20% of the exam) and Domain 1 (Agentic Architecture & Orchestration, 27%).

We will cover XML-tag separation patterns, few-shot example structure, anti-laziness prompting, explicit uncertainty allowances, and verification loops. Every technique here is testable in a real Claude agent today.

Why does structured output fail in Claude agents?

Structured output fails for three distinct reasons, and conflating them leads to the wrong fix.

Schema confusion. The model does not know which keys are required, which are optional, and what types they expect. It guesses, and guesses wrong.
Reasoning bleed. The model's chain-of-thought leaks into the output field, wrapping your JSON in prose or adding explanatory text after the closing brace.
Hallucination under uncertainty. When the model does not know a value, it invents one rather than emitting a permitted null or an explicit "I don't know" sentinel.

Each failure mode has a different root cause and a different fix. Treating all three as a single "prompt harder" problem is a narrow decomposition failure that the exam tests directly.

What is the XML-tag separation pattern and why does it work?

The core technique is to give the model two explicitly named regions: a scratchpad for reasoning and a fenced output block for the final answer. Claude's training makes it highly responsive to XML-style delimiters, so the separation is reliable when the tags are consistent and the system prompt defines them unambiguously.

A minimal system prompt looks like this:

text

You are a structured-output agent. When responding, always use the following format:

<thinking>
[Your reasoning, analysis, and intermediate steps. This section is discarded before delivery.]
</thinking>

<answer>
[Your final response, which must be valid JSON conforming to the schema below. No prose outside the JSON object.]
</answer>

Schema:
{
  "summary": "string",
  "confidence": "high | medium | low",
  "sources": ["string"],
  "unknown_fields": ["string"]
}

The <thinking> block gives the model a legitimate place to reason without contaminating the output. The <answer> block is then parsed programmatically by stripping everything outside it. This is not a workaround; it is the pattern Anthropic's own documentation recommends for separating chain-of-thought from final output.

Use XML tags to structure your prompts. Claude has been trained to pay close attention to XML tags and will use them to understand the structure of your request.

Anthropic , Claude Documentation (Prompt Engineering Guide)

The exam rewards this approach because it is deterministic: the parser always knows where to look, regardless of what the model wrote in <thinking>.

How do few-shot examples reinforce the XML pattern?

Few-shot examples are the highest-leverage technique for generalising a reasoning pattern across diverse inputs. For structured output, each example must itself be correctly formatted, or the model learns the wrong template.

A well-structured few-shot block for the schema above looks like this:

text

### Example 1

User: Summarise the Q1 revenue report. Sources: [report_2026_q1.pdf]

<thinking>
The report shows revenue of $4.2M, up 12% YoY. Confidence is high because the figure appears in the audited section. No unknown fields.
</thinking>

<answer>
{
  "summary": "Q1 revenue was $4.2M, a 12% year-on-year increase.",
  "confidence": "high",
  "sources": ["report_2026_q1.pdf"],
  "unknown_fields": []
}
</answer>

### Example 2

User: Summarise the market share data. Sources: []

<thinking>
No sources were provided and no market share data is in context. I cannot fabricate a figure. I should emit an unknown_fields entry.
</thinking>

<answer>
{
  "summary": "Insufficient data to summarise market share.",
  "confidence": "low",
  "sources": [],
  "unknown_fields": ["market_share"]
}
</answer>

Notice that Example 2 explicitly models the "I don't know" path. This is critical. Without a demonstrated uncertainty example, the model defaults to hallucination under low-confidence conditions. The unknown_fields array is a schema-level permission to say "I don't know" without breaking the contract.

For prompt engineering and structured output on the CCA-F exam, expect scenario questions that ask which few-shot structure most reliably generalises. The answer is almost always the one that includes a negative or uncertain example alongside the positive case.

How do you prevent reasoning bleed in MCP-integrated workflows?

In MCP-integrated workflows, the model may receive tool results mid-conversation and need to incorporate them before emitting a final answer. This creates a second bleed risk: tool-result prose appearing in the <answer> block.

The fix is a two-stage prompt chain:

Stage 1 (tool invocation): The model reasons in <thinking> and emits tool calls. No <answer> block is expected yet.
Stage 2 (synthesis): After tool results are appended, a fresh user turn instructs the model to produce the <answer> block only, with no further tool calls.

python

# Stage 1: reasoning + tool calls
stage_1_messages = [
    {"role": "user", "content": "Analyse the sales data. Use the fetch_report tool."}
]

# ... run the agentic loop, append tool results ...

# Stage 2: force structured output
stage_2_messages = stage_1_messages + tool_result_turns + [
    {
        "role": "user",
        "content": (
            "Now produce your final <answer> block only. "
            "Do not call any tools. Conform strictly to the schema."
        )
    }
]

This pattern maps directly to fixed sequential pipelines (prompt chaining). The key insight is that the synthesis step is a separate prompt, not a continuation of the tool-use loop. Mixing them is an agentic loop anti-pattern that the exam tests under Domain 1.

What system prompt role assignments improve format adherence?

Role assignment in the system prompt is not decoration. It sets the model's behavioural prior for tone, verbosity, and format strictness. For structured output agents, the role should specify three things: the agent's function, its output contract, and its failure mode.

Role element	Weak version	Strong version
Function	"You are a helpful assistant."	"You are a data extraction agent that returns only valid JSON."
Output contract	"Return JSON when asked."	"Every response must contain exactly one `<answer>` block with valid JSON. No prose outside it."
Failure mode	(omitted)	"If a required field cannot be determined, set it to null and add the field name to `unknown_fields`."

The strong version eliminates ambiguity at the schema level. The model does not need to infer what "valid JSON" means in context; it is told exactly what to do when it does not know a value.

Domain 4 of the CCA-F exam covers this directly. Per Anthropic's exam guide, the domain tests candidates on "prompt engineering techniques that improve output reliability," which includes role assignment, schema specification, and uncertainty handling.

How do you tune anti-laziness prompting without over-triggering tool calls?

"Laziness" in this context means the model producing a truncated or superficial answer to avoid the computational cost of a full response. The naive fix is to add "be thorough" or "do not skip steps" to the system prompt, but this often causes the opposite problem: the model over-triggers tool calls to demonstrate effort, burning tokens and latency.

The calibrated approach has three components:

Specify completeness at the field level, not the response level. Instead of "be thorough," say "the sources array must contain every document referenced in your reasoning."
Set a minimum field count where appropriate. "The summary field must be at least two sentences" is more precise than "do not truncate."
Prohibit tool calls in the synthesis stage explicitly. As shown in the Stage 2 prompt above, "Do not call any tools" prevents effort-signalling via unnecessary tool use.

This is a prompt-based vs programmatic enforcement trade-off. For low-stakes completeness requirements, prompt-level instructions are sufficient. For high-stakes workflows where truncation has real consequences, add a programmatic validator that checks field lengths and required keys before accepting the response.

Which verification patterns catch errors in autonomous agent tasks?

Verification in autonomous agents is not optional; it is the mechanism that converts probabilistic output into reliable production behaviour. The CCA-F exam consistently rewards deterministic solutions over probabilistic ones when stakes are high.

Three verification patterns are worth knowing:

Pattern	How it works	Best for
Schema validation	Parse the `<answer>` block against a JSON schema; reject and retry with error feedback if invalid	Catching missing keys, wrong types
Self-check prompt	After synthesis, send a second prompt: "Does your answer conform to the schema? List any violations."	Catching semantic errors the schema cannot express
Cross-instance review	Route the output to a second Claude instance with only the schema and the output; ask it to flag violations	High-stakes workflows where a single instance may be blind to its own errors

The retry-with-error-feedback pattern is the most broadly applicable. When the schema validator rejects a response, the error message is appended to the conversation and the model is asked to correct only the failing fields:

python

def validated_structured_output(client, messages, schema, max_retries=3):
    for attempt in range(max_retries):
        response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=1024,
            messages=messages
        )
        answer_text = extract_answer_block(response.content[0].text)
        try:
            validated = schema.validate(answer_text)
            return validated
        except ValidationError as e:
            messages.append({
                "role": "assistant",
                "content": response.content[0].text
            })
            messages.append({
                "role": "user",
                "content": (
                    f"Your answer failed schema validation: {e}. "
                    "Correct only the failing fields and re-emit the full <answer> block."
                )
            })
    raise MaxRetriesExceeded("Structured output could not be validated.")

This loop is deterministic in its termination (bounded retries), proportionate in its fix (correct only failing fields), and traces the root cause (the validation error is passed back verbatim). All three properties align with how the exam scores agentic reliability questions.

How does the CCA-F exam weight these techniques?

Domain 4 (Prompt Engineering & Structured Output) carries 20% of the exam weight, and Domain 1 (Agentic Architecture & Orchestration) carries 27%. Together they account for nearly half the exam. The techniques in this post span both domains.

Technique	Primary domain	Exam weight
XML-tag separation	Domain 4	20%
Few-shot example structure	Domain 4	20%
Prompt chaining for synthesis	Domain 1	27%
Role assignment and schema spec	Domain 4	20%
Anti-laziness field-level prompting	Domain 4	20%
Validation loop design	Domain 1	27%

The five exam domains and their weights are published in Anthropic's official exam guide. As of 3 June 2026, more than 10,000 individuals have passed the CCA-F, which launched on 12 March 2026 at $99 per attempt.

Our concept library at /concepts covers 174 atomic concepts mapped to all five domains, including the full context management and reliability domain (15% weight) where validation loops also appear.

Claude is designed to be helpful, harmless, and honest. When you give it a clear schema and explicit permission to express uncertainty, it will use that permission rather than hallucinate.

Anthropic , Claude Documentation (Model Card and Agentic Use)

What does a production-ready structured output system prompt look like?

Pulling all the techniques together, a production system prompt for a structured output agent in an MCP-integrated workflow looks like this:

text

You are a structured-output extraction agent. Your sole function is to extract
information from provided documents and return it as valid JSON.

RULES:
1. Always use <thinking> for reasoning. This block is discarded before delivery.
2. Always use <answer> for your final output. It must contain exactly one valid
   JSON object conforming to the schema below.
3. Do not include any prose outside <thinking> and <answer>.
4. If a required field cannot be determined from the provided context, set it to
   null and add the field name to the unknown_fields array.
5. Do not fabricate values. "I don't know" is expressed via null + unknown_fields.
6. The sources array must list every document you referenced in <thinking>.

SCHEMA:
{
  "summary": "string (minimum 2 sentences)",
  "confidence": "high | medium | low",
  "sources": ["string"],
  "key_figures": [{"label": "string", "value": "string"}],
  "unknown_fields": ["string"]
}

FAILURE MODE:
If you cannot produce a valid JSON object, emit:
<answer>{"error": "extraction_failed", "reason": "string"}</answer>

This prompt is deterministic in its structure, explicit in its failure mode, and schema-complete. It does not rely on the model inferring what "good output" looks like; it specifies it.

For deeper work on how tool descriptions interact with system prompts in MCP workflows, see our guide to writing effective tool descriptions and system prompt and description conflicts.

Frequently asked questions

What does 'claude van damme' mean in the context of AI engineering?

In the CCA-F exam community, 'claude van damme' is informal shorthand for hitting a structured output target with precision every time: no hallucinated keys, no reasoning bleed, no stray prose. It refers to the discipline of enforcing strict JSON or XML output from Claude agents through deliberate prompt design and validation.

How do I extract only the <answer> block from a Claude response programmatically?

Use a simple regex or string parser to find the content between <answer> and </answer> tags. Strip everything outside those delimiters before passing the text to your JSON parser or schema validator. If no <answer> block is found, treat the response as a validation failure and trigger a retry with error feedback.

Should I use JSON mode or XML-tag separation for structured output in Claude?

XML-tag separation is more flexible and works across all Claude models without requiring a specific API parameter. It also allows a <thinking> scratchpad, which improves reasoning quality. JSON mode (where available) is useful for simple schemas but does not separate reasoning from output. For complex agentic workflows, XML-tag separation is the more robust choice.

How many retries should a structured output validation loop attempt before failing?

Three retries is a reasonable default for most production workflows. Beyond three attempts, the failure is likely a schema design problem or a model capability boundary rather than a transient formatting error. Log the full conversation on failure so you can diagnose whether the root cause is a missing example, an ambiguous schema field, or a genuine model limitation.

Does the CCA-F exam include questions on structured output and XML-tag patterns?

Yes. Domain 4 (Prompt Engineering & Structured Output) carries 20% of the CCA-F exam weight. Expect scenario questions on XML-tag separation, few-shot example design, schema specification, and uncertainty handling. Domain 1 (Agentic Architecture & Orchestration, 27%) also covers validation loops and prompt chaining for synthesis steps.

How do I allow Claude to express uncertainty in structured output without breaking the schema?

Add an explicit unknown_fields array to your schema and instruct the model in the system prompt to set unknown values to null and list the field name in unknown_fields. Reinforce this with a few-shot example that demonstrates the uncertain case. This gives the model a schema-legal way to say 'I don't know' rather than hallucinating a value.

Why does structured output fail in Claude agents?

What is the XML-tag separation pattern and why does it work?

How do few-shot examples reinforce the XML pattern?

How do you prevent reasoning bleed in MCP-integrated workflows?

What system prompt role assignments improve format adherence?

How do you tune anti-laziness prompting without over-triggering tool calls?

Which verification patterns catch errors in autonomous agent tasks?

How does the CCA-F exam weight these techniques?

What does a production-ready structured output system prompt look like?

Frequently asked questions

People also ask