Exam guide · 9 min read · 18 June 2026

Anthropic Certification: Master All 5 Domains to Pass

The Anthropic certification (CCA-F) spans 5 weighted domains. Learn which to prioritise, what scenarios appear, and how to reach the 720 passing score.

By Solomon Udoh · AI Architect & Certification Lead

The Anthropic certification that practitioners are chasing in 2026 is the Claude Certified Architect, Foundations exam (CCA-F), launched 12 March 2026 at $99 per attempt. It delivers 60 scenario-based questions scored on a 100-to-1000 scale, with 720 as the passing mark. As of 3 June 2026, more than 10,000 individuals have already cleared it. This guide maps every domain, weights the study time you should allocate, and flags the anti-patterns the exam consistently penalises.

What are the five CCA-F domains and how are they weighted?

The exam blueprint publishes five domains with explicit percentage weights. Those weights should drive your study calendar directly: spend roughly the same proportion of your prep time on each domain as the exam spends questions on it.

Domain	Topic	Weight
1	Agentic Architecture & Orchestration	27%
2	Tool Design & MCP Integration	18%
3	Claude Code Configuration & Workflows	20%
4	Prompt Engineering & Structured Output	20%
5	Context Management & Reliability	15%

Domain 1 is the single heaviest domain at 27%, which means roughly one in four questions will test agentic loop design, multi-agent coordination, and orchestration patterns. Domains 3 and 4 are tied at 20% each, making them collectively the largest block. Domain 5 is the lightest at 15%, but its reliability scenarios carry outsized consequence in production contexts, so do not skip it.

Our Claude Certification Concepts library maps all 174 atomic concepts to these five domains and their 30 task statements, which is a useful cross-reference as you work through each section below.

How does Domain 1 (Agentic Architecture) show up on the exam?

Domain 1 accounts for 27% of the exam and is the area where most candidates lose points. The questions are almost always scenario-based: you are given a broken or underspecified agentic loop and asked to diagnose the failure or choose the correct fix.

The most common failure mode the exam tests is checking for task completion by inspecting model-generated text rather than using structured signals. A loop that reads if "task complete" in response_text is fragile; a loop that inspects stop_reason and parses a typed result object is robust. The exam rewards the latter consistently.

python

# Fragile: text-based termination check
if "task complete" in response.content[0].text.lower():
    break

# Robust: structured stop_reason inspection
# (result is assumed to be a typed object parsed from the model's structured output)
if response.stop_reason == "end_turn" and result.status == "complete":
    break
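
To make the contrast concrete, here is a minimal sketch of a loop that terminates on structured signals only. The names FakeResponse, TaskResult, run_loop and scripted_step are hypothetical and exist only for illustration; the stop_reason values "end_turn" and "tool_use" follow the Messages API, but the client is stubbed so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TaskResult:
    """Hypothetical typed result parsed from the model's structured output."""
    status: str  # e.g. "complete" or "in_progress"


@dataclass
class FakeResponse:
    """Stand-in for an API response; only the fields the loop inspects."""
    stop_reason: str
    result: Optional[TaskResult] = None


def run_loop(step: Callable[[int], FakeResponse], max_turns: int = 10) -> int:
    """Call step() until structured signals say the task is done.

    Returns the number of turns used. Raises if the turn budget runs out,
    so a loop that never completes fails loudly instead of spinning.
    """
    for turn in range(1, max_turns + 1):
        response = step(turn)
        if response.stop_reason == "tool_use":
            continue  # a real loop would execute the tool and append its result
        if (
            response.stop_reason == "end_turn"
            and response.result is not None
            and response.result.status == "complete"
        ):
            return turn
    raise RuntimeError(f"no completion signal after {max_turns} turns")


def scripted_step(turn: int) -> FakeResponse:
    """Scripted stub: two tool calls, then a structured completion."""
    if turn < 3:
        return FakeResponse(stop_reason="tool_use")
    return FakeResponse(stop_reason="end_turn", result=TaskResult("complete"))


turns_used = run_loop(scripted_step)
```

The explicit max_turns budget is a design choice of this sketch rather than something the exam blueprint prescribes; it keeps a loop that never receives a completion signal from running indefinitely.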

Key concepts to master for this domain include Agentic Loop Anti-Patterns, Hub-and-Spoke Architecture, and Parallel Subagent Spawning. The exam also tests when a coordinator should select subagents dynamically versus following a pre-configured route, a distinction covered in Model-Driven vs Pre-Configured Decision Making.

Multi-agent questions frequently involve a coordinator that must pass structured context to subagents without losing attribution. Structured Context Passing and Diagnosing Attribution Loss in Synthesis are both testable task statements in this domain.
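
One way to picture structured context passing is a coordinator that merges subagent findings without ever separating a claim from its source. The sketch below is illustrative only: Finding, SubagentReport, synthesize and unattributed are hypothetical names, and the URLs are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    """A single claim plus the attribution that must survive synthesis."""
    claim: str
    source_agent: str
    source_url: str


@dataclass
class SubagentReport:
    agent: str
    findings: List[Finding] = field(default_factory=list)


def synthesize(reports: List[SubagentReport]) -> List[Dict[str, str]]:
    """Merge all findings, carrying agent and source alongside each claim."""
    merged = []
    for report in reports:
        for finding in report.findings:
            merged.append(
                {
                    "claim": finding.claim,
                    "agent": finding.source_agent,
                    "source": finding.source_url,
                }
            )
    return merged


def unattributed(merged: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return merged entries whose source is missing, for diagnosis."""
    return [entry for entry in merged if not entry["source"]]


reports = [
    SubagentReport(
        agent="pricing-researcher",
        findings=[Finding("Plan A costs less", "pricing-researcher", "https://example.com/a")],
    ),
    SubagentReport(
        agent="reviews-researcher",
        findings=[Finding("Users prefer plan B", "reviews-researcher", "")],
    ),
]
merged = synthesize(reports)
missing = unattributed(merged)
```

Running the check on the merged list, rather than on free text, is what makes attribution loss detectable in the first place.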

The exam consistently rewards deterministic solutions over probabilistic ones when stakes are high, proportionate fixes, and root-cause tracing.

Anthropic, CCA-F Exam Facts

How does Domain 2 (Tool Design & MCP Integration) show up on the exam?

At 18%, Domain 2 is the fourth-heaviest of the five domains, ahead of only Domain 5. Its questions cluster around three themes: writing tool descriptions that route correctly, handling structured errors from MCP servers, and deciding how many tools to assign per agent.

Tool description quality is the lever the exam tests most. A vague description causes the model to misroute calls; a precise, action-oriented description with explicit scope constraints routes correctly. The fix is almost always low-effort and high-leverage: rewrite the description rather than restructure the architecture.

json

{
  "name": "search_customer_orders",
  "description": "Search orders for a specific customer by customer_id. Use ONLY for order lookup. Do NOT use for product catalogue queries or inventory checks.",
  "input_schema": {
    "type": "object",
    "properties": {
      "customer_id": { "type": "string" },
      "date_range_days": { "type": "integer", "default": 30 }
    },
    "required": ["customer_id"]
  }
}

The MCP isError Flag Pattern is a specific testable concept: an MCP server signals a failure inside tool execution by setting isError: true in the tool result, rather than by throwing an exception or returning a protocol-level error (such as the JSON-RPC error for an unknown tool). Candidates who conflate these two error channels consistently choose wrong answers on error-propagation questions.
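
The two channels can be kept apart with a small wrapper. This is a sketch under assumptions: the result is modelled as a plain dict with a content list and an isError flag, and summarize_tool_result and call_tool_safely are hypothetical helper names, not part of any MCP SDK.

```python
from typing import Callable, Dict


def summarize_tool_result(result: Dict) -> Dict:
    """Turn an MCP-style tool result into a status the agent loop can branch on."""
    text = " ".join(
        block.get("text", "")
        for block in result.get("content", [])
        if block.get("type") == "text"
    )
    # Tool-level failure: the call itself succeeded, but the tool reports an error.
    ok = not result.get("isError", False)
    return {"ok": ok, "channel": "tool_result", "detail": text}


def call_tool_safely(call: Callable[[], Dict]) -> Dict:
    """Run a tool call and label which channel any failure came through."""
    try:
        result = call()
    except Exception as exc:  # transport or protocol failure, not a tool result
        return {"ok": False, "channel": "exception", "detail": str(exc)}
    return summarize_tool_result(result)


def failing_tool() -> Dict:
    return {"content": [{"type": "text", "text": "customer_id not found"}], "isError": True}


def broken_transport() -> Dict:
    raise ConnectionError("server unreachable")


tool_error = call_tool_safely(failing_tool)
transport_error = call_tool_safely(broken_transport)
```

A tool-level error can usually be fed back to the model so it can adjust its arguments, whereas a failure on the exception channel tends to call for retry or escalation logic in the host application.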

Tool overload is another recurring scenario. When an agent has access to too many tools, selection quality degrades. The exam tests whether you can identify this as the root cause and apply Tool Splitting for Specificity or constrained tool_choice configuration as the fix.
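
As a rough illustration of both fixes, the sketch below gives each agent role a small tool set and optionally forces a specific tool. The role registry, the helper names and the two extra tool names are hypothetical; the tool_choice shapes ({"type": "auto"} and {"type": "tool", "name": ...}) follow the Messages API.

```python
from typing import Dict, List, Optional


def make_tool(name: str, description: str) -> Dict:
    """Build a minimal tool definition with an empty object schema."""
    return {
        "name": name,
        "description": description,
        "input_schema": {"type": "object", "properties": {}},
    }


# Hypothetical registry: each agent role sees only the tools it needs.
TOOLS_BY_ROLE: Dict[str, List[Dict]] = {
    "orders": [
        make_tool("search_customer_orders", "Search orders for a customer by customer_id."),
    ],
    "catalogue": [
        make_tool("search_product_catalogue", "Search products by keyword."),
        make_tool("check_inventory", "Return stock level for a product_id."),
    ],
}


def build_request(role: str, force_tool: Optional[str] = None) -> Dict:
    """Return the tools and tool_choice fields for one agent's request."""
    tools = TOOLS_BY_ROLE[role]
    if force_tool is None:
        return {"tools": tools, "tool_choice": {"type": "auto"}}
    if force_tool not in {tool["name"] for tool in tools}:
        raise ValueError(f"{force_tool!r} is not available to role {role!r}")
    return {"tools": tools, "tool_choice": {"type": "tool", "name": force_tool}}


orders_request = build_request("orders")
forced_request = build_request("catalogue", force_tool="check_inventory")
```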

How does Domain 3 (Claude Code Configuration) show up on the exam?

Domain 3 carries 20% of the exam and focuses on the three-level CLAUDE.md configuration hierarchy, custom skills, plan mode, and CI/CD integration patterns. Many candidates underestimate this domain because it feels operational rather than architectural. The exam disagrees.

The three configuration levels are: user-level (global defaults), project-level (CLAUDE.md at the repository root), and directory-level (CLAUDE.md files in subdirectories). Each level can override the one above it for path-specific rules. Questions test whether you know which level to modify for a given scope requirement and what the version-control implications are of each choice.

bash

# Project-level CLAUDE.md at repo root
/project-root/CLAUDE.md

# Directory-level override for a specific service
/project-root/services/payments/CLAUDE.md

# User-level config (not version-controlled)
~/.claude/CLAUDE.md
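
For scope questions it can help to list, for a given file, which configuration files sit above it. The helper below is a hypothetical sketch that simply mirrors the three levels described above, ordered from broadest to most specific; it is not Claude Code's actual loading algorithm.

```python
from pathlib import PurePosixPath
from typing import List

USER_LEVEL_CONFIG = "~/.claude/CLAUDE.md"  # user-level, not version-controlled


def candidate_config_paths(repo_root: str, target_file: str) -> List[str]:
    """List CLAUDE.md locations relevant to target_file, broadest first."""
    root = PurePosixPath(repo_root)
    target_dir = PurePosixPath(target_file).parent
    # relative_to raises ValueError if the file is outside the repository
    relative = target_dir.relative_to(root)

    paths = [USER_LEVEL_CONFIG, str(root / "CLAUDE.md")]
    current = root
    for part in relative.parts:
        current = current / part
        paths.append(str(current / "CLAUDE.md"))
    return paths


paths = candidate_config_paths(
    "/project-root", "/project-root/services/payments/api.py"
)
```

Reading the list from the end backwards gives the most specific candidate first, which is usually the right place to look when a rule should apply to one service only.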

Plan mode questions ask when to require explicit user approval before execution. The exam pattern is: high-irreversibility actions (file deletion, deployment, schema migration) warrant plan mode; low-risk read operations do not. The exam penalises both over-use (slowing safe workflows) and under-use (executing destructive actions without review).
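
The pattern reduces to a small decision rule. In this sketch the action names and the function are hypothetical, and sending unknown actions to review is an assumption of the example, not an exam rule.

```python
# Action categories taken from the examples in the text; names are hypothetical.
IRREVERSIBLE_ACTIONS = {"delete_file", "deploy", "schema_migration"}
READ_ONLY_ACTIONS = {"read_file", "list_directory", "search_code"}


def requires_plan_approval(action: str) -> bool:
    """Return True when an action should be gated behind explicit approval."""
    if action in IRREVERSIBLE_ACTIONS:
        return True
    if action in READ_ONLY_ACTIONS:
        return False
    # Assumption: anything unclassified is reviewed until someone classifies it.
    return True


decisions = {
    action: requires_plan_approval(action)
    for action in ("deploy", "read_file", "rename_branch")
}
```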

CI/CD questions typically involve the -p flag for non-interactive execution and structured JSON output piped to downstream steps. Expect at least one scenario where a candidate must choose between interactive and non-interactive modes for a given pipeline stage.
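
A pipeline step along these lines might look like the following sketch. It assumes the claude CLI is installed on the runner and uses the -p and --output-format json flags; the helper names are hypothetical, and the parser deliberately makes no assumption about which fields the JSON object contains.

```python
import json
import subprocess
from typing import Dict, List


def build_command(prompt: str) -> List[str]:
    """Non-interactive invocation with machine-readable output."""
    return ["claude", "-p", prompt, "--output-format", "json"]


def parse_output(stdout: str) -> Dict:
    """Parse the CLI's JSON output, failing loudly on anything unexpected."""
    payload = json.loads(stdout)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object from the CLI")
    return payload


def run_step(prompt: str, timeout_s: int = 300) -> Dict:
    """Run one pipeline step; check=True turns a non-zero exit into an exception."""
    completed = subprocess.run(
        build_command(prompt),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return parse_output(completed.stdout)


command = build_command("Summarise the failing tests")
```

Failing the step on a non-zero exit code or unparsable output keeps a broken run from silently passing malformed data to downstream stages.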

How does Domain 4 (Prompt Engineering & Structured Output) show up on the exam?

Domain 4 is tied with Domain 3 at 20% and covers explicit criteria design, few-shot prompting, JSON schema construction, and validation-retry loops. The exam is particularly focused on when each technique is necessary rather than merely helpful.

Few-shot examples are the highest-leverage technique for ambiguous or edge-case inputs. The exam tests whether you can identify scenarios where zero-shot instructions are insufficient and a small number of well-chosen examples would resolve the ambiguity. Critically, the examples must be representative of the failure cases, not just the happy path.

text

System: Classify customer sentiment as POSITIVE, NEGATIVE, or NEUTRAL.
Return JSON: {"sentiment": "...", "confidence": 0.0-1.0}

Examples:
User: "The product arrived on time."
Assistant: {"sentiment": "POSITIVE", "confidence": 0.92}

User: "It works, I guess."
Assistant: {"sentiment": "NEUTRAL", "confidence": 0.71}

User: "Third time contacting support for the same issue."
Assistant: {"sentiment": "NEGATIVE", "confidence": 0.88}

Validation-retry loops are a structured output reliability pattern the exam tests directly. When a model returns malformed JSON or a value outside the allowed enum, the correct response is to feed the error back with the original prompt and retry, not to silently accept the output or escalate immediately.
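
A validation-retry loop for the sentiment schema above could be sketched as follows. The model is passed in as a plain callable so the example runs without an API key; validate, classify_with_retry and the wording of the retry message are hypothetical choices for illustration.

```python
import json
from typing import Callable, Dict, Tuple

ALLOWED_SENTIMENTS = {"POSITIVE", "NEGATIVE", "NEUTRAL"}


def validate(raw: str) -> Tuple[Dict, str]:
    """Return (parsed, "") on success or ({}, error_message) on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {}, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return {}, "expected a JSON object"
    if data.get("sentiment") not in ALLOWED_SENTIMENTS:
        return {}, f"sentiment must be one of {sorted(ALLOWED_SENTIMENTS)}"
    confidence = data.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        return {}, "confidence must be a number between 0.0 and 1.0"
    return data, ""


def classify_with_retry(
    model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Dict:
    """Retry with the validation error fed back alongside the original prompt."""
    attempt_prompt = prompt
    last_error = ""
    for _ in range(max_attempts):
        raw = model(attempt_prompt)
        data, last_error = validate(raw)
        if not last_error:
            return data
        attempt_prompt = (
            f"{prompt}\n\nYour previous reply was rejected: {last_error}\n"
            f"Previous reply: {raw}\nReturn corrected JSON only."
        )
    raise ValueError(f"no valid output after {max_attempts} attempts: {last_error}")
```

The retry budget matters: once it is exhausted the function raises, which is the point at which escalation becomes appropriate rather than on the first malformed reply.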

Multi-pass review architecture appears in Domain 4 as well: a single review pass has known limitations (self-review bias, attention dilution in long contexts), and the exam tests whether candidates know when to deploy independent review instances or sequential passes with different criteria.

Our Prompt Engineering & Structured Output concept section covers the full task statement list for this domain.

How does Domain 5 (Context Management & Reliability) show up on the exam?

Domain 5 is the lightest domain at 15%, but its questions are among the most practically consequential. They test stale context detection, session management decisions, summary injection, and structured handoff to human agents.

The stale context problem arises in long-running sessions: information injected early in a conversation degrades in influence as the context window fills. The exam tests whether candidates can identify this as the root cause of reliability failures and apply the correct fix (summary injection into a fresh session rather than continuing to extend the existing one).

python

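import json

# Hypothetical placeholder state so the snippet runs on its own
prior_findings = [{"id": "F1", "finding": "example confirmed finding"}]
remaining_scope = "example remaining scope"

# Inject a structured summary when starting a fresh session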
system_prompt = f"""
You are continuing a long-running analysis task.

## Confirmed findings so far
{json.dumps(prior_findings, indent=2)}

## Remaining scope
{remaining_scope}

Continue from this state. Do not re-investigate confirmed findings.
"""

Session management questions ask candidates to choose between resuming an existing session, forking it for divergent exploration, or starting fresh with a summary. The decision rule is: resume when context is still valid and relevant; fork when you need to explore an alternative without losing the main thread; start fresh when the context has degraded or the task scope has shifted significantly.
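
The decision rule can be written down directly. In this sketch the enum, the function and its three boolean inputs are hypothetical, and checking for degraded context before anything else is an assumption about precedence that the rule itself does not spell out.

```python
from enum import Enum


class SessionAction(Enum):
    RESUME = "resume"
    FORK = "fork"
    FRESH_WITH_SUMMARY = "fresh_with_summary"


def choose_session_action(
    context_valid: bool, scope_shifted: bool, exploring_alternative: bool
) -> SessionAction:
    """Apply the resume / fork / start-fresh rule to three yes-or-no inputs."""
    # Degraded context or a significantly shifted scope: start fresh with a summary.
    if not context_valid or scope_shifted:
        return SessionAction.FRESH_WITH_SUMMARY
    # Context is sound, but we want to try an alternative without losing the main thread.
    if exploring_alternative:
        return SessionAction.FORK
    return SessionAction.RESUME


action = choose_session_action(
    context_valid=True, scope_shifted=False, exploring_alternative=True
)
```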

Structured handoff to human agents is a reliability pattern that appears when the model encounters a situation outside its authorised scope. The exam tests whether the handoff includes sufficient structured context for the human to act without re-reading the entire conversation history.
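
What "sufficient structured context" means will vary by deployment, but a handoff payload might be sketched like this. The Handoff fields and the to_handoff_json helper are hypothetical, and the sample values are invented for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Handoff:
    """Everything a human needs to act without re-reading the conversation."""
    reason: str                 # why the model is out of its authorised scope
    customer_request: str       # the request, restated in one or two sentences
    actions_taken: List[str]    # what has already been tried
    open_questions: List[str]   # what the human still needs to decide
    recommended_next_step: str


def to_handoff_json(handoff: Handoff) -> str:
    """Serialise the handoff, refusing payloads missing the essentials."""
    if not handoff.reason.strip() or not handoff.customer_request.strip():
        raise ValueError("reason and customer_request are required")
    return json.dumps(asdict(handoff), indent=2)


payload = to_handoff_json(
    Handoff(
        reason="Refund exceeds the agent's authorised limit",
        customer_request="Full refund for a damaged order",
        actions_taken=["Verified order", "Confirmed damage photos received"],
        open_questions=["Approve refund above limit?"],
        recommended_next_step="Supervisor reviews and approves or declines",
    )
)
```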

When in doubt, prefer the minimal footprint: request only necessary permissions, avoid storing sensitive information beyond immediate needs, prefer reversible over irreversible actions.

Anthropic, Claude Documentation

What scenario types appear most frequently across all domains?

The exam uses six recurring scenario archetypes. Recognising the archetype quickly lets you apply the right diagnostic framework rather than reasoning from scratch.

Scenario Type	Primary Domain(s)	Key Diagnostic
Support agent with escalation	1, 5	Structured handoff vs. silent failure
Code generation and review	3, 4	Multi-pass review, plan mode gates
Multi-agent research pipeline	1, 2	Attribution preservation, coordinator routing
CI/CD automation	3	Non-interactive mode, structured output
Structured data extraction	4	Schema design, validation-retry loop
MCP tool integration	2	Error signalling, description quality

Across all six archetypes, the exam applies a consistent scoring philosophy: deterministic, programmatic enforcement beats prompt-only rules when stakes are high; proportionate fixes beat architectural overhauls for isolated failures; root-cause tracing beats symptom suppression.

How should you allocate study time across the five domains?

A straightforward approach is to mirror the domain weights directly. For a 40-hour study plan, that maps as follows:

Domain	Weight	Hours
Agentic Architecture & Orchestration	27%	10.8
Claude Code Configuration & Workflows	20%	8.0
Prompt Engineering & Structured Output	20%	8.0
Tool Design & MCP Integration	18%	7.2
Context Management & Reliability	15%	6.0
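
The same proportional split works for any study budget. The sketch below uses the published weights from the table; allocate_hours is a hypothetical helper name.

```python
from typing import Dict

# Domain weights in percent, as published in the exam blueprint table above.
DOMAIN_WEIGHTS = {
    "Agentic Architecture & Orchestration": 27,
    "Claude Code Configuration & Workflows": 20,
    "Prompt Engineering & Structured Output": 20,
    "Tool Design & MCP Integration": 18,
    "Context Management & Reliability": 15,
}


def allocate_hours(total_hours: float) -> Dict[str, float]:
    """Split a study budget across domains in proportion to exam weight."""
    return {
        domain: round(total_hours * weight / 100, 1)
        for domain, weight in DOMAIN_WEIGHTS.items()
    }


plan_40 = allocate_hours(40)
plan_25 = allocate_hours(25)
```

Because the values are rounded to one decimal place, totals for unusual budgets may differ from the input by a fraction of an hour.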

Within each domain, prioritise scenario-based practice over passive reading. The exam's 60 questions are all scenario-based with one correct answer and three plausible distractors. The distractors are designed to be attractive to candidates who know the concept but have not applied it to a realistic situation.

AI Skill Certs (independent of Anthropic) offers 60-question practice exams scored on the same 100-to-1000 scale with 720 as the passing bar. The adaptive engine uses Bayesian Knowledge Tracing with a 0.90 mastery threshold, which means it continues surfacing a concept until you demonstrate reliable recall, not just a single correct answer. Archie, the platform's Socratic tutor, guides you through the reasoning behind each distractor rather than simply confirming the correct choice.

The Agentic Architecture & Orchestration and Tool Design & MCP Integration concept sections are good starting points given their combined 45% share of the exam.

Frequently asked questions

How much does the Anthropic CCA-F certification exam cost?

The CCA-F exam costs $99 per attempt. Tiered Anthropic partners receive a discounted first attempt through the Claude Partner Network. There is no published limit on retake attempts, but each retake is charged at the standard rate.

What is the passing score for the Claude Certified Architect Foundations exam?

The passing score is 720 on a 100-to-1000 scale. Anthropic does not publish the raw-to-scaled conversion formula, so it is not accurate to state an exact number of questions required to pass. On a linear reading, 720 corresponds to roughly 41 to 42 of 60 questions, but the actual conversion may differ.
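
The arithmetic behind that estimate is short. The linear mapping from the 100-to-1000 scale to a raw fraction is purely an assumption for illustration, since the real conversion is unpublished, and linear_raw_fraction is a hypothetical helper.

```python
import math


def linear_raw_fraction(scaled: float, low: float = 100, high: float = 1000) -> float:
    """Fraction of questions correct if the scale were linear (an assumption)."""
    return (scaled - low) / (high - low)


fraction = linear_raw_fraction(720)        # (720 - 100) / 900, about 0.689
questions_needed = fraction * 60           # about 41.3 of 60 questions
min_whole_questions = math.ceil(questions_needed)
```

Under this assumption 41 correct answers would fall just short and 42 would clear the bar, which is where the "41 to 42" range comes from.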

How many questions are on the CCA-F exam and what format are they?

The exam contains 60 scenario-based multiple-choice questions. Each question has one correct answer and three plausible distractors. There are no free-response or essay questions. The exam can be taken online-proctored or at a physical test centre.

Which domain is the hardest on the Claude Certified Architect exam?

Domain 1 (Agentic Architecture & Orchestration) is the heaviest at 27% and is where most candidates report losing the most points. Its questions require diagnosing broken agentic loops and choosing between orchestration patterns, which demands applied reasoning rather than factual recall.

Is AI Skill Certs affiliated with or approved by Anthropic?

No. AI Skill Certs is an independent exam preparation platform. It is not affiliated with, endorsed by, or approved by Anthropic. The platform's practice exams and concept library are independently produced to help candidates prepare for the CCA-F exam.

When was the Claude Certified Architect Foundations exam launched?

The CCA-F exam launched on 12 March 2026. It is Anthropic's first professional certification and is part of the Claude Partner Network, a $100 million programme. As of 3 June 2026, more than 10,000 individuals had already earned the certification.