Claude Code Skills vs CLAUDE.md

In short: Skills and CLAUDE.md solve different problems. CLAUDE.md holds always-loaded, universal standards that should apply to every turn of every session, such as coding conventions. A skill holds an on-demand, task-specific procedure that should load only when that task comes up, such as a release or code-review workflow. Choosing between them is a question of frequency and scope, not of importance.

When to use Claude Code skills instead of CLAUDE.md

Knowing when to use claude code skills rather than CLAUDE.md is one of the most reliably tested judgement calls in Domain 3, because the two mechanisms look interchangeable and are not. Both let you give Claude standing guidance. The difference is temporal: CLAUDE.md content is loaded at the start of every session and stays active for everything you do, while a skill body is loaded only at the moment it is invoked and is otherwise absent. That single distinction drives every correct answer on this knowledge point.

Skills vs CLAUDE.md: CLAUDE.md is always-loaded memory for universal standards that should shape every turn. A skill is on-demand instruction for a specific task that should load only when that task arises. Match content to the mechanism by asking how often it should be active.

The trap is to reason about importance. People assume that critical instructions deserve to be in CLAUDE.md so Claude never forgets them, and that minor helpers can be skills. That instinct is wrong. A code-review checklist might be extremely important, but it is still task-specific: it should be present when you review and absent when you are writing a migration. Importance does not decide placement; frequency and scope do.

The decision rule

Reduce the choice to one question: should this content be active on every turn, or only when a particular task comes up?

Always active, applies broadly → CLAUDE.md. Coding conventions, naming standards, the project tech stack, architectural rules, the commands to run tests. These are facts and standards Claude should honour on every edit, so paying to load them every session is exactly right.
Active only for one kind of task → a skill. A release procedure, a code-review workflow, a database-migration runbook, a changelog generator. These are step-by-step procedures you reach for occasionally, so they should load on demand and stay out of context the rest of the time.

A useful tell from the documentation: when a section of CLAUDE.md has grown from a fact ("we use RESTful naming") into a procedure ("to cut a release, do these eight steps"), that section has outgrown CLAUDE.md and wants to become a skill. Facts belong in memory; procedures belong in skills.

Skill or CLAUDE.md: a placement decision tree

Loading diagram...

Placement follows from how often the content should be active, which is what the exam is really testing.

Why the token economics matter

The reason this distinction is not merely stylistic is cost and focus. Every line in CLAUDE.md is re-loaded on every session and consumes context whether or not it is relevant to what you are doing today. Stuff a rarely-used release runbook in there and you pay for it on the hundred days you are not releasing, and you dilute the model attention with instructions that do not apply. Move it to a skill and it costs almost nothing until the day you actually cut a release, at which point you invoke it and the full procedure loads. The on-demand loading model of skills is the lever; matching content to it is the skill an architect must demonstrate.

Universal standard

CLAUDE.md, loaded every session

Task procedure

skill, loaded on demand

Fact vs procedure

the quick test for where it goes

Worked example

A team CLAUDE.md has swollen to 400 lines. Half is genuine standards; the other half is a detailed pull-request review procedure and a step-by-step deploy runbook. Reviews have started to feel sluggish and Claude sometimes applies review-specific rules while writing fresh code.

Diagnose the file by sorting each block into fact or procedure. The coding conventions, the directory layout, the test command, the naming rules. These are standards that should shape every edit, so they stay in CLAUDE.md. The review procedure and the deploy runbook are different animals: they are sequences you invoke for a specific task, and right now they are loaded on every unrelated turn, which both wastes context and bleeds review-specific instructions into ordinary coding.

The remedy is extraction. Move the review procedure into .claude/skills/review/SKILL.md with a description like "run our pull-request review checklist; use when reviewing a diff", and move the deploy runbook into .claude/skills/deploy/SKILL.md with disable-model-invocation: true so it only fires when a human explicitly asks. CLAUDE.md shrinks back to pure standards, every session gets lighter, and the review rules stop leaking into code generation because they now load only when /review runs. Nothing was deleted and nothing became less important, the procedures simply moved to the mechanism whose loading behaviour matches how they are actually used.

A taxonomy you can sort on sight

The fastest way to internalise the rule is to practise classifying real configuration. Run each item through the fact-versus-procedure and always-versus-occasional tests, and the placement falls out.

"We use British spelling in all user-facing copy." A universal standard that should apply to every edit, so it belongs in CLAUDE.md.
"To publish a release, bump the version, regenerate the changelog, tag, and push." A multi-step procedure invoked occasionally, so it belongs in a skill.
"Run the test suite with pnpm test." A standing fact Claude should know on any turn, so CLAUDE.md.
"Walk a pull request through our nine-point accessibility review." A task-specific procedure, so a skill, ideally with a description so Claude can offer it when you start a review.
"Our API responses always use this error envelope." A convention that shapes any endpoint you touch, so CLAUDE.md.

Notice the pattern: short, durable facts and standards gravitate to CLAUDE.md, while named, sequenced procedures gravitate to skills. When you can sort a list like this without hesitating, you have the knowledge point.

Where path-specific rules fit alongside the two

There is a third mechanism worth holding in view, because exam questions sometimes offer it as a tempting middle option: path-specific rules in .claude/rules/ with a glob in their frontmatter. These load automatically, but only when you touch files matching the pattern, which makes them ideal for conventions that are universal within a slice of the codebase but irrelevant elsewhere, such as test conventions that should apply across every *.test.tsx file but nowhere else. They are neither always-on like CLAUDE.md nor manually invoked like a typical skill; they are conditionally-on. For this knowledge point, the clean mental model is a spectrum: CLAUDE.md is always loaded, path-specific rules are loaded for matching files, and skills are loaded on demand. Recognising that a requirement is "applies to a file type, not the whole project" is what tells you the answer is a path rule rather than a bloated CLAUDE.md or an awkward skill.

Refactoring a CLAUDE.md that has outgrown itself

When a CLAUDE.md has accumulated procedures, the cleanup is mechanical once you accept the rule. Read the file top to bottom and label each block as a fact or a procedure. Leave the facts where they are. For each procedure, cut it into a new skill folder under .claude/skills/, give the SKILL.md a description that captures when it should run, and add disable-model-invocation: true to any procedure with side effects so Claude only runs it when a human asks. Commit the new skills so the team inherits them, and verify that CLAUDE.md is back to pure standards. The win is concrete and measurable: a lighter file loads on every session, the model stops applying task-specific instructions to unrelated work, and the procedures are still one slash away when you need them. This is the everyday shape of the skills-versus-CLAUDE.md judgement, and it is exactly the situation the exam dramatises.

Common misreadings to avoid

Misconception

Put your most important workflows in CLAUDE.md so Claude always has them and never forgets a step.

What's actually true

Importance does not decide placement; frequency does. A workflow is a task-specific procedure, so it belongs in a skill that loads when invoked. Forcing it into CLAUDE.md loads it on every session, wasting context and bleeding task rules into unrelated work.

Misconception

Skills are the modern replacement for CLAUDE.md, so you should move everything out of CLAUDE.md into skills.

What's actually true

They are complementary, not a replacement. Universal standards that should apply on every turn still belong in CLAUDE.md; only on-demand procedures belong in skills. Moving always-on standards into a skill means they stop applying unless someone remembers to invoke them.

Two litmus tests you can apply instantly

When a scenario does not hand you an obvious label, two quick tests resolve almost every case. The first is the every-turn test: ask whether you would want this content active on a completely unrelated turn. If Claude is writing a database migration, should it still be holding your code-review rubric in mind? If the honest answer is no, the content is task-specific and belongs in a skill; if the answer is yes, it is a universal standard for CLAUDE.md. The second is the fact-or-procedure test: read the content and ask whether it states a fact ("we use snake_case for database columns") or describes a procedure ("to add a column, do these steps"). Facts and standards are CLAUDE.md material; named sequences of steps are skills.

These two tests usually agree, and when they do you can answer with confidence. On the rare occasion they seem to disagree, a standard that is also somewhat procedural, fall back to the cost question: would loading this on every session be a fair price for how often it is actually relevant? If you would resent paying that cost on the many days the content does not apply, it wants to be on demand, which means a skill. Training yourself to run these tests quickly is what turns this analyse-level knowledge point from a judgement call into a reflex, and it is exactly the reflex the exam is checking for when it shows you a misplacement and asks for the fix.

How this shows up on the exam

Domain 3 carries 20% of the weight, and TS 3.2 questions on this knowledge point are nearly always classification problems framed through Scenario 2 or Scenario 4: you are handed a piece of guidance and asked where it belongs, or shown a misplacement and asked to name the fix. The reliable wrong answers are a code-review workflow parked in CLAUDE.md and a set of coding standards buried in a skill. Anchor on the rule, universal and always-on goes in CLAUDE.md, task-specific and on-demand goes in a skill, and you can sort any example quickly. This analysis feeds directly into the broader command-and-skill design scenarios you will evaluate next.

Check your understanding

During an exam scenario, a team reports that Claude occasionally applies their detailed code-review rubric while generating brand-new features, and that every session feels heavy. Inspecting their setup, you find the entire review rubric pasted into CLAUDE.md. What is the correct remediation?

Watch and learn

Official Anthropic Academy lessons first, then hand-picked walkthroughs. Videos load only when you press play.

No videos curated for this concept yet

We are still curating the best official and community videos for this topic.

References & primary sources

Adaptive study

Master this concept with Archie

Practice it inside an adaptive study session. Archie, your Socratic AI tutor, tracks your mastery with Bayesian Knowledge Tracing and schedules the perfect next review.

Start studying

When to Use Claude Code Skills vs CLAUDE.md