The Attention Dilution Problem

In short: Attention dilution is the failure mode where asking Claude to process too many items in a single pass spreads its attention thin, producing inconsistent depth across them. The fix is to split the work into focused per-item passes plus a separate cross-item integration pass.

What the attention dilution problem is

Attention dilution is a diagnosis, not a vague complaint about quality. It names a specific failure: when you ask Claude to analyse too many items in a single pass, the depth it applies to each one becomes inconsistent. The model does not refuse and it does not run out of context window. It simply cannot give twenty files the same careful attention it would give two, so a few get a thorough read and the rest get a shallow skim, and you cannot predict which is which.

This is the analyse-level knowledge point in task statement 1.6, and the exam treats it as a reasoning skill. You are expected to look at a described symptom and recognise that the root cause is attention being spread thin across a large input, then name the structural remedy. Recognising the cause is the whole point; the fix follows from it.

Attention dilution: A failure mode in which processing too many items in a single model pass yields uneven analytical depth across those items. Every item remains inside the context window, but the model's attention is distributed unevenly, so quality is inconsistent rather than uniformly high or low.

The mechanism: even quality is the casualty

The intuitive but wrong mental model is that a bigger input makes the model uniformly worse. What actually happens is subtler and more dangerous: the average quality may look acceptable while the consistency collapses. One file is reviewed brilliantly, the next is barely touched, and a third in the middle of the batch is essentially glossed over. Because the average looks fine, the problem hides until you notice two contradictory verdicts.

That inconsistency is what makes attention dilution hard to catch with a casual glance at the output. A single deep finding can lull you into trusting the whole pass, when in fact the depth that produced that finding was not applied evenly. The skill the exam rewards is refusing to be reassured by one good result and asking instead whether every item got the same treatment.

Too many items in one pass: this is the trigger.

Uneven depth: not uniform failure, but inconsistent quality.

In-context: the data is present; attention is the bottleneck.

The diagnostic symptom you must recognise

There is one symptom the exam keys on above all others: a review that flags a pattern as problematic in one file while approving identical code somewhere else. If the model genuinely understood the pattern to be a problem, it would flag it everywhere. The contradiction proves the model is not analysing each file with equal depth. It caught the issue where its attention landed and missed it where attention had thinned out.

Treat that contradiction as a fingerprint. When you see inconsistent verdicts on equivalent inputs, the diagnosis is attention dilution, not a flaky model or a prompt-wording problem. The cure is not a sterner instruction to "be thorough": telling a model to pay attention does not redistribute attention. The cure is to change the structure of the work so that each item gets a pass of its own.

Keep one nuance in mind: the contradiction need not be on byte-identical code to count as the fingerprint. Equivalent patterns (the same risky construct expressed in two slightly different ways) that are flagged in one place and waved through in another point to the same uneven attention. So train yourself to spot the category of issue being treated inconsistently, not just literally matching lines. That broader pattern recognition is what catches dilution in real reviews, where duplicated logic is rarely a perfect copy.
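The fingerprint check can be sketched mechanically. The following is a minimal sketch, assuming the review output has already been collected as records carrying a file name, a normalised issue category, and a verdict. The `Finding` record, the category labels, the file names, and `inconsistent_categories` are all hypothetical names introduced for illustration; they are not part of any Anthropic tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One verdict from a review pass (hypothetical record shape)."""
    file: str
    category: str  # normalised issue category, e.g. "sql-string-concat"
    verdict: str   # "flagged" or "approved"


def inconsistent_categories(findings):
    """Return {category: {verdict: [files]}} for categories judged both ways."""
    by_category = defaultdict(lambda: defaultdict(list))
    for finding in findings:
        by_category[finding.category][finding.verdict].append(finding.file)
    # A category that was both flagged and approved is the dilution fingerprint.
    return {
        category: {verdict: sorted(files) for verdict, files in verdicts.items()}
        for category, verdicts in by_category.items()
        if len(verdicts) > 1
    }


# Illustrative data only: the same category flagged in one file, approved in another.
findings = [
    Finding("module_a.py", "missing-null-check", "flagged"),
    Finding("module_b.py", "missing-null-check", "approved"),
    Finding("module_c.py", "unused-import", "flagged"),
]
print(inconsistent_categories(findings))
```

Grouping by category rather than by literal source text is the design choice that matters here: it is what lets near-duplicates count as the same pattern.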

One overloaded pass versus focused passes

Splitting one overloaded pass into focused per-file passes restores consistent depth; a separate integration pass then catches cross-file issues.

Why the fix is structural, not motivational

The instinctive response to inconsistent depth is to write a sterner prompt: "review every file with equal care, skip nothing." It rarely works, and understanding why is the core insight of this knowledge point. Attention dilution is not a question of effort or willingness; it is a consequence of how a single pass distributes a finite resource. Telling the model to care more does not give it more attention to spend. It still has to spread the same attention across the same overloaded input, so the unevenness comes straight back.

A related effect makes the structural nature unmistakable. Quality is not uniform across the position of an item in a long input: material near the start and end of a large context tends to be handled more reliably than material buried in the middle. Anthropic's own long-context prompting experiments document this position sensitivity, reporting that recall varies with where the relevant passage sits in the prompt rather than only with how much text is present. That is why the file that gets skimmed is so often one in the middle of a big batch, not because it mattered less, but because of where it landed in the sequence. No instruction changes where the items sit; only restructuring the work does. Splitting into focused per-file passes removes the long input entirely, so there is no "middle" for a file to get lost in.

This is also why reaching for a more capable model or a longer context window does not, on its own, solve the problem. A bigger window lets you fit more files into one pass, but fitting them is not the same as analysing each with equal depth. It can even make dilution worse by tempting you to cram more in. The lesson the exam wants is that the cure operates on the structure of the work, not on the prompt wording or the size of the context. Once you accept that attention is the scarce resource, the multi-pass split stops looking like extra machinery and starts looking like the obvious way to guarantee each item its share.

There is a useful diagnostic discipline that follows from this. When you suspect dilution, do not ask whether the model can find an issue. You already know it can, because it found that issue somewhere. Ask instead whether the analysis was applied evenly. The moment you frame quality as a question of consistency rather than capability, the right experiment becomes obvious: give one suspect file its own pass and see whether the verdict changes. If it does, dilution was the cause, and the structural fix is confirmed.
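That isolation experiment can be sketched as a small harness. This is a minimal sketch under stated assumptions: `review_pass` is a hypothetical callable that takes `{filename: source}` and returns `{filename: verdict}`, and in practice it would wrap a model call. The `toy_review_pass` below is a deliberately crude stand-in that simulates dilution by examining only the first two files of any batch; it is not a model of how Claude actually behaves.

```python
def isolation_check(files, suspect, review_pass):
    """Compare the suspect file's verdict inside the batch with its verdict alone."""
    batch_verdict = review_pass(files)[suspect]
    solo_verdict = review_pass({suspect: files[suspect]})[suspect]
    return {
        "batch": batch_verdict,
        "solo": solo_verdict,
        "verdict_changed": batch_verdict != solo_verdict,
    }


def toy_review_pass(batch):
    """Toy stand-in (assumption): only the first two files get a real check."""
    verdicts = {}
    for index, (name, source) in enumerate(batch.items()):
        examined = index < 2
        risky = '" + ' in source  # crude marker for string concatenation
        verdicts[name] = "flagged" if examined and risky else "approved"
    return verdicts


# Hypothetical file contents for illustration.
risky_source = 'query = "SELECT * FROM t WHERE id=" + user_id'
files = {"a.py": risky_source, "b.py": "x = 1", "c.py": risky_source}

print(isolation_check(files, "c.py", toy_review_pass))
```

A changed verdict for the isolated file is the signal to restructure; an unchanged verdict suggests looking for a different cause.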

Why a separate cross-file pass is still needed

Splitting into per-file passes fixes the consistency problem, but it introduces a new blind spot: issues that only exist between files. A function defined in one file and misused in another, a data-flow contract broken across a module boundary, or a dependency cycle is invisible to any single per-file pass, because no one file contains the whole problem. That is why the remedy is two-part, not one.
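The dependency-cycle case shows concretely why no single file contains the whole problem. The sketch below is illustrative only: the module names and the `find_cycle` helper are hypothetical, and the import lists are assumed to have been extracted already. Each module's own import list looks ordinary in isolation; the cycle appears only once the lists are combined.

```python
def find_cycle(imports):
    """Return one import cycle as a list of modules, or None.

    `imports` maps each known module to the modules it imports.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {module: WHITE for module in imports}
    path = []

    def visit(module):
        state[module] = GREY
        path.append(module)
        for dep in imports[module]:
            if dep not in imports:
                continue  # module outside the known set; nothing to follow
            if state[dep] == GREY:
                return path[path.index(dep):] + [dep]
            if state[dep] == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[module] = BLACK
        return None

    for module in imports:
        if state[module] == WHITE:
            found = visit(module)
            if found:
                return found
    return None


# Hypothetical three-module service.
all_imports = {
    "orders": ["billing"],
    "billing": ["customers"],
    "customers": ["orders"],
}

# Viewed one file at a time, nothing looks wrong.
for module, deps in all_imports.items():
    assert find_cycle({module: deps}) is None

# Viewed together, the cycle is visible.
print(find_cycle(all_imports))
```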

The per-file passes catch local issues consistently; a dedicated cross-file integration pass then catches the relationships none of them could see alone. Anthropic's parallelization workflow describes exactly this sectioning idea (break the work into focused subtasks), while its subagent model lets each focused pass run in its own context window so nothing floods the others. The concrete two-phase implementation is developed in the per-file and cross-file pass pattern, and the same logic scales up into a multi-pass review architecture in Domain 4.
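The overall shape of the two-part remedy can be sketched as follows. This is a minimal sketch, not the implementation from the per-file and cross-file pass pattern: `two_phase_review`, `per_file_pass`, and `cross_file_pass` are hypothetical names, and the two toy passes are deterministic stand-ins for what would in practice be separate model calls, each with its own context.

```python
import re


def two_phase_review(files, per_file_pass, cross_file_pass):
    """Run one focused pass per file, then one integration pass across files."""
    # Phase 1: each file is reviewed alone, so depth does not depend on batch size.
    per_file = {name: per_file_pass(name, source) for name, source in files.items()}
    # Phase 2: a separate pass looks only at relationships between files.
    cross_file = cross_file_pass(files, per_file)
    return {"per_file": per_file, "cross_file": cross_file}


def toy_per_file_pass(name, source):
    """Toy stand-in (assumption): flag string concatenation as a local issue."""
    return ["sql-string-concat"] if '" + ' in source else []


def toy_cross_file_pass(files, per_file_findings):
    """Toy stand-in (assumption): list calls to functions defined in another file."""
    defined_in = {}
    for name, source in files.items():
        for function in re.findall(r"def (\w+)\(", source):
            defined_in[function] = name
    links = []
    for name, source in files.items():
        for function, owner in defined_in.items():
            if owner != name and re.search(rf"\b{function}\(", source):
                links.append(f"{name} calls {function} defined in {owner}")
    return sorted(links)


# Hypothetical file contents for illustration.
files = {
    "db.py": "def sanitise(value):\n    return value.strip()\n",
    "reports.py": 'query = "SELECT * FROM r WHERE id=" + raw_id\n',
    "invoices.py": 'query = "SELECT * FROM i WHERE id=" + sanitise(raw_id)\n',
}

print(two_phase_review(files, toy_per_file_pass, toy_cross_file_pass))
```

Keeping the two phases as separate callables mirrors the point of the section: the per-file phase guarantees even depth, and the integration phase is the only place cross-file relationships are considered.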

Worked example

A security review agent is asked to audit a 30-file service for input-validation flaws in a single pass. It returns a clean bill of health for most files but flags a SQL string concatenation in one.

An architect reading the output spots something off. The agent flagged unsanitised string concatenation in reports.py as a SQL-injection risk, which is correct, but approved invoices.py, which builds a query with the exact same concatenation pattern. Identical code, opposite verdicts.

The diagnosis is attention dilution, and the contradiction is the proof. The agent does understand the vulnerability; it caught it once. What it failed to do was apply that understanding with equal depth to all 30 files in one pass. Somewhere in the middle of the batch, invoices.py received a shallow skim, and the same flaw sailed through. No amount of re-prompting the single pass to "check carefully" reliably fixes this, because the issue is structural, not motivational.

The team's first instinct was exactly that re-prompt: re-run the single pass with a firmer instruction to review all thirty files with equal rigour and miss nothing. The result was almost identical: a different middle file was now skimmed, and invoices.py was still approved. That non-fix was itself confirmation. If a stronger prompt could redistribute attention, the second run would have caught the pattern everywhere; it did not, because the constraint is structural. Only when the audit was restructured did the failure go away.

The remedy is to restructure the audit. Each file gets its own focused pass so the validation check is applied with consistent depth everywhere; now both reports.py and invoices.py are flagged. Then a separate cross-file pass looks for vulnerabilities that span files, such as a sanitiser defined in one module being bypassed by a caller in another. The single diluted pass could deliver neither guarantee; the structured passes deliver both.

Common misconceptions

Misconception

A single pass over many files produces uniform quality as long as everything fits in the context window.

What's actually true

Fitting in the context window is not the issue. Attention is distributed unevenly across a large input, so depth varies item to item even when all items are present. The result is inconsistent quality, not uniform quality, which is exactly why the failure is easy to miss.

Misconception

If the model flags an issue in one file but not in an identical one, the model just does not understand the issue.

What's actually true

The opposite: catching it once proves the model understands it. The inconsistency is the signature of attention dilution: the understanding was not applied with equal depth across every file in the overloaded pass. The fix is structural (focused per-file passes), not a better explanation of the issue.

How this shows up on the exam

Because this is an analyse-level knowledge point, exam questions hand you a symptom and ask for the cause and cure rather than a definition. The classic stem describes an agent that reaches contradictory conclusions about equivalent inputs (flagging a pattern in one place, approving it in another) and asks why. The answer is attention dilution from overloading a single pass, and the correct remedy is splitting into focused per-file passes plus a cross-file integration pass. Distractors will tempt you toward "the model lacks knowledge," "raise the temperature," or "write a stricter prompt," none of which address uneven attention. Anchor on the fingerprint (inconsistent depth on equivalent items) and the structural fix follows directly, which is the bridge into choosing a decomposition strategy when input volume is the deciding factor.

Check your understanding

A code-review agent audits 25 files in one pass. It marks a missing-null-check pattern as a bug in module A but approves module B, which contains the identical pattern. What is the most likely root cause and the correct fix?
