The Persistent Case Facts Block

In short: A case facts block is a small, structured section of extracted transactional facts (amounts, dates, identifiers, and stated expectations) that you assemble into every request to the model and deliberately exclude from any summarisation. Because it is rebuilt from durable storage rather than recovered from the conversation, it stays complete and exact no matter how long the interaction runs.

Designing a case facts block

The previous knowledge point, the progressive summarisation trap, proves that lossy compression and precise action are in tension. A case facts block resolves that tension structurally. Instead of hoping the exact figures survive a recap, you lift them out of the conversation entirely and maintain them as a small, durable record that you reassemble into every request. The conversation transcript can be summarised freely, because the facts that matter no longer live in it.

Think of it as splitting context into two streams with different rules. One stream is the narrative: greetings, clarifications, the model's explanations, everything that is safe to compress. The other stream is the ledger: a tight list of facts the agent will eventually have to use to take an action. The ledger is short, it is exact, and it is immune to summarisation by policy, not by accident. Getting this separation right is an apply-level skill, which is why this knowledge point sits a notch harder than the trap it solves.

Persistent case facts block: A compact, structured section of the prompt holding the exact transactional facts of a case, rebuilt from durable storage on every turn and explicitly excluded from any summarisation step.

The anatomy of the block

A good block is small and predictable. It is not a second transcript; it is a schema with the case filled in. For a support interaction the contents are usually four kinds of field, and keeping them named makes them easy to read and easy to update.

Identifiers that anchor the case: order number, ticket ID, account reference. These are the keys downstream tools will need.
Monetary and quantitative values such as the disputed amount, the balance, or the quantity affected, recorded to the exact figure rather than a rounded approximation.
Dates and deadlines including the order date, a promised callback, or the edge of a return window, since policy decisions often hinge on these.
Stated expectations and commitments, for example the resolution the customer asked for and anything the agent has already promised, so the agent stays consistent with its own prior turns.

The discipline is to record each field once, in a stable place, the instant it is established. You are not asking the model to remember; you are building a record the model reads fresh every turn.

Where the block lives in the prompt

Position matters as much as content. The block should sit as a clearly delimited, labelled section near the top of the prompt, kept visually and structurally distinct from the conversation history. Putting it near the top is not cosmetic: long inputs are read most reliably at their beginning and end, a property explored in the lost in the middle effect, so a top-anchored facts block lands where attention is strongest.

Crucially, the block is assembled separately from the transcript and travels alongside it, not inside it. When your summariser compresses older turns, it operates only on the transcript stream and never sees the block. Anthropic's compaction feature even exposes a control, pause_after_compaction, that lets you re-insert specific content after a summary is produced, which is the same idea expressed at the API layer: decide what bypasses compression and reattach it deliberately.

Two streams: compressible transcript and pinned facts

Loading diagram...

The ledger feeds a pinned facts block at the top of the prompt; the summariser only ever compresses the transcript.

Building and updating it in practice

Construction is an extract-and-maintain loop. When a turn introduces a durable fact, an extraction step appends or updates the corresponding field in the ledger. On the next request, your code reads the ledger, renders it as the facts block, and prepends it to the (possibly summarised) transcript. The agent therefore always sees the current, complete set of facts, regardless of how much of the chat has been compressed away.

Two habits keep the pattern healthy. First, keep it minimal: a facts block that swells to hundreds of lines defeats the purpose and starts competing for the very budget you were trying to protect. Include only what an action will need. Second, treat it as the source of truth for those fields. If the customer corrects an amount, update the ledger rather than relying on the model to notice the correction buried three summaries ago. This is the same instinct behind writing findings to a scratchpad or, at the platform level, using a memory tool that persists information in files outside the context window so it survives across turns and sessions.

Worked example

A returns agent must process a partial refund after a long, meandering conversation, and the team uses summarisation to control token growth.

On turn two the customer says they want to return two of the five items from order 8891, originally 247.83 dollars total, purchased on 3 March, and they expect roughly half back. An extraction step immediately writes a ledger: order 8891; total 247.83; purchase date 3 March; items returned 2 of 5; expected resolution partial refund. From this point on, every request to the model begins with a labelled CASE FACTS section rendering exactly those fields.

The conversation then wanders. The customer asks about exchange options, the agent explains the policy, they discuss shipping, and twenty turns later the transcript is long enough that the summariser compresses the first fifteen turns into two sentences. Those two sentences no longer contain the order number or the amount, and that is fine, because the agent is not reading the figures from the transcript. It is reading them from the facts block, which still says order 8891, total 247.83, 3 March, 2 of 5 items.

When the agent finally computes and issues the refund, it has everything it needs to do so correctly and to validate the return window against the 3 March date. Had the team relied on summarisation alone, this is precisely the moment the case would have collapsed into a vague refund for a recent order. The block did not make the conversation shorter; it made the conversation safe to shorten.

The platform-level version: persistent instruction files

The in-prompt block is the conceptual pattern, and Claude Code ships a concrete realisation of it. A project's CLAUDE.md instruction file is loaded into context at the start of every session and treated as durable project context rather than a turn in the transcript, so the rules and facts you place there survive summarisation and even carry across sessions. It is the same split this knowledge point teaches, a protected lane for the facts you cannot afford to lose, expressed as a file instead of a prompt section.

Anthropic also documents a separate auto-memory mechanism that records derived preferences and corrections, the things you explicitly ask it to remember, kept distinct from your hand-written instructions. The architectural lesson is to match the mechanism to the data. Stable, load-bearing facts belong in the persistent instruction file, learned preferences belong in auto memory, and bulky or fast-changing material belongs in referenced files you pull in on demand rather than in the durable block itself.

These files carry their own size limits, and they bite for the same reason the prompt block does. Anthropic's docs note that only the first 200 lines or 25 KB, whichever comes first, of the auto-memory file are loaded at the start of each conversation, so a critical fact parked below that cutoff is effectively absent when the session begins. Keep the durable record lean and put the load-bearing facts first, whether it lives in a prompt section or in a file on disk.

Common ways the block goes wrong

Teams that adopt a facts block still manage to undermine it, usually in one of a few predictable ways. The most common is letting the block drift out of date. If the customer corrects an amount or adds an item and only the transcript is updated, the block and the conversation now disagree, and the agent may act on the stale figure it trusts most. The discipline is that every correction updates the ledger first, so the block is always the authoritative copy rather than a snapshot that quietly rots.

A second failure is scope creep. A block that starts as five tidy fields slowly accretes notes, partial sentences, and speculative detail until it is a second transcript competing for the same budget. Resist this firmly: if something is narrative, it belongs in the compressible history, not the ledger. A third failure is silent omission, where a genuinely load-bearing fact is never extracted because no rule captured it. This is why it helps to derive the block's schema from the actions the agent can take. If an action needs a value, there must be a field for it, and the extraction step must populate that field whenever the value appears. Designing the block backwards from the actions, rather than forwards from the conversation, closes most of these gaps before they open.

The same pattern beyond customer support

Although the canonical example is a refund, the persistent facts pattern generalises to any long interaction with non-negotiable specifics. A data-extraction workflow that processes a contract over many turns can pin the parties, effective dates, and monetary terms so later questions are answered against exact values rather than a paraphrase. A research assistant tracking a long investigation can pin the precise claims and figures it has already confirmed, keeping them out of any rolling summary of its progress.

In each case the shape is identical: identify the handful of facts that downstream work will read as exact, lift them into a durable record, render that record into every prompt, and forbid the summariser from touching it. What changes between domains is only which fields count as load-bearing. Recognising the pattern as domain-general, rather than a support-desk trick, is what lets you reach for it instinctively whenever a task combines a long conversation with data that must stay precise. That instinct, more than any single implementation, is what this part of the exam is cultivating.

A helpful way to internalise the pattern is to treat the facts block as the agent's working memory of record, deliberately held outside the part of context that is allowed to change shape. Everything else in the prompt is a convenience that can be condensed; the block is the commitment that cannot. Hold that distinction clearly and most of the implementation details tend to follow on their own, because each decision reduces to a single question: is this something an action must read back exactly, or merely something the conversation happened to mention?

Misconceptions to retire

Misconception

The case facts block can just live inside the conversation history as an early assistant message.

What's actually true

If it lives in the transcript, the summariser will eventually compress it like any other turn, which reintroduces the exact failure it was meant to prevent. The block must be a separate stream, rebuilt from durable storage and excluded from summarisation.

Misconception

A bigger and more detailed facts block is always safer.

What's actually true

An oversized block competes for the same context budget you are protecting and dilutes attention with low-value detail. Keep it minimal and action-oriented: only the identifiers, figures, dates, and commitments an action will actually consume.

How this is assessed

On the exam this knowledge point shows up as a design choice rather than a recall question. A scenario describes a support agent that loses precision over long cases, and you are asked to choose the soundest remedy. The strong answer is the one that separates exact facts from compressible narrative and pins the facts into every prompt; the weaker answers either summarise more carefully (still lossy) or store facts inside the transcript (still exposed to compression). Because this is task statement 5.1 in the customer-support scenario, the assessment is checking that you can operationalise the principle from the progressive summarisation trap, not merely restate it.

Check your understanding

You are designing a support agent that must stay accurate over very long refund conversations while keeping token usage bounded. Which design best preserves exact amounts, dates, and order numbers?

Watch and learn

Official Anthropic Academy lessons first, then hand-picked walkthroughs. Videos load only when you press play.

No videos curated for this concept yet

We are still curating the best official and community videos for this topic.

References & primary sources

Adaptive study

Master this concept with Archie

Practice it inside an adaptive study session. Archie, your Socratic AI tutor, tracks your mastery with Bayesian Knowledge Tracing and schedules the perfect next review.

Start studying

The Persistent Case Facts Block: Keeping Critical Data Out of Summaries