High Stakes Enforcement Decision Rule

In short: The high stakes enforcement decision rule is a simple test for choosing an enforcement mechanism: if a failure causes financial, security, or compliance consequences, enforce the rule programmatically with hooks or gates. If the only consequence is cosmetic, such as formatting or tone, a prompt is acceptable.

The high stakes enforcement decision rule in one question

Once you accept that prompts are probabilistic and code is deterministic, you need a fast way to decide which one a given rule deserves. The high stakes enforcement decision rule supplies it, and it collapses to a single question you ask of every rule: what is the consequence when this rule is violated? If the answer involves money leaving an account, access being granted incorrectly, regulated data being mishandled, or a compliance obligation being missed, the rule is high stakes and must be enforced programmatically. If the worst case is a slightly awkward greeting or a missing bullet point, the rule is low stakes and a prompt will do.

This framing matters because it shifts your attention from how likely a failure is to how costly it is. A rule the model obeys 99.9 percent of the time still fails one time in a thousand, and for a wire transfer that one failure can be catastrophic, whereas for a formatting preference it is invisible. The decision rule asks about the size of the downside, not the size of the probability.

High stakes enforcement decision rule: A classification test for picking an enforcement mechanism. Map the consequence of a rule failing: financial, security, or compliance damage requires deterministic enforcement (hooks, gates, code). Purely cosmetic damage allows probabilistic enforcement (a system prompt).
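The definition above can be sketched as a small lookup. This is a minimal illustration, not an official implementation: the names `Consequence`, `Mechanism`, and `choose_mechanism` are hypothetical and chosen only for this example.

```python
from enum import Enum


class Consequence(Enum):
    FINANCIAL = "financial"
    SECURITY = "security"
    COMPLIANCE = "compliance"
    COSMETIC = "cosmetic"


class Mechanism(Enum):
    DETERMINISTIC = "hook, gate, or other code"
    PROBABILISTIC = "system prompt"


# The three categories the rule treats as high stakes.
HIGH_STAKES = {Consequence.FINANCIAL, Consequence.SECURITY, Consequence.COMPLIANCE}


def choose_mechanism(consequences: set[Consequence]) -> Mechanism:
    """Pick the enforcement mechanism from the consequences of a rule failing."""
    # A single high stakes consequence is enough to require code.
    if consequences & HIGH_STAKES:
        return Mechanism.DETERMINISTIC
    return Mechanism.PROBABILISTIC
```

Passing a set rather than a single value reflects that one failure can carry several consequences at once; the worst one decides the mechanism.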

It is worth stressing that the rule keys on the worst plausible outcome, not the average one. Averages hide tail risk: an agent can handle thousands of harmless requests and still, on one unlucky run, move a large sum to the wrong place or expose data it should have guarded. That rare tail is precisely what the rule exists to govern. Reasoning in worst cases rather than typical cases is what stops you rationalising a prompt for an operation whose single rare failure is the only failure that matters. If you find yourself defending a prompt because "it almost always behaves," you have already conceded that it sometimes does not, which for a high stakes action is the end of the argument.
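To see why "it almost always behaves" is not reassuring, it helps to compute how quickly a small per-run failure rate compounds. The sketch below assumes each run fails independently with the same probability, which is a simplification, and reuses the one-in-a-thousand figure purely as an illustrative number.

```python
def probability_of_any_failure(per_run_failure_rate: float, runs: int) -> float:
    """Chance of at least one failure across independent runs: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - per_run_failure_rate) ** runs


rate = 0.001  # obeyed 99.9 percent of the time, so one failure in a thousand
for runs in (100, 1_000, 10_000):
    chance = probability_of_any_failure(rate, runs)
    print(f"{runs:>6} runs -> {chance:.1%} chance of at least one failure")
```

Under these assumptions the chance of at least one failure is already about 63 percent after a thousand runs, which is why the rule asks about the cost of that one failure rather than its rarity.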

The three high stakes categories to memorise

Most exam scenarios that hinge on this knowledge point fall into three buckets, and naming them quickly is the whole skill.

Financial: anything that moves or commits money, such as refunds, payouts, transfers, discounts, credit limits, and billing changes. A wrong action here has a direct monetary cost.
Security: anything that grants, checks, or revokes access, such as identity verification, permission changes, secret retrieval, and account recovery. A wrong action here exposes data or hands control to the wrong person.
Compliance: anything governed by law or policy, such as data deletion under privacy rules, regulated disclosures, audit logging, and age or jurisdiction checks. A wrong action here creates legal and regulatory exposure.

If a scenario touches any of these three, the correct enforcement mechanism is code. Everything outside them, the tone of a reply, the order of paragraphs, whether a summary uses bullet points, is fair game for a prompt.

Financial: money moves, so enforce in code.
Security: access changes, so enforce in code.
Compliance: law applies, so enforce in code.

How the exam weaponises this rule

The Claude Certified Architect exam loves this knowledge point because it is easy to dress a wrong answer in convincing clothes. A question will describe a high stakes operation, then offer four solutions. Three are sophisticated prompt-based ideas: a more detailed system prompt, a curated set of few-shot examples, and a routing classifier that flags risky requests. The fourth adds a deterministic gate. Every prompt-based option sounds like real engineering, and a tired candidate picks the most elaborate one.

The decision rule cuts through the theatre. Because the scenario is financial, security, or compliance, all three prompt-based options share the same fatal property: they only reduce the failure rate. The classifier is itself a probabilistic model, so it can miss; the examples bias behaviour without bounding it; the longer prompt is still a request, not a wall. The single deterministic option is the answer regardless of how plain it looks beside its rivals.
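A rough calculation shows what "only reduce the failure rate" means in practice. The sketch below assumes the three probabilistic layers miss independently, and the miss rates are invented for illustration; neither assumption comes from the exam or from any measurement.

```python
from math import prod


def residual_failure_rate(miss_rates: list[float]) -> float:
    """Chance that every probabilistic layer misses, assuming independent layers."""
    return prod(miss_rates)


# Hypothetical miss rates for the three probabilistic options.
layers = {
    "detailed system prompt": 0.01,
    "few-shot examples": 0.05,
    "routing classifier": 0.02,
}
residual = residual_failure_rate(list(layers.values()))
# Stacking layers shrinks the rate, but a product of nonzero rates is never zero.
print(f"residual failure rate with all three layers: {residual:.6%}")
```

Even with every layer stacked, the residual rate stays above zero, whereas a deterministic gate refuses the disallowed call every time it is evaluated.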

Applying the decision rule to an incoming rule

[Diagram: decision flow for an incoming rule. One branch decides everything: classify the consequence first.]

Worked example: classifying two rules in the same agent

The rule is not applied to a whole agent; it is applied rule by rule, because a single agent usually mixes stakes.

A payments support agent has two rules: it must greet customers warmly, and it must never execute a transfer over 10,000 dollars without dual approval. Apply the decision rule to each.

Start with the greeting rule. What happens if the agent forgets to be warm and opens with a curt one-liner? A customer feels mildly underwhelmed. There is no money, no access change, and no legal exposure, so the consequence is cosmetic. The greeting rule is low stakes and belongs in the system prompt, where an instruction like "open every reply with a friendly, personal greeting" is entirely sufficient. If it slips occasionally, nothing breaks.

Now take the transfer rule. What happens if the agent executes a 50,000 dollar transfer without the second approval? Real money moves to a possibly wrong destination, and the failure is irreversible. This is squarely financial, so the decision rule says enforce it in code. The implementation is a PreToolUse hook on the transfer tool that denies any call above the threshold unless a dual-approval flag is present. Notice the payoff: the same agent runs a prompt for the cheap rule and a hard gate for the expensive one, spending engineering effort exactly where the downside justifies it. Once the split is right, the prerequisite gate design knowledge point teaches you how to build the gate itself.
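The shape of such a gate can be sketched in plain Python. This is not the real hook interface of any SDK: the function signature, the tool name `execute_transfer`, and the input fields `amount_usd` and `dual_approved` are all hypothetical stand-ins, and only the 10,000 dollar threshold comes from the example.

```python
from dataclasses import dataclass

TRANSFER_TOOL = "execute_transfer"   # hypothetical tool name
DUAL_APPROVAL_THRESHOLD = 10_000     # dollars, from the worked example


@dataclass(frozen=True)
class HookDecision:
    allow: bool
    reason: str


def pre_tool_use(tool_name: str, tool_input: dict) -> HookDecision:
    """Runs before every tool call and decides whether it may proceed."""
    if tool_name != TRANSFER_TOOL:
        return HookDecision(True, "not a transfer")
    amount = tool_input.get("amount_usd")
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        # Fail closed: a missing or malformed amount is denied, not guessed.
        return HookDecision(False, "amount missing or not numeric")
    if amount > DUAL_APPROVAL_THRESHOLD and tool_input.get("dual_approved") is not True:
        return HookDecision(False, "transfer over threshold requires dual approval")
    return HookDecision(True, "within policy")
```

With this sketch, `pre_tool_use("execute_transfer", {"amount_usd": 50_000})` is denied no matter what the prompt said. In a real system the approval flag would need to come from a trusted approval record rather than from input the model can write, otherwise the gate only checks a value the model controls.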

Misconceptions the rule is built to defeat

Misconception: A routing classifier that detects risky requests is a deterministic safeguard, so it satisfies the high stakes requirement.

What's actually true: A classifier is itself a probabilistic model, so it can misclassify and let a dangerous request through. It is an improvement over a bare prompt but still sits on the probabilistic side of the line and does not meet the bar for high stakes operations.

Misconception: If an operation is high stakes, the whole agent must be locked down with code and prompts become useless.

What's actually true: The rule is applied per rule, not per agent. A single agent commonly enforces its financial rule with code while handling tone and formatting with a prompt. You spend determinism only where the consequence demands it.

Walking the rule across an order-management agent

The rule becomes second nature when you run it across a single agent's full menu of actions. Take an order-management assistant. It can process a chargeback, reset a customer's password, update their marketing email preference, reword an apology, and apply a loyalty discount. Five actions, and the decision rule sorts them in moments. A chargeback moves money, so it is financial and demands code. A password reset changes who can access an account, so it is security and demands code. Those two get gates.

The remaining three reveal the other half of the rule. Updating a marketing email preference harms no one if it briefly lags, so in this scenario it is cosmetic and rides on a prompt, although an opt-out that the law requires you to honour would be a compliance action instead. Rewording an apology is pure tone, equally a prompt. The loyalty discount is the interesting one: if the discount is capped and pre-approved, applying it is low risk and a prompt suffices, but if the agent can invent an arbitrary discount, money is suddenly in play and the cap itself must be enforced in code. The lesson is that you classify the action and its consequence, not the feature it belongs to, because the same agent legitimately mixes both kinds of rule.
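Enforcing the discount cap in code might look like the sketch below. The cap value, the function name `apply_loyalty_discount`, and the `DiscountRejected` exception are hypothetical; the point is only that the limit lives in the tool, where the model cannot negotiate it.

```python
MAX_LOYALTY_DISCOUNT_PERCENT = 15  # hypothetical pre-approved cap


class DiscountRejected(Exception):
    """Raised when a requested discount falls outside the approved range."""


def apply_loyalty_discount(order_total_cents: int, requested_percent: int) -> int:
    """Return the discounted total in cents, refusing anything above the cap."""
    if not 0 <= requested_percent <= MAX_LOYALTY_DISCOUNT_PERCENT:
        raise DiscountRejected(
            f"{requested_percent}% is outside 0-{MAX_LOYALTY_DISCOUNT_PERCENT}%"
        )
    # Integer arithmetic on cents avoids floating point rounding on money.
    return order_total_cents - (order_total_cents * requested_percent) // 100
```

The agent can still decide when a discount is appropriate, which is a judgement a prompt can guide, but the size of the discount is bounded regardless of what it decides.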

Consequence, not frequency, is the trigger

A frequent mistake is to reason from how often an action happens rather than how much a single failure costs. Teams sometimes argue that a rare operation is not worth a gate because it almost never runs, or that a constant cosmetic behaviour deserves heavy enforcement because it is everywhere. Both invert the rule. The decision rule keys on consequence severity alone: a once-a-month wire transfer still moves real money, so it is high stakes despite its rarity, while a greeting that runs on every single message is still cosmetic and still belongs in a prompt.

Frequency does matter for one thing: it tells you how often a residual failure rate will bite, which can sharpen the urgency of adding a gate. But it never changes the category. When a stem tempts you with how seldom or how often an action occurs, treat that as noise and return to the only question that decides the mechanism, namely what one failure would cost.
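The separation between category and frequency can be made explicit in a short sketch. The `Rule` fields and the residual failure rate are hypothetical values picked for illustration; what matters is which function reads which field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    high_stakes: bool              # financial, security, or compliance consequence
    runs_per_year: int
    residual_failure_rate: float   # hypothetical rate if left to a prompt


def mechanism(rule: Rule) -> str:
    # Frequency is deliberately not consulted here.
    return "code" if rule.high_stakes else "prompt"


def expected_failures_per_year(rule: Rule) -> float:
    # Frequency only tells you how often the residual rate would bite.
    return rule.runs_per_year * rule.residual_failure_rate


wire = Rule("monthly wire transfer", True, 12, 0.001)
greeting = Rule("greeting on every message", False, 1_000_000, 0.001)
for rule in (wire, greeting):
    print(rule.name, mechanism(rule), expected_failures_per_year(rule))
```

The greeting is expected to fail far more often than the wire transfer under these numbers, yet it still belongs in a prompt, because `mechanism` never looks at `runs_per_year`.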

Reading stakes that the scenario hides

The hardest version of this question disguises a high stakes action as something mundane. An agent that "looks up a customer detail" may be reading protected personal data, which is a compliance action. An agent that "records what it did" may be writing an audit log a regulator will later inspect, which is a compliance action too. An agent that "confirms the customer is eligible" may be performing the exact access check that everything downstream depends on, which is security. None of these announce themselves with the words financial, security, or compliance.

The countermeasure is to translate every described action into its underlying effect before you classify it. Ask what data is touched, what changes in the world, and who could be harmed if it went wrong. Once an innocuous-sounding step is restated as "discloses regulated personal information" or "irreversibly moves funds," the rule fires cleanly and the deterministic option becomes obvious. Learning to perform that translation under time pressure is exactly what the evaluate-level workflow enforcement scenario analysis drills.

How this knowledge point is tested

Because task statement 1.4 sits inside the customer-support resolution scenario, expect a vignette: an agent that issues refunds, verifies accounts, or processes transfers, and a subtle failure that auditors discover. The Bloom level here is apply, so you are not asked to recite the rule but to use it: read the consequence in the scenario, classify it, and select the mechanism that matches. The wrong answers will be plausible prompt-based mitigations, and the right answer will be the lone deterministic control.

The fastest route to a correct response is to ignore how clever each option sounds and ask the decision rule's one question about the operation in the stem. If money, access, or law is on the line, eliminate every prompt, every example set, and every classifier in a single sweep, then pick what remains. Practise that reflex on the open-ended workflow enforcement scenario analysis and it becomes automatic.

Check your understanding

A healthcare scheduling agent has four behaviours under review. Which one should be enforced with deterministic code rather than a system prompt?
