When the Rate Card Has Tiers

KellerAI

KellerAI Executive Brief · June 2026 · Frontier Tier Governance

When the Rate Card Has Tiers

Anthropic priced the frontier at exactly double the model you already run. Your bill is now a function of which tier you bought, what your workload triggers, and how many turns your agents burn.

On June 9, 2026, Anthropic released Claude Fable 5 at $10 per million input tokens and $50 per million output — exactly 2.0x Claude Opus 4.8 on every pricing dimension the company publishes. For the first time, one capability frontier ships with a tiered rate card. This brief explains what the premium buys, why caching and batching cannot make it go away, the new billing rules that ride along, and the one measurement that decides when the frontier tier pays. The evidence — every price, document, and field name — lives in the companion whitepaper.

Section 01

Three Numbers Now Decide Your Bill

Until last week, the rate card had one shape: each model, one published price. Our earlier paper in this lane, The Token You Didn't Count, showed why your bill stopped tracking that price anyway: token counts you do not control, metering on every surface, agent loops that multiply both. Through all of it, the rate card itself stood still. The Fable 5 release moves it. Anthropic now sells the frontier as a tier: one underlying model, Anthropic states, shipped as Fable 5 for general use and as Mythos 5 for vetted partners, both priced at double the company's workhorse, Claude Opus 4.8.

That makes your AI bill a function of three things, and you control only the first: which tier you bought, what your workload triggers in the safeguard classifiers, and how many turns your agents burn at whatever rate the tier set. The question that matters at budget time is no longer “what does the token cost.” It is “what does the outcome cost, at which tier.” The rest of this brief is how to answer it with your own numbers.

Section 02

Exactly 2.0x, on Every Dimension

Anthropic prices both new models identically: $10 per million input tokens, $50 per million output. The announcement anchors that number in one direction only — “less than half the price of Claude Mythos Preview,” a predecessor that was limited to roughly fifty vetted organizations and whose price appears in no public Anthropic document we could locate. The press uniformly chose the other anchor. TechCrunch called it double the price of Opus 4.8; The Decoder wrote that the price almost doubles; VentureBeat's independent comparison places Fable 5 as the most expensive major AI model available globally. Both anchors are arithmetically true. The difference is the comparison you are invited to make: a model you could never buy, or the one you already run.

So compare against the one you already run. Anthropic's own pricing table lists Opus 4.8 at $5 base input, $6.25 and $10 for the two cache-write tiers, $0.50 for cache reads, and $25 for output, per million tokens. The same table lists Fable 5 at $10, $12.50, $20, $1, and $50. Every published dimension is exactly 2.0x — not roughly, not on average. The exactness is good news for governance: the premium is one clean number, with no marketing blur to argue about. The frontier costs twice the workhorse. The only question left is whether it is worth twice as much for your work, and that question has a measurable answer.

Section 03

The Loop Burns the Doubled Rate

The predecessor paper's strongest claim matters most here: agentic costs multiply, they do not add. An agent loops — plan, act, observe, revise — and every turn is billable whether or not it advanced the task, so the bill is rate times tokens times turns. The tier premium is a new factor in that product, and it just set itself to 2.0x. The incidents are already on record at workhorse rates: Uber reportedly exhausted its full-year 2026 AI budget by April, and the FinOps Foundation found AI spend management going near-universal in a single year, with the spend described as functionally hard to forecast. Every one of those happened at half today's frontier price.

The bar for the frontier tier is easy to state. A more capable model may finish tasks in fewer turns — but at twice the rate per turn, a task that takes forty turns on Opus 4.8 must finish in fewer than twenty on Fable 5 to win on cost alone. Nothing on the rate card tells you whether your workload clears that bar. Only measurement does.

Caching, the strongest offset, cannot change the decision. Anthropic's cache multipliers are uniform across models, so doubling the base price doubles the whole row: a turn that is 90 percent cache reads bills $1.90 per million effective input tokens on Fable 5 and $0.95 on Opus 4.8 — still exactly 2.0x. The batch lane behaves the same way. Anthropic discounts both tiers by half, so the ratio survives there too, with one usable coincidence: a batched Fable 5 token costs what an interactive Opus 4.8 token costs. Caching and batching lower the absolute bill. They leave the tier decision exactly where it was.

Caching lowers the bill and leaves the ratio untouched. You cannot cache your way out of the tier decision.

The tier decision

Section 04

The Billing Rules That Ride Along

Fable 5 ships with a disclosed safeguard: requests its classifiers flag are refused or, where you opt in, handed to Opus 4.8. This series governs that mechanism in its own pair of papers. What belongs here is the money — because the billing rules are documented, and they are tier-aware in a way no previous rate card needed to be. Anthropic states that a request refused before any output is not billed at all. When a fallback runs, you pay for the model that actually serves the request: each attempt bills at the rates of the model that ran it, so the frontier premium is charged only when the frontier model actually answers. A new primitive, the fallback credit, refunds the cache cost of switching models — as far as we can determine, the first billing construct invented for the tier era.

The wrinkles sit one layer down. In an agent harness, a fallback is a model switch, each model keeps its own cache, and the switched turn re-reads your entire conversation history at uncached rates — the deeper the session, the bigger the one-time toll. After a switch, both surfaces keep you switched: Anthropic's consumer apps leave the model picker on Opus for the rest of the conversation, and the API routes later turns of a fallen-back conversation to the fallback model for about an hour. One trigger can convert a session's remaining turns to the workhorse tier. And on subscription plans, The Decoder reports that frontier turns draw usage credits at twice the rate, with no stated carve-out for turns the fallback model served. Your effective tier is not a setting. It is an emergent property of your content and the classifiers.

Section 05

When Does Frontier Pay

Every request now has a tier choice: Fable 5 at 2.0x, or Opus 4.8 directly. The rate card cannot make that choice. It knows the premium; it does not know your workload.

Anthropic reports that more than 95 percent of Fable sessions involve no fallback at all — a vendor figure, measured on pre-launch traffic, with sessions rather than requests as the unit, and nothing published at launch lets anyone outside Anthropic verify it. More important, it is a global average, and triggers concentrate by domain. SANS reported routine incident-response and forensics workflows auto-routing to Opus 4.8 in initial testing, and Anthropic itself describes the safeguard tuning as intentionally conservative. A security or life-sciences workload can live on the wrong side of the average — paying frontier subscription credits and harness cache tolls for answers the workhorse model produced. For that workload, routing straight to Opus 4.8 is cheaper and equivalent.

The decision artifact is the one our earlier paper already gave you, with one new column: cost per outcome, per tier. Anthropic reports large capability gains for Fable 5 on its own benchmark harness; no independent reproduction existed at launch. Those deltas are the hypothesis. Your measured cost per completed task, by tier, is the test. Route a workload to the frontier tier only while its measured cost per outcome at Fable 5 beats its cost per outcome at Opus 4.8 — and let that measurement, not the rate card and not the benchmark slide, decide which of your workloads are allowed to be expensive.

For the full argument — the 2.0x arithmetic verified dimension by dimension, the fallback credit and sticky-routing mechanics, the cache and batch invariance, the per-tier cost-observability design, and the SR 26-2, NIST, ISO, and EU AI Act mappings — read the companion technical whitepaper, The Cost Tier of Frontier Access .

End of brief

↑ Back to top

Twice the price, exactly

Context

The Finding

When the Rate Card Has Tiers

Three Numbers Now Decide Your Bill

Exactly 2.0x, on Every Dimension

The Loop Burns the Doubled Rate

The Billing Rules That Ride Along

When Does Frontier Pay