An engine that fans one request to many models needs two orthogonal dials: autonomy (how much human oversight a piece of work requires) and cost (how much money a model call may spend). Alembic encodes the first as the Tier ladder T0→T4, the second as a model registry priced per 1k tokens plus a fail-closed BudgetGuard. This lesson connects them: how a tier routes to the cheapest qualifying model, how every paid call is metered, and why a non-positive budget means "free tier only" — never "unlimited."
The Tier enum has exactly five rungs, ordered from least human oversight (T0) to most (T4). The rungs measure oversight, not money:
| Tier | Meaning | Autonomous? |
|---|---|---|
| T0 | silent / fully autonomous, no human in the loop | yes (the silent path) |
| T1 | autonomous with lightweight logging | yes |
| T2 | autonomous, single reviewer notified | yes |
| T3 | autonomous, council review required | yes |
| T4 | PARK — withheld from autonomous execution; needs council + human | no |
Two facts make this fail-closed: DEFAULT_TIER = T4 (unclassified work parks, Lesson 26) and isAutonomous returns true only for T1–T3 (tier.ts:59). Separately, LOCAL is not a sixth tier — it's an orthogonal marker tagging work that must stay on local/$0 models "regardless of tier" (privacy- or cost-sensitive paths, tier.ts:36). A unit can be T2 and LOCAL.
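The fail-closed behaviour described above can be sketched in a few lines. This is a minimal illustration, not the contents of tier.ts: `Tier`, `DEFAULT_TIER`, and `isAutonomous` are named in the text, while the `WorkUnit` shape, its `local` flag, and the `effectiveTier` helper are hypothetical names introduced here. `isAutonomous` follows the sentence above and returns true only for T1–T3.

```typescript
// Hypothetical sketch; the real tier.ts may be shaped differently.
const TIERS = ['T0', 'T1', 'T2', 'T3', 'T4'] as const;
type Tier = (typeof TIERS)[number];

// Unclassified work parks instead of running unattended.
const DEFAULT_TIER: Tier = 'T4';

// As stated in the text: true only for T1, T2, and T3.
const isAutonomous = (tier: Tier): boolean =>
  tier === 'T1' || tier === 'T2' || tier === 'T3';

// LOCAL is modelled as a separate flag, not as a sixth Tier member
// (field name `local` is an assumption for this sketch).
interface WorkUnit {
  id: string;
  tier?: Tier;
  local: boolean;
}

// A unit with no tier falls back to DEFAULT_TIER, so it parks.
const effectiveTier = (unit: WorkUnit): Tier => unit.tier ?? DEFAULT_TIER;

const unclassified: WorkUnit = { id: 'u1', local: false };
const privateReview: WorkUnit = { id: 'u2', tier: 'T2', local: true };

console.log(effectiveTier(unclassified)); // T4
console.log(isAutonomous(effectiveTier(unclassified))); // false
console.log(isAutonomous(effectiveTier(privateReview)), privateReview.local); // true true
```

Keeping LOCAL outside the tier type means a unit's oversight level and its model-placement constraint can vary independently, which is the point of calling the marker orthogonal.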
Each routable model is a registry entry with an adapterId, a tier, and two prices — costPer1kInputUsd and costPer1kOutputUsd. There are 11 entries spread across the tiers (the map records this; representative values shown):
```ts
// packages/contracts/src/registry.ts: shape + representative rows
{ modelId: 'local-default',       adapterId: 'local',       tier: T0, in: 0,       out: 0      } // $0 hermetic
{ modelId: 'local-extract',       adapterId: 'local',       tier: T1, in: 0,       out: 0      }
{ modelId: 'qwen3.7-plus',        adapterId: 'cliproxyapi', tier: T1, in: 0.00015, out: 0.0005 }
{ modelId: 'glm-5.2',             adapterId: 'cliproxyapi', tier: T2, in: 0.0002,  out: 0.0006 } // DEFAULT_MODEL_ID
{ modelId: 'gpt-5.5-xhigh',       adapterId: 'cliproxyapi', tier: T3, in: 0.015,   out: 0.045  }
{ modelId: 'claude-opus-4-8-max', adapterId: 'cliproxyapi', tier: T3, in: 0.03,    out: 0.15   }
```
The split across tiers (per the complete-map §6): T0 = local-default, local-extract ($0); T1 = kimi-k2.7-code-highspeed, grok-composer-2.5-fast, qwen3.7-plus; T2 = deepseek-v4-pro, gemini-3.5-flash, glm-5.2, qwen3.7-max; T3 = gpt-5.5-xhigh, claude-opus-4-8-max. The local adapter keeps T0 at $0 for hermetic CI. (Model ids/prices are illustrative registry data and evolve; the shape is the contract.)
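To make the per-1k pricing concrete, here is a small worked example of turning token counts into USD. The `priceCall` helper and the `Usage` shape are hypothetical names for this sketch (the text names a `costForModel(modelId, usage)` function, whose real signature and rounding are not shown here). The prices are the illustrative glm-5.2 and claude-opus-4-8-max rows above.

```typescript
// Hypothetical shapes for illustration only.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}
interface Price {
  costPer1kInputUsd: number;
  costPer1kOutputUsd: number;
}

// cost = (input tokens / 1000) * input price + (output tokens / 1000) * output price
const priceCall = (price: Price, usage: Usage): number =>
  (usage.inputTokens / 1000) * price.costPer1kInputUsd +
  (usage.outputTokens / 1000) * price.costPer1kOutputUsd;

const glm: Price = { costPer1kInputUsd: 0.0002, costPer1kOutputUsd: 0.0006 };
const opus: Price = { costPer1kInputUsd: 0.03, costPer1kOutputUsd: 0.15 };
const usage: Usage = { inputTokens: 4000, outputTokens: 1000 };

// The same 4k-in / 1k-out call priced at a T2 row and at a T3 row.
console.log(priceCall(glm, usage).toFixed(4)); // 0.0014
console.log(priceCall(opus, usage).toFixed(4)); // 0.2700
```

With these illustrative prices the same call costs roughly 190 times more on the T3 row than on the T2 row, which is why routing to the cheapest qualifying model matters.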
When a tier is chosen but no specific modelId is pinned, pickCheapestForTier selects the cheapest entry of that tier by combined per-1k cost. It's pure over the registry — a deterministic fold, no side effects:
```ts
// packages/contracts/src/registry.ts:145-156
export const pickCheapestForTier = (tier, registry = MODEL_REGISTRY) => {
  const candidates = Object.values(registry).filter((e) => e.tier === tier);
  if (candidates.length === 0) return undefined;
  return candidates.reduce((cheapest, entry) => {
    const entryCost = entry.costPer1kInputUsd + entry.costPer1kOutputUsd; // in + out
    const bestCost = cheapest.costPer1kInputUsd + cheapest.costPer1kOutputUsd;
    return entryCost < bestCost ? entry : cheapest;
  });
};
```
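A short usage sketch shows three behaviours of this fold: the normal pick, the tie case, and the empty case. The function body mirrors the excerpt, but the type annotations, the `ModelEntry` shape, the `demo` registry, and the `tie-a`/`tie-b` entries are assumptions made for this example, not registry data.

```typescript
type Tier = 'T0' | 'T1' | 'T2' | 'T3' | 'T4';

// Assumed entry shape; only the fields the fold reads are included.
interface ModelEntry {
  modelId: string;
  tier: Tier;
  costPer1kInputUsd: number;
  costPer1kOutputUsd: number;
}

const pickCheapestForTier = (
  tier: Tier,
  registry: Record<string, ModelEntry>,
): ModelEntry | undefined => {
  const candidates = Object.values(registry).filter((e) => e.tier === tier);
  if (candidates.length === 0) return undefined;
  return candidates.reduce((cheapest, entry) => {
    const entryCost = entry.costPer1kInputUsd + entry.costPer1kOutputUsd;
    const bestCost = cheapest.costPer1kInputUsd + cheapest.costPer1kOutputUsd;
    // Strict `<` means an equally priced later entry never displaces an earlier one.
    return entryCost < bestCost ? entry : cheapest;
  });
};

// Hypothetical demo registry: two illustrative T3 rows plus two made-up tied T2 rows.
const demo: Record<string, ModelEntry> = {
  'gpt-5.5-xhigh': { modelId: 'gpt-5.5-xhigh', tier: 'T3', costPer1kInputUsd: 0.015, costPer1kOutputUsd: 0.045 },
  'claude-opus-4-8-max': { modelId: 'claude-opus-4-8-max', tier: 'T3', costPer1kInputUsd: 0.03, costPer1kOutputUsd: 0.15 },
  'tie-a': { modelId: 'tie-a', tier: 'T2', costPer1kInputUsd: 0.001, costPer1kOutputUsd: 0.001 },
  'tie-b': { modelId: 'tie-b', tier: 'T2', costPer1kInputUsd: 0.001, costPer1kOutputUsd: 0.001 },
};

console.log(pickCheapestForTier('T3', demo)?.modelId); // gpt-5.5-xhigh
console.log(pickCheapestForTier('T2', demo)?.modelId); // tie-a
console.log(pickCheapestForTier('T0', demo)); // undefined
```

Because ties resolve to the first entry in the registry's iteration order, the result depends only on the registry's contents and key order. Callers still have to handle `undefined` when a tier has no entries.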
Routing decides which model; the BudgetGuard decides whether the call may happen at all. It's created with a hard USD cap and three methods — and its defaults are deliberately paranoid:
```ts
// packages/etl/src/budget.ts:120-152 (condensed)
export const createBudgetGuard = (capUsd) => {
  const cap = Math.max(0, capUsd); // a non-positive cap clamps to 0 = free-tier only
  let spent = 0;
  return {
    check(estimate) {
      const projectedUsd = priceEstimate(estimate);
      if (projectedUsd === 0) return { ok: true, projectedUsd: 0, … }; // free call always passes
      if (roundUsd(spent + projectedUsd) > cap)
        return { ok: false, reason: 'budget_exceeded', … }; // would overshoot ⇒ BLOCK
      return { ok: true, projectedUsd, remainingUsd: remaining() };
    },
    record(spend) {
      spent = roundUsd(spent + costOf(spend));
      return spent;
    },
  };
};
```
cap = Math.max(0, capUsd): a non-positive cap doesn't mean "no limit." It clamps to 0, which means free tier only: any call with projectedUsd > 0 is blocked, because spent + projectedUsd is then always greater than a cap of 0. So the safe default (no budget set) is the cheapest possible posture, never a runaway spend. Free-tier calls (projectedUsd === 0) always pass, which is why the whole funnel can run $0 and hermetic on the local adapter (Lesson 15).
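The clamp is easy to exercise with a simplified stand-in for the guard. This sketch assumes the estimate has already been priced in USD (the real `check` calls `priceEstimate`), and its `roundUsd` rounds to six decimals, which is a guess at the helper's behaviour rather than the code in budget.ts.

```typescript
// Assumed rounding: six decimal places. The real roundUsd may differ.
const roundUsd = (usd: number): number => Math.round(usd * 1e6) / 1e6;

type CheckResult =
  | { ok: true; projectedUsd: number }
  | { ok: false; reason: 'budget_exceeded' };

// Simplified guard: takes pre-priced USD amounts instead of estimates and run results.
const createBudgetGuard = (capUsd: number) => {
  const cap = Math.max(0, capUsd); // a non-positive cap clamps to 0
  let spent = 0;
  return {
    check(projectedUsd: number): CheckResult {
      if (projectedUsd === 0) return { ok: true, projectedUsd: 0 }; // free calls always pass
      if (roundUsd(spent + projectedUsd) > cap) return { ok: false, reason: 'budget_exceeded' };
      return { ok: true, projectedUsd };
    },
    record(costUsd: number): number {
      spent = roundUsd(spent + costUsd);
      return spent;
    },
  };
};

// A negative cap behaves exactly like a cap of 0: free calls only.
const freeOnly = createBudgetGuard(-5);
console.log(freeOnly.check(0).ok); // true
console.log(freeOnly.check(0.0008).ok); // false

// A positive cap admits paid calls until the projected total would exceed it.
const capped = createBudgetGuard(0.01);
console.log(capped.check(0.006).ok); // true
capped.record(0.006);
console.log(capped.check(0.006).ok); // false
console.log(capped.check(0.004).ok); // true
```

The last call lands exactly on the cap and passes because the comparison is a strict `>` on the rounded total. Rounding before comparing keeps floating-point noise from blocking a call that fits exactly.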
The map states a subtle invariant: pricing is always applied. Look at record's helper costOf — even when a ModelRunResult carries a costUsd, a free-tier model is metered as 0 (isFreeTierModel(spend.modelId), budget.ts:110), and a paid model falls back to costForModel(modelId, usage) if no explicit cost is present. Spend is never guessed and never skipped: T2/T3 calls are metered to the registry price, so the running spent total is authoritative.
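The precedence described above (free-tier check first, then explicit cost, then registry price) can be sketched as follows. The names `costOf`, `isFreeTierModel`, `costForModel`, and `ModelRunResult` come from the text; their bodies, the `PRICES` table, the `Usage` shape, and the choice to throw on an unknown model are assumptions for this illustration.

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
}
interface ModelRunResult {
  modelId: string;
  usage: Usage;
  costUsd?: number;
}
interface Price {
  costPer1kInputUsd: number;
  costPer1kOutputUsd: number;
}

// Hypothetical price table standing in for MODEL_REGISTRY.
const PRICES: Record<string, Price> = {
  'local-default': { costPer1kInputUsd: 0, costPer1kOutputUsd: 0 },
  'glm-5.2': { costPer1kInputUsd: 0.0002, costPer1kOutputUsd: 0.0006 },
};

const priceOf = (modelId: string): Price => {
  const price = PRICES[modelId];
  // Sketch choice: refuse to meter a model that has no price entry.
  if (price === undefined) throw new Error(`unknown model: ${modelId}`);
  return price;
};

// Assumed definition: a model is free-tier when both of its prices are 0.
const isFreeTierModel = (modelId: string): boolean => {
  const price = priceOf(modelId);
  return price.costPer1kInputUsd === 0 && price.costPer1kOutputUsd === 0;
};

const costForModel = (modelId: string, usage: Usage): number => {
  const price = priceOf(modelId);
  return (
    (usage.inputTokens / 1000) * price.costPer1kInputUsd +
    (usage.outputTokens / 1000) * price.costPer1kOutputUsd
  );
};

const costOf = (spend: ModelRunResult): number => {
  if (isFreeTierModel(spend.modelId)) return 0; // checked before costUsd is read
  return spend.costUsd ?? costForModel(spend.modelId, spend.usage);
};

const usage: Usage = { inputTokens: 4000, outputTokens: 1000 };
console.log(costOf({ modelId: 'local-default', usage, costUsd: 0.01 })); // 0
console.log(costOf({ modelId: 'glm-5.2', usage, costUsd: 0.002 })); // 0.002
console.log(costOf({ modelId: 'glm-5.2', usage }).toFixed(4)); // 0.0014
```

Ordering the free-tier check first is what keeps a stray `costUsd` on a $0 model from leaking into the running total.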
You call createBudgetGuard(0) (or pass a negative cap). What can run?
cap = Math.max(0, capUsd) clamps to 0, and check passes a call only if projectedUsd === 0 or it fits under the cap. With cap 0, only free calls (LOCAL/T0) pass. This is fail-closed: the default posture spends nothing.

With no pinned modelId, which entry does pickCheapestForTier(T2) return, and is the choice deterministic?
The T2 entry with the lowest costPer1kInputUsd + costPer1kOutputUsd. It's pure over MODEL_REGISTRY (no Date/random), so the same registry always yields the same pick, which matters for replay (Lesson 28).

A ModelRunResult from a free-tier model carries a stray costUsd: 0.01. How does record meter it?
costOf (budget.ts:109-111) returns 0 for a free-tier model before reading costUsd. Pricing is always applied from the model's tier/registry, so a stray field can't inflate the spend total. Metering is authoritative, not advisory.

Tier is exactly T0–T4. LOCAL is an orthogonal marker that pins work to $0 models regardless of its tier. A unit can be both T2 and LOCAL: autonomous-with-a-reviewer, but kept on local models for privacy or cost.