Before any fusion, you need the host. Alembic is a plan-execution engine for agent swarms: it takes a goal, an executable plan, and a validation contract, then runs units of work across many models — routing by cost, proving each step, gating before it ships. This lesson is its shape: six layers, one load-bearing contract, four invariants, and the funnel that turns raw sources into Learnings.
Counts are source-verified. The map records 19 workspace packages + 1 app; the test suite was at 415 when the map was written and grew to ~565 after @alembic/hermes landed — the same 565 the case study in Lesson 6 runs.
Alembic is layered top-to-bottom, and the layering is real: the import graph has no upward edges and no cycles. Each layer may depend only on the layers below it. Read the table from the bottom: the foundation is the vocabulary, and the top is where humans and tools plug in.
| Layer | What it owns | Packages |
|---|---|---|
| L4 · CLIENTS | The surfaces humans and tools use: CLI, the HTTP+SSE harness server, a read-only MCP server, the web cockpit, the TUI. | harness, web, tui, apps/cli |
| L3 · SWARM | Multi-tier orchestration: an orchestrator spawns a lead that spawns workers, over a dependency-gated task queue, depth-bound, with git-worktree isolation and crash-safe resume. | swarm |
| L2 · ENGINE | The decision kernel: a qualitative DebateEngine, 0–10 quantitative scoring, an independent maker-checker Verifier, and an N-lens panel that is the T3+ emission gate. | council |
| L1 · ADAPTER | The narrow waist — every model call is one function shape that never throws. Six adapters + offline + a router with no silent fallback, retry, circuit-breaker, cost accounting. | adapters |
| L0 · SUBSTRATE | The deterministic, $0 floor: the Zod vocabulary (every type is a z.infer), the streaming corpus reader, SHA-256 dedupe, content-addressed append-only JSONL stores, PII redaction, the budget guard, run directories, the model registry. | contracts (vocabulary), etl (the deterministic layer) |
| L-1 · SOURCE | The ingestion layer that feeds the wiki the ETL later distills — a read-only Collector + an agent-browser wrapper that can only navigate, never mutate. | ingestion |
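The "no upward edges" rule can be checked mechanically. The sketch below is a hypothetical check, not Alembic's own tooling: the layer numbers are copied from the table, while `checkLayering`, the `ImportEdge` type, and the sample edges are invented for illustration. It also assumes that imports within a single layer (for example etl importing contracts) are allowed, which the table does not state.

```typescript
// Layer numbers copied from the table above. Glue packages are not listed there,
// so the check reports them as unmapped instead of guessing a layer.
const LAYER: Record<string, number> = {
  harness: 4, web: 4, tui: 4, "apps/cli": 4,
  swarm: 3,
  council: 2,
  adapters: 1,
  contracts: 0, etl: 0,
  ingestion: -1,
};

type ImportEdge = { from: string; to: string };

// An edge is "upward" when the importing package sits on a lower layer than
// the package it imports. Same-layer imports pass (an assumption).
function checkLayering(edges: ImportEdge[]): { upward: ImportEdge[]; unmapped: ImportEdge[] } {
  const upward: ImportEdge[] = [];
  const unmapped: ImportEdge[] = [];
  for (const edge of edges) {
    const from = LAYER[edge.from];
    const to = LAYER[edge.to];
    if (from === undefined || to === undefined) unmapped.push(edge);
    else if (from < to) upward.push(edge);
  }
  return { upward, unmapped };
}

// Invented sample graph: three legal edges, one glue edge, one violation.
const sampleEdges: ImportEdge[] = [
  { from: "swarm", to: "council" },
  { from: "council", to: "adapters" },
  { from: "etl", to: "contracts" },
  { from: "mission", to: "council" }, // glue package, not in the table
  { from: "adapters", to: "swarm" },  // deliberately wrong: L1 importing L3
];

const layeringReport = checkLayering(sampleEdges);
console.log(layeringReport.upward, layeringReport.unmapped);
```

A check of this kind would fail a build on any non-empty `upward` list; cycle detection would need a separate graph traversal and is not shown.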
Glue that sits across the layers. A handful of packages orchestrate across L2–L4 and are best read as their own tier: @alembic/mission (compiles missions → run specs), @alembic/vm (executes alembic.plan.ts by injecting the h.* hooks), @alembic/coda (the four run-closing gates), @alembic/forge (the Forge front-end + Scope Gate), and @alembic/planf3 (plan HTML).
Here is the single most important idea in the codebase. Every model invocation in the entire system flows through one function shape: an async call that never throws and returns a discriminated union keyed on ok. Success and failure are both ordinary return values — there is no second, exceptional path to reason about.
    // packages/contracts/src/model.ts - the waist (shape)
    interface ModelAdapter {
      run(input: ModelRunInput): Promise<ModelRunResult>; // NEVER throws (the invariant)
    }

    // ModelRunResult is a discriminated union on `ok`:
    ModelRunSuccess = { ok: true; text; usage?; costUsd?; durationMs; modelId; ... }
    ModelRunFailure = { ok: false; error: { code; message; retryable }; durationMs; ... }
    ModelRunResult  = z.discriminatedUnion('ok', [ Success, Failure ])
Source: packages/contracts/src/model.ts:30-151. Governed by ADR-0009 ("narrow waist — run never throws").
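To make the call-site consequence concrete, here is a minimal sketch of a consumer that branches on `ok`. The types are a trimmed, hand-written stand-in for the real contracts (no Zod, fewer fields), and `describeResult`, `ask`, the two stub adapters, and the `RATE_LIMITED` code are all hypothetical names introduced for this example.

```typescript
// Trimmed stand-in for the waist; the real schema lives in packages/contracts.
type ModelRunSuccess = { ok: true; text: string; durationMs: number; modelId: string };
type ModelRunFailure = {
  ok: false;
  error: { code: string; message: string; retryable: boolean };
  durationMs: number;
};
type ModelRunResult = ModelRunSuccess | ModelRunFailure;

interface ModelAdapter {
  run(input: { prompt: string }): Promise<ModelRunResult>; // never throws
}

// Pure function over the union: TypeScript narrows each branch from `ok`.
function describeResult(result: ModelRunResult): string {
  if (result.ok) {
    return `ok from ${result.modelId}: ${result.text}`;
  }
  return `failed ${result.error.code} (retryable: ${result.error.retryable})`;
}

// Hypothetical stub adapters: one succeeds, one reports a rate limit as a value.
const echoAdapter: ModelAdapter = {
  async run(input): Promise<ModelRunResult> {
    return { ok: true, text: input.prompt.toUpperCase(), durationMs: 1, modelId: "stub-echo" };
  },
};
const rateLimitedAdapter: ModelAdapter = {
  async run(): Promise<ModelRunResult> {
    return {
      ok: false,
      error: { code: "RATE_LIMITED", message: "HTTP 429", retryable: true },
      durationMs: 1,
    };
  },
};

// The caller needs no try/catch: both outcomes arrive as ordinary return values.
async function ask(adapter: ModelAdapter, prompt: string): Promise<string> {
  return describeResult(await adapter.run({ prompt }));
}

ask(echoAdapter, "hello").then(console.log);
ask(rateLimitedAdapter, "hello").then(console.log);
```

Because the failure arm carries `retryable`, a caller can decide what to do next from the value alone, without inspecting exception types.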
"Never throws" is not a comment you hope holds — it is structurally enforced by a single reusable spine, runWithGuards. Each adapter implements only an inner attempt(); the spine wraps it, in order, with: (1) Zod validation of the input at the boundary, (2) a try/catch that converts any escaped throw into a typed ModelRunFailure, (3) an optional circuit-breaker gate, and (4) retry backoff driven by each result's retryable flag — with a final outer try/catch "as a last safety net to preserve the invariant."
Because the shape is uniform, every layer above L1 can be written as a pure kernel that just branches on ok. A 429, a timeout, a malformed response, a provider outage — all arrive as the same typed failure. There is no try/catch scattered through the engine, the swarm, or the funnel. The orchestration core even re-establishes the boundary for whole sub-runs with runDebateSafe / runSwarmSafe. One contract, enforced once, bought everywhere.
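The four guards that runWithGuards applies can be sketched in a few dozen lines. This is an illustration under stated assumptions, not the code at adapters/src/adapter-core.ts: `runWithGuardsSketch`, `GuardOptions`, and every error code are hypothetical, and a hand-rolled type check stands in for the Zod validation so the block runs without dependencies.

```typescript
type RunInput = { prompt: string };
type RunSuccess = { ok: true; text: string; durationMs: number };
type RunFailure = {
  ok: false;
  error: { code: string; message: string; retryable: boolean };
  durationMs: number;
};
type RunResult = RunSuccess | RunFailure;

interface GuardOptions {
  maxAttempts: number;
  backoffMs: (attempt: number) => number;
  isBreakerOpen?: () => boolean; // optional circuit-breaker gate
}

function failure(code: string, message: string, retryable: boolean, startedAt: number): RunFailure {
  return { ok: false, error: { code, message, retryable }, durationMs: Date.now() - startedAt };
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function runWithGuardsSketch(
  input: unknown,
  attempt: (input: RunInput) => Promise<RunResult>,
  options: GuardOptions,
): Promise<RunResult> {
  const startedAt = Date.now();
  try {
    // (1) Validate at the boundary (stand-in for the Zod schema).
    const prompt = (input as { prompt?: unknown } | null)?.prompt;
    if (typeof prompt !== "string") {
      return failure("INVALID_INPUT", "input.prompt must be a string", false, startedAt);
    }
    let last: RunResult = failure("NO_ATTEMPT", "maxAttempts must be at least 1", false, startedAt);
    for (let n = 1; n <= options.maxAttempts; n++) {
      // (3) Refuse to call while the breaker is open.
      if (options.isBreakerOpen?.()) {
        return failure("CIRCUIT_OPEN", "circuit breaker is open", false, startedAt);
      }
      // (2) Convert any escaped throw into a typed failure.
      try {
        last = await attempt({ prompt });
      } catch (err) {
        last = failure("ADAPTER_THROW", err instanceof Error ? err.message : String(err), false, startedAt);
      }
      // (4) Retry only when the result itself says it is retryable.
      if (last.ok || !last.error.retryable) return last;
      if (n < options.maxAttempts) await sleep(options.backoffMs(n));
    }
    return last;
  } catch (err) {
    // Last safety net: nothing may escape as an exception.
    return failure("UNEXPECTED", err instanceof Error ? err.message : String(err), false, startedAt);
  }
}
```

In this sketch an escaped throw is treated as non-retryable; whether the real spine classifies thrown errors that way is not stated in the source.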
A second, lighter Result<T, E> union (also keyed on ok, with value/error arms) exists for non-model fallible work — file IO, parsing, subprocess wrapping. It deliberately mirrors the model waist so both read identically at call sites. That discipline is Lesson 5.
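A minimal sketch of that lighter union, applied to JSON parsing. Only the `ok` key and the `value`/`error` arms come from the description above; `parseJson` and `readPort` are hypothetical helpers written for this example.

```typescript
// Same discriminant as the model waist, with value/error arms.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Wrap a throwing API once, at the edge, and hand back a value either way.
function parseJson(text: string): Result<unknown, string> {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Call sites read the same way as model calls: branch on `ok`, no try/catch.
function readPort(configText: string): Result<number, string> {
  const parsed = parseJson(configText);
  if (!parsed.ok) return parsed;
  const port = (parsed.value as { port?: unknown } | null)?.port;
  return typeof port === "number"
    ? { ok: true, value: port }
    : { ok: false, error: "config.port must be a number" };
}

console.log(readPort('{"port": 8080}'));
console.log(readPort("{not json"));
```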
The architecture rests on four properties (not more — the map enumerates exactly these). Each is asserted in source, and most are governed by an ADR.
| # | Invariant | How it is held |
|---|---|---|
| 1 | run() never throws; the result is a uniform discriminated union. | Structural, via runWithGuards (adapters/src/adapter-core.ts:118). ADR-0009. |
| 2 | Engines are adapter-agnostic AND store-agnostic — pure kernels with injected side-effects. | The DebateEngine takes readonly views + an injected AdapterRegistry; the ETL routes all IO through an injectable FsPort; the funnel takes an injected registry (so an offline one makes the run $0). |
| 3 | Content-addressed IDs + deterministic run-dir layout (so runs replay). | A run's id is the SHA-256 content hash of its spec; stores are content-addressed over canonical JSON, so re-appending identical content is a no-op. Plan modules may not use Date.now()/Math.random(). |
| 4 | Dissent is preserved/forced by the Verifier, not merely by a prompt. | The maker-checker Verifier is read-only by architecture and proves claims with deterministic oracles over structured evidence, never the maker's prose. "Contrarian-last" is a hard board-load error. ADR-0003. |
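Invariant 3 is the easiest to show in code. The sketch below assumes one particular canonical form (object keys sorted recursively, no whitespace) and an in-memory map in place of the JSONL files; `canonicalJson`, `contentId`, and `AppendOnlyStoreSketch` are hypothetical names, and Alembic's actual canonicalization may differ.

```typescript
import { createHash } from "node:crypto";

// Canonical JSON for JSON-compatible values: keys sorted recursively so that
// logically equal objects serialize to the same bytes.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalJson(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(obj[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// The ID is the SHA-256 of the canonical bytes, so it depends on content only.
function contentId(value: unknown): string {
  return createHash("sha256").update(canonicalJson(value)).digest("hex");
}

// In-memory stand-in for a content-addressed, append-only store.
class AppendOnlyStoreSketch {
  private readonly records = new Map<string, string>();

  append(record: unknown): { id: string; appended: boolean } {
    const id = contentId(record);
    if (this.records.has(id)) return { id, appended: false }; // identical content: no-op
    this.records.set(id, canonicalJson(record));
    return { id, appended: true };
  }

  get size(): number {
    return this.records.size;
  }
}

const store = new AppendOnlyStoreSketch();
const first = store.append({ goal: "ship", steps: [1, 2] });
const second = store.append({ steps: [1, 2], goal: "ship" }); // same content, different key order
console.log(first.id === second.id, second.appended, store.size);
```

The ban on Date.now() and Math.random() in plan modules serves the same end: if an input to the hash changes between runs, the ID changes and the replay diverges.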
Alembic's reason for existing is the funnel: it turns a raw corpus into two value-chains — a business opportunity graph and a Learnings store. It does this in four cost tiers, cheap-first, so that most of the work costs nothing and only the strongest signals ever reach a paid model.
| Tier | What it does | Cost |
|---|---|---|
| T0 | Deterministic runT0Pipeline: walk the corpus → SHA-256 dedupe → contract-validate → 6-dim score → emit residue. Runs over 100% of the corpus. | $0 |
| T1 | runT1Extraction: one BusinessSignal per residue item via the injected LOCAL adapter (free-tier, so never budget-blocked in practice). | ~$0 |
| T2 | runT2Shortlist: a budget-gated FRONTIER shortlist refines the strongest T1 signals in batches; every paid call is metered. | metered |
| T3 | runT3Council: a synthetic 3-member council (optimist / analyst / pessimist) plus the N-lens verifier panel. | metered |
A T3 outcome only emits when both the consensus decision is GO and isPanelEmissionApproved(report) holds — the N-lens panel verified, not parked. A simple majority is not enough; the panel can veto. This is what keeps the opportunity graph honest: nothing sediments without clearing the emission gate.
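The gate is a conjunction, which a short sketch makes explicit. Only the GO decision and the name isPanelEmissionApproved come from the text above; the `PanelReport` shape, the per-lens `verified`/`parked` status, and the rule that every lens must be verified are assumptions made for this example.

```typescript
type Decision = "GO" | "NO_GO";

// Hypothetical report shape: one entry per lens in the N-lens panel.
type LensStatus = "verified" | "parked";
type PanelReport = { lenses: { name: string; status: LensStatus }[] };

// Stand-in for isPanelEmissionApproved(report). Assumed rule: at least one
// lens ran and none is parked.
function isPanelEmissionApprovedSketch(report: PanelReport): boolean {
  return report.lenses.length > 0 && report.lenses.every((lens) => lens.status === "verified");
}

// Emission requires BOTH conditions; either one alone is insufficient.
function shouldEmit(decision: Decision, report: PanelReport): boolean {
  return decision === "GO" && isPanelEmissionApprovedSketch(report);
}

const allVerified: PanelReport = {
  lenses: [
    { name: "lens-a", status: "verified" },
    { name: "lens-b", status: "verified" },
  ],
};
const oneParked: PanelReport = {
  lenses: [
    { name: "lens-a", status: "verified" },
    { name: "lens-b", status: "parked" },
  ],
};

console.log(shouldEmit("GO", allVerified));
console.log(shouldEmit("GO", oneParked));
console.log(shouldEmit("NO_GO", allVerified));
```

Under these assumptions a GO council with a single parked lens emits nothing, which is the veto the paragraph describes.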
Three safety invariants the funnel must never regress: PII before egress (a private-channel signal is redacted before the model call and re-checked before any write), budget fail-closed (every paid call is wrapped in a fail-closed BudgetGuard.check — a projected breach blocks the call and the tier degrades rather than overspending), and append-only (results flow to content-addressed, schema-validated, append-only stores; source reads stay read-only).
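Of the three, budget fail-closed is the one most easily gotten wrong, so here is a hedged sketch of the pattern. Only the name BudgetGuard.check and the block-then-degrade behavior come from the text; `BudgetGuardSketch`, its `record` method, the decision shape, and `runPaidStep` are hypothetical.

```typescript
type BudgetDecision =
  | { allowed: true; projectedTotalUsd: number }
  | { allowed: false; reason: string };

class BudgetGuardSketch {
  private spentUsd = 0;
  private readonly limitUsd: number;

  constructor(limitUsd: number) {
    this.limitUsd = limitUsd;
  }

  // Fail closed: anything that cannot be priced, or would breach the limit, is blocked.
  check(projectedUsd: number): BudgetDecision {
    if (!Number.isFinite(projectedUsd) || projectedUsd < 0) {
      return { allowed: false, reason: "projected cost is not a valid amount" };
    }
    const projectedTotalUsd = this.spentUsd + projectedUsd;
    if (projectedTotalUsd > this.limitUsd) {
      return { allowed: false, reason: "projected spend exceeds the budget" };
    }
    return { allowed: true, projectedTotalUsd };
  }

  record(actualUsd: number): void {
    this.spentUsd += actualUsd;
  }
}

// A paid step asks first; when blocked, it degrades instead of calling the model.
function runPaidStep(guard: BudgetGuardSketch, projectedUsd: number): "called" | "degraded" {
  const decision = guard.check(projectedUsd);
  if (!decision.allowed) return "degraded";
  guard.record(projectedUsd); // a real run would record the metered cost after the call
  return "called";
}

const guard = new BudgetGuardSketch(10);
console.log(runPaidStep(guard, 6));
console.log(runPaidStep(guard, 6));
console.log(runPaidStep(guard, Number.NaN));
```

The important design choice is the default: an unknown or invalid projection is treated as a breach, so a pricing bug blocks spending rather than permitting it.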
The invariant is enforced structurally in one place (runWithGuards), so the failure is converted to a typed value once and every consumer reads it uniformly. The win is the absence of error handling everywhere else, not the presence of it in the adapter.
Plan modules may not call Date.now() / Math.random(); the VM rejects them.
run() never throws; a 429 surfaces as a ModelRunFailure with error.retryable set. runWithGuards converts any escaped throw into this shape and even drives retry from the flag. And routing has no silent fallback: a missing model returns a typed error, never a substitute.
A T3 outcome emits only with both GO consensus and panel approval.