In Lab 1 you built a store. Now you wire the keystone of the fusion (Lessons 4 and 8): a closed self-improvement pass where a reviewer proposes durable writes, a gate disposes, and approved writes sediment into the store. You will assemble reviewAndLearn from three injected ports — a fake proposer, a fake gate, and the real MemoryStore from Lab 1's family — and watch a turn split into the applied / rejected / failed buckets. This is the real shipped API; you are calling it exactly as production does.
The driver reviewAndLearn(summary, deps) depends on injected ports only — no concrete adapter, no concrete store construction (ADR-0009). Its ReviewDeps are three fields (learning/review.ts:30-37):
| Port | Type | In production | In this lab |
|---|---|---|---|
| proposer | ReviewProposer | one ModelAdapter call, narrow-waist-shaped | a fake returning fixed proposals |
| gate | ReviewGate | the @alembic/coda Validator | scoreThresholdGate(0.7) (shipped default) |
| memory | MemoryStore | the durable file-backed store | a real MemoryStore over a fake FsPort |
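To make the table concrete, here is a minimal sketch of what the three ports could look like as TypeScript types. Only the field names `proposer`, `gate`, and `memory`, the proposal fields, and the `Result` return of the proposer come from this lab. The names `GateVerdict` and `MemoryLike`, the `write` method, and the async signatures are assumptions for illustration, not the shipped declarations in learning/review.ts.

```typescript
// Hypothetical stand-ins; the real types live in @alembic/contracts and @alembic/hermes.
type Result<T, E = Error> =
  | { ok: true; value: T }
  | { ok: false; error: E };

const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

interface ReviewProposal {
  target: string;                          // e.g. 'memory'
  op: { action: 'add'; content: string };  // only the op used in this lab
  rationale: string;
  score: number;                           // reviewer confidence in [0, 1]
}

// Assumed verdict shape, modeled on ok({ approved, reason }).
interface GateVerdict { approved: boolean; reason: string }

type ReviewProposer = (summary: string) => Promise<Result<readonly ReviewProposal[], Error>>;
type ReviewGate = (proposal: ReviewProposal) => Promise<Result<GateVerdict, Error>>;

// Assumed minimal store surface; `write` is a made-up method name.
interface MemoryLike {
  write(proposal: ReviewProposal): Promise<Result<void, Error>>;
  entries(target: string): readonly string[];
}

interface ReviewDeps {
  proposer: ReviewProposer;
  gate: ReviewGate;
  memory: MemoryLike;
}

// Any object with these three fields satisfies the driver's dependencies.
const demoDeps: ReviewDeps = {
  proposer: async () => ok([]),
  gate: async (p) => ok({ approved: p.score >= 0.7, reason: 'demo gate' }),
  memory: { write: async () => ok(undefined), entries: () => [] },
};
```

Because the driver sees only these shapes, swapping a fake for a production implementation is a matter of passing a different object.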
// setup
The real MemoryStore from @alembic/hermes needs an FsPort. We reuse the same Map-backed fake from Lab 1, then load() it once so its state is initialized.
```typescript
import { MemoryStore, reviewAndLearn, scoreThresholdGate } from '@alembic/hermes';
import { ok, type Result } from '@alembic/contracts';
import type { ReviewProposal, ReviewProposer } from '@alembic/hermes';

const memory = new MemoryStore(makeFakeFs(), '/agent'); // fake FsPort from Lab 1
await memory.load();                                    // initialize live state
```
In production the proposer wraps one model call and translates its ModelRunResult into a Result. In the lab it just returns fixed proposals. Each proposal is a { target, op, rationale, score } — the score ∈ [0,1] is the reviewer's own confidence, and the gate decides if it clears the floor.
```typescript
const fakeProposer: ReviewProposer = async (_summary) => {
  const proposals: ReviewProposal[] = [
    { target: 'memory', op: { action: 'add', content: 'Build runs offline by default' },
      rationale: 'observed this run', score: 0.9 }, // strong → should APPLY
    { target: 'memory', op: { action: 'add', content: 'maybe prefer tabs?' },
      rationale: 'hunch', score: 0.4 },             // weak → should REJECT
  ];
  return ok(proposals);
};
```
Why ok(...) and not just the array? The proposer port returns Result<readonly ReviewProposal[], Error> and never throws. A model failure in production becomes err(...), which fails the whole pass closed. Wrapping the array in ok says "the reviewer ran successfully and here is its output."

// the call
Now assemble the three ports and run one pass. The default gate approves score ≥ 0.7, so the 0.9 proposal applies and the 0.4 proposal is rejected.
```typescript
const result = await reviewAndLearn('finished a unit; tests green', {
  proposer: fakeProposer,
  gate: scoreThresholdGate(0.7), // the shipped conservative default
  memory,
});

if (result.ok) {
  console.log(result.value.applied.length);  // 1 (the 0.9 proposal)
  console.log(result.value.rejected.length); // 1 (the 0.4 proposal)
  console.log(result.value.failed.length);   // 0
  console.log(memory.entries('memory'));     // ['Build runs offline by default']
}
```
The store now holds exactly the approved write. The rejected proposal carries the gate's reason — "score 0.4 < threshold 0.7 (learn only from validated wins)" — so nothing is silently dropped.
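The gate behaviour described above can be sketched in a few lines. This is a hypothetical re-creation, not the shipped scoreThresholdGate: the name `thresholdGateSketch`, the async signature, and the wording of the approval reason are assumptions. Only the rejection reason is copied from the text above.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; error: Error };

interface GateVerdict { approved: boolean; reason: string }
interface Scored { score: number }

// Hypothetical stand-in for a score-threshold gate: approve at or above the floor.
function thresholdGateSketch(threshold = 0.7) {
  return async (proposal: Scored): Promise<Result<GateVerdict>> => {
    const approved = proposal.score >= threshold;
    const reason = approved
      ? `score ${proposal.score} >= threshold ${threshold}` // assumed wording
      : `score ${proposal.score} < threshold ${threshold} (learn only from validated wins)`;
    // The gate itself ran fine, so it always answers ok(...); a refusal is a verdict, not an error.
    return { ok: true, value: { approved, reason } };
  };
}
```

Note the two layers: `ok` reports that the gate ran, while `approved` carries the decision. That is what lets a refusal travel with its reason instead of being dropped.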
The failed bucket (gate yes, store no)

The subtle distinction: rejected = the gate said no; failed = the gate said yes but the store couldn't write (e.g. over its char budget). To see it, shrink the store's budget so an approved write overflows. The pass still succeeds overall; only that one write lands in failed:
```typescript
const tiny = new MemoryStore(makeFakeFs(), '/agent', { memoryCharLimit: 10 });
await tiny.load();

const r = await reviewAndLearn('turn', {
  proposer: async () => ok([{
    target: 'memory',
    op: { action: 'add', content: 'a sentence far longer than ten chars' },
    rationale: 'x',
    score: 0.95,
  }]),
  // gate APPROVES (0.95 ≥ 0.7)…
  gate: scoreThresholdGate(),
  memory: tiny,
});
// …but the store rejects the over-budget write:
// r.value.applied  = []
// r.value.rejected = []
// r.value.failed   = [{proposal, reason:'…exceed the limit…'}]
```
Collapsing failed into rejected would tell an operator "the policy declined this write" when the truth is "the policy approved it but the store is full." Those demand different fixes — relax the gate vs. consolidate memory. The shipped test "records a store-rejected approved write in failed" (review.test.ts) pins exactly this separation.
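The separation can be sketched as a self-contained loop. Everything here is a simplified assumption: `BudgetStore` and `splitIntoBuckets` are made-up names, the gate is reduced to an inline score check, and the code is synchronous. It is not the logic in review.ts, only an illustration of why the two refusals end up in different buckets.

```typescript
interface Proposal { content: string; score: number }
interface Outcome { proposal: Proposal; reason: string }
interface Buckets { applied: Proposal[]; rejected: Outcome[]; failed: Outcome[] }

// Hypothetical store with a character budget, standing in for MemoryStore.
class BudgetStore {
  private readonly items: string[] = [];
  constructor(private readonly charLimit: number) {}

  add(content: string): { ok: true } | { ok: false; reason: string } {
    const used = this.items.reduce((total, item) => total + item.length, 0);
    if (used + content.length > this.charLimit) {
      return { ok: false, reason: `write would exceed the limit of ${this.charLimit} chars` };
    }
    this.items.push(content);
    return { ok: true };
  }

  entries(): readonly string[] { return this.items; }
}

function splitIntoBuckets(
  proposals: readonly Proposal[],
  threshold: number,
  store: BudgetStore,
): Buckets {
  const out: Buckets = { applied: [], rejected: [], failed: [] };
  for (const proposal of proposals) {
    // Policy refusal: the store is never consulted.
    if (proposal.score < threshold) {
      out.rejected.push({ proposal, reason: `score ${proposal.score} < threshold ${threshold}` });
      continue;
    }
    // Policy approved; the store may still refuse for its own reasons.
    const written = store.add(proposal.content);
    if (written.ok) out.applied.push(proposal);
    else out.failed.push({ proposal, reason: written.reason });
  }
  return out;
}

const demoStore = new BudgetStore(40);
const demoBuckets = splitIntoBuckets(
  [
    { content: 'Build runs offline by default', score: 0.9 },               // fits: applied
    { content: 'maybe prefer tabs?', score: 0.4 },                          // below floor: rejected
    { content: 'a second approved note that no longer fits', score: 0.95 }, // over budget: failed
  ],
  0.7,
  demoStore,
);
// demoBuckets.applied.length === 1, rejected.length === 1, failed.length === 1
```

Each bucket records its own reason, so an operator can tell at a glance whether to revisit the gate or free up memory.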
Two things abort the whole pass with err (not a bucket): a proposer error and a gate error. Try a proposer that returns err:
```typescript
import { err } from '@alembic/contracts'; // assumed to be exported alongside ok

const bad = await reviewAndLearn('turn', {
  proposer: async () => err(new Error('model timed out')),
  gate: scoreThresholdGate(),
  memory,
});
// bad.ok === false: the whole pass failed closed; the store is untouched.
```
And an empty summary or zero proposals is a valid no-op — ok with three empty buckets, mirroring the source's "Nothing to save." (review.ts:58, 62).
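The abort and no-op paths can be sketched as guards that run before any write. This is an assumption-heavy illustration, not the code in review.ts: `reviewSketch` is a made-up name, the gate is elided so every proposal counts as approved, and treating a whitespace-only summary as empty is a guess.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; error: Error };

interface Proposal { content: string; score: number }
interface Buckets { applied: Proposal[]; rejected: Proposal[]; failed: Proposal[] }
type Proposer = (summary: string) => Promise<Result<readonly Proposal[]>>;

// Hypothetical driver skeleton: guards first, writes last.
async function reviewSketch(
  summary: string,
  proposer: Proposer,
  write: (proposal: Proposal) => void,
): Promise<Result<Buckets>> {
  const buckets: Buckets = { applied: [], rejected: [], failed: [] };

  // Nothing to review: a valid no-op with three empty buckets.
  if (summary.trim() === '') return { ok: true, value: buckets };

  const proposed = await proposer(summary);
  // Fail closed: hand the proposer's error back before any gate or write runs.
  if (!proposed.ok) return proposed;

  // Zero proposals fall through this loop and also yield three empty buckets.
  for (const proposal of proposed.value) {
    write(proposal);
    buckets.applied.push(proposal);
  }
  return { ok: true, value: buckets };
}
```

The ordering is the point: the error return sits above the loop, so a failed proposer cannot produce a partial write.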
Q1. The gate approves a write, but the MemoryStore is over its char budget and rejects the write. Where does it land?
A. In failed. rejected is a gate refusal; failed is "gate yes, store no." An over-budget approved write is recorded in failed with the store's reason, never thrown and never confused with a policy refusal (review.ts:96-100).

Q2. This lab uses scoreThresholdGate(0.7). Later the team ships the real coda Validator. How much of reviewAndLearn changes?
A. None. In the source's words, the @alembic/coda Validator Gate "wires in later by supplying its own ReviewGate", with "no change to reviewAndLearn." That is the payoff of depending on ports, not concretions.

Q3. The proposer returns err(new Error('model timed out')). What happens to the pass and the store?
A. reviewAndLearn checks if (!proposed.ok) return proposed (review.ts:61), so a proposer error short-circuits the whole pass before any gate or write. Fail-closed is the rule: no proposals means no learning, not partial learning.

Run the pass twice with the same high-score proposal (e.g. content 'Build runs offline by default', score 0.9) against the same MemoryStore instance. Then assert:

- Both passes report the proposal in applied[] (the gate approves it each time).
- memory.entries('memory') has length 1, not 2: the store's own dedup makes the second write a no-op success.

This is the "reinforce, don't duplicate" property (Lesson 8): the loop adds no dedup of its own. Approved writes flow through the store's existing dedup, mirroring the mini-loop's ON CONFLICT DO UPDATE. The real test that proves it is "reinforce, do not duplicate" in review.test.ts.
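The property can be reproduced with a toy store. This is a minimal sketch: `DedupStore` and `runPass` are hypothetical names, and the exact-match dedup rule is an assumption about how a store might behave, not a description of MemoryStore's actual dedup.

```typescript
interface Proposal { content: string; score: number }

// Hypothetical store: adding content it already holds is a successful no-op.
class DedupStore {
  private readonly items: string[] = [];

  add(content: string): boolean {
    if (this.items.indexOf(content) === -1) this.items.push(content);
    return true; // success either way
  }

  entries(): readonly string[] { return this.items; }
}

// Hypothetical pass: gate on score, then write. There is no dedup logic here.
function runPass(proposals: readonly Proposal[], store: DedupStore, threshold = 0.7): Proposal[] {
  const applied: Proposal[] = [];
  for (const proposal of proposals) {
    if (proposal.score >= threshold && store.add(proposal.content)) applied.push(proposal);
  }
  return applied;
}

const sharedStore = new DedupStore();
const win: Proposal = { content: 'Build runs offline by default', score: 0.9 };
const firstPass = runPass([win], sharedStore);
const secondPass = runPass([win], sharedStore);
// firstPass.length === 1, secondPass.length === 1, sharedStore.entries().length === 1
```

The loop reports the write as applied both times because the store reported success both times; the single entry is entirely the store's doing.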
Write a gate that rejects any proposal whose op.content contains a banned word (a crude "no secrets in memory" policy), returning ok({approved:false, reason:'contains banned term'}). Confirm the banned proposal lands in rejected[] with your reason, and a clean one still applies. You've just shown the gate is the place to enforce any emission policy; the Validator is only the richest example.
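One possible shape for such a gate is sketched below. The `Result` and `GateVerdict` types are local stand-ins, and the factory name `bannedTermGate`, the case-insensitive substring match, and the approval reason are assumptions. Check the real ReviewGate signature before adapting it.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; error: Error };

interface GateVerdict { approved: boolean; reason: string }
interface Proposal { op: { content: string }; score: number }

// Hypothetical gate factory: refuse any proposal whose content mentions a banned term.
function bannedTermGate(bannedTerms: readonly string[]) {
  return async (proposal: Proposal): Promise<Result<GateVerdict>> => {
    const text = proposal.op.content.toLowerCase();
    const hit = bannedTerms.some((term) => text.indexOf(term.toLowerCase()) !== -1);
    const verdict: GateVerdict = hit
      ? { approved: false, reason: 'contains banned term' }
      : { approved: true, reason: 'no banned term found' }; // assumed wording
    // A refusal is still a successful gate run, so it is wrapped in ok.
    return { ok: true, value: verdict };
  };
}
```

This version ignores the score entirely. Combining it with a threshold check would mean running both conditions and rejecting if either fails.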