Lesson 23 · Lab · hands-on 2 of 2

Lab: wire a Validator-gated learning pass

In Lab 1 you built a store. Now you wire the keystone of the fusion (Lessons 4 and 8): a closed self-improvement pass where a reviewer proposes durable writes, a gate disposes, and approved writes sediment into the store. You will assemble reviewAndLearn from three injected ports — a fake proposer, a fake gate, and the real MemoryStore from Lab 1's family — and watch a turn split into the applied / rejected / failed buckets. This is the real shipped API; you are calling it exactly as production does.

The whole point of this lab. Learning in Alembic is gated, not auto-apply (ADR-0018). The model proposes; a Validator disposes. By building the pass from fakes you'll see precisely why a rejected proposal differs from a failed write — and why both differ from an error that aborts the pass.

The three ports you wire together

The driver reviewAndLearn(summary, deps) depends on injected ports only — no concrete adapter, no concrete store construction (ADR-0009). Its ReviewDeps are three fields (learning/review.ts:30-37):

Port     | Type           | In production                               | In this lab
proposer | ReviewProposer | one ModelAdapter call, narrow-waist-shaped  | a fake returning fixed proposals
gate     | ReviewGate     | the @alembic/coda Validator                 | scoreThresholdGate(0.7) (shipped default)
memory   | MemoryStore    | the durable file-backed store               | a real MemoryStore over a fake FsPort

Figure: proposer (summary → proposals) → gate (score ≥ 0.7?) → memory.apply (dedup reused) → applied[]; gate "no" → rejected[]; store "no" → failed[].

Step 1 — build the store (reuse Lab 1's family)

The real MemoryStore from @alembic/hermes needs an FsPort. We reuse the same Map-backed fake from Lab 1, then load() it once so its state is initialized.

import { MemoryStore, reviewAndLearn, scoreThresholdGate } from '@alembic/hermes';
import { err, ok, type Result } from '@alembic/contracts';  // err is used in Step 5 (assumed to be exported alongside ok)
import type { ReviewProposal, ReviewProposer } from '@alembic/hermes';

const memory = new MemoryStore(makeFakeFs(), '/agent');  // fake FsPort from Lab 1
await memory.load();                                      // initialize live state

Step 2 — write a fake proposer

In production the proposer wraps one model call and translates its ModelRunResult into a Result. In the lab it just returns fixed proposals. Each proposal is a { target, op, rationale, score } — the score ∈ [0,1] is the reviewer's own confidence, and the gate decides if it clears the floor.

const fakeProposer: ReviewProposer = async (_summary) => {
  const proposals: ReviewProposal[] = [
    { target: 'memory', op: { action: 'add', content: 'Build runs offline by default' },
      rationale: 'observed this run', score: 0.9 },   // strong → should APPLY
    { target: 'memory', op: { action: 'add', content: 'maybe prefer tabs?' },
      rationale: 'hunch', score: 0.4 },                // weak → should REJECT
  ];
  return ok(proposals);
};

Why ok(...) and not just the array? The proposer port returns Result<readonly ReviewProposal[], Error> and never throws. A model failure in production becomes err(...), which fails the whole pass closed. Wrapping the array in ok says "the reviewer ran successfully and here is its output."
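
To make the never-throws contract concrete, here is a minimal sketch of a Result type and a proposer wrapper that turns a thrown exception into err. The Result, ok, and err definitions are local stand-ins assumed for illustration, not the actual @alembic/contracts exports, and callModel is a hypothetical placeholder for the production model call.

```typescript
// Local stand-ins for illustration; the real types live in @alembic/contracts.
type Result<T, E = Error> =
  | { readonly ok: true; readonly value: T }
  | { readonly ok: false; readonly error: E };

const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Assumed proposal shape, reduced to the fields this sketch touches.
interface Proposal {
  readonly content: string;
  readonly score: number; // reviewer confidence in [0, 1]
}

// Hypothetical model call that may throw (timeout, malformed output, ...).
type ModelCall = (summary: string) => Promise<readonly Proposal[]>;

// Wrap a throwing call so the port itself never throws:
// success becomes ok(proposals), any exception becomes err(error).
const makeProposer =
  (callModel: ModelCall) =>
  async (summary: string): Promise<Result<readonly Proposal[], Error>> => {
    try {
      return ok(await callModel(summary));
    } catch (e) {
      return err(e instanceof Error ? e : new Error(String(e)));
    }
  };
```

The caller then branches on result.ok instead of wrapping every call in try/catch, which is what lets the driver fail closed with a single check.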

Step 3 — run the pass and read the buckets

Now assemble the three ports and run one pass. The default gate approves score ≥ 0.7, so the 0.9 proposal applies and the 0.4 proposal is rejected.

const result = await reviewAndLearn('finished a unit; tests green', {
  proposer: fakeProposer,
  gate: scoreThresholdGate(0.7),   // the shipped conservative default
  memory,
});

if (result.ok) {
  console.log(result.value.applied.length);   // 1  (the 0.9 proposal)
  console.log(result.value.rejected.length);  // 1  (the 0.4 proposal)
  console.log(result.value.failed.length);    // 0
  console.log(memory.entries('memory'));     // ['Build runs offline by default']
}

The store now holds exactly the approved write. The rejected proposal carries the gate's reason — "score 0.4 < threshold 0.7 (learn only from validated wins)" — so nothing is silently dropped.
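
As an illustration of how a score-floor gate could produce that reason, here is a hypothetical re-implementation. It is not the shipped scoreThresholdGate: the GateDecision shape is an assumption, the approval reason text is invented for the sketch, and the Result wrapper that the real port appears to return is left out to keep the example short.

```typescript
// Assumed decision shape; the real ReviewGate may differ.
interface GateDecision {
  readonly approved: boolean;
  readonly reason: string;
}

interface ScoredProposal {
  readonly score: number; // reviewer confidence in [0, 1]
}

// Hypothetical score-floor gate, for illustration only.
const thresholdGateSketch =
  (threshold = 0.7) =>
  (p: ScoredProposal): GateDecision =>
    p.score >= threshold
      ? { approved: true, reason: `score ${p.score} >= threshold ${threshold}` }
      : {
          approved: false,
          reason: `score ${p.score} < threshold ${threshold} (learn only from validated wins)`,
        };

const gate = thresholdGateSketch();
console.log(gate({ score: 0.9 }).approved); // true
console.log(gate({ score: 0.4 }).reason);
// score 0.4 < threshold 0.7 (learn only from validated wins)
```

Returning a reason on both branches keeps every decision explainable, which is what allows the rejected bucket to carry the gate's own wording.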

Step 4 — force a failed bucket (gate yes, store no)

The subtle distinction: rejected = the gate said no; failed = the gate said yes but the store couldn't write (e.g. over its char budget). To see it, shrink the store's budget so an approved write overflows. The pass still succeeds overall — only that one write lands in failed:

const tiny = new MemoryStore(makeFakeFs(), '/agent', { memoryCharLimit: 10 });
await tiny.load();

const r = await reviewAndLearn('turn', {
  proposer: async () => ok([{ target: 'memory',
    op: { action: 'add', content: 'a sentence far longer than ten chars' },
    rationale: 'x', score: 0.95 }]),   // gate APPROVES (0.95 ≥ 0.7)…
  gate: scoreThresholdGate(),
  memory: tiny,
});
// …but the store rejects the over-budget write:
// r.value.applied  = []
// r.value.rejected = []
// r.value.failed   = [{ proposal, reason: '…exceed the limit…' }]

Why three buckets, not two

Collapsing failed into rejected would tell an operator "the policy declined this write" when the truth is "the policy approved it but the store is full." Those demand different fixes — relax the gate vs. consolidate memory. The shipped test "records a store-rejected approved write in failed" (review.test.ts) pins exactly this separation.

Step 5 — the fail-closed paths

Two things abort the whole pass with err (not a bucket): a proposer error and a gate error. Try a proposer that returns err:

const bad = await reviewAndLearn('turn', {
  proposer: async () => err(new Error('model timed out')),
  gate: scoreThresholdGate(),
  memory,
});
// bad.ok === false — the whole pass failed closed; the store is untouched.

An empty summary or zero proposals is a valid no-op: ok with three empty buckets, mirroring the source's "Nothing to save." (review.ts:58, 62).
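
The control flow of Steps 3 to 5 can be sketched in one self-contained function. This is a simplified stand-in, not the shipped reviewAndLearn: every type here is assumed, write is a hypothetical single-method view of the store, and gating all proposals before any write is a choice made by this sketch so that a gate error leaves the store untouched.

```typescript
// Local stand-ins, assumed for illustration (not the real @alembic exports).
type Result<T, E = Error> =
  | { readonly ok: true; readonly value: T }
  | { readonly ok: false; readonly error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

interface Proposal { readonly content: string; readonly score: number }
interface Decision { readonly approved: boolean; readonly reason: string }
interface Refusal { readonly proposal: Proposal; readonly reason: string }
interface Buckets {
  readonly applied: Proposal[];
  readonly rejected: Refusal[];
  readonly failed: Refusal[];
}
interface Deps {
  readonly proposer: (summary: string) => Promise<Result<readonly Proposal[]>>;
  readonly gate: (p: Proposal) => Promise<Result<Decision>>;
  // Hypothetical single-method view of the store write.
  readonly write: (p: Proposal) => Promise<Result<void>>;
}

async function reviewPassSketch(summary: string, deps: Deps): Promise<Result<Buckets>> {
  const out: Buckets = { applied: [], rejected: [], failed: [] };
  if (summary.trim() === '') return ok(out); // empty summary: valid no-op

  const proposed = await deps.proposer(summary);
  if (!proposed.ok) return proposed; // proposer error: fail closed, nothing written

  // Phase 1: gate every proposal before any write.
  const approved: Proposal[] = [];
  for (const proposal of proposed.value) {
    const decision = await deps.gate(proposal);
    if (!decision.ok) return decision; // gate error: fail closed
    if (decision.value.approved) approved.push(proposal);
    else out.rejected.push({ proposal, reason: decision.value.reason });
  }

  // Phase 2: a store refusal lands in failed and never aborts the pass.
  for (const proposal of approved) {
    const written = await deps.write(proposal);
    if (written.ok) out.applied.push(proposal);
    else out.failed.push({ proposal, reason: written.error.message });
  }
  return ok(out);
}

// Usage: a failing proposer short-circuits the whole pass.
const failing: Deps = {
  proposer: async () => err(new Error('model timed out')),
  gate: async () => ok({ approved: true, reason: 'unused' }),
  write: async () => ok(undefined),
};
reviewPassSketch('turn', failing).then((r) => console.log(r.ok)); // false
```

Only port-level errors travel on the err channel; per-proposal outcomes always travel inside the ok value, which is why a full store never looks like a failed pass.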

1. A proposal scores 0.95, the gate approves it, but the MemoryStore is over its char budget and rejects the write. Where does it land?
Correct: c. rejected is a gate refusal; failed is "gate yes, store no." An over-budget approved write is recorded in failed with the store's reason, never thrown and never confused with a policy refusal (review.ts:96-100).
2. You inject scoreThresholdGate(0.7). Later the team ships the real coda Validator. How much of reviewAndLearn changes?
Correct: b. The gate is a port. The conservative default is opt-in by injection, and "the real @alembic/coda Validator Gate wires in later by supplying its own ReviewGate — no change to reviewAndLearn." That is the payoff of depending on ports, not concretions.
3. Your fake proposer returns err(new Error('model timed out')). What happens to the pass and the store?
Correct: d. reviewAndLearn checks if (!proposed.ok) return proposed (review.ts:61) — a proposer error short-circuits the whole pass before any gate or write. Fail-closed is the rule: no proposals means no learning, not partial learning.

Your turn — extend the pass

Exercise: a "reinforce, don't duplicate" assertion

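Run the pass twice with the same high-score proposal (e.g. content 'Build runs offline by default', score 0.9) against the same MemoryStore instance. Then assert that the store lists that content exactly once: memory.entries('memory') should contain a single matching entry, not two.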

This is the "reinforce, don't duplicate" property (Lesson 8): the loop adds no dedup of its own — approved writes flow through the store's existing dedup, mirroring the mini-loop's ON CONFLICT DO UPDATE. The real test that proves it is "reinforce, do not duplicate" in review.test.ts.
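
To see the property in isolation before testing it against the real store, here is a toy upsert-style store. It is a hypothetical stand-in: the real MemoryStore's dedup rule is not shown in this lesson and may differ (for example, it may normalize text before comparing), and the hits counter is invented for the sketch to make reinforcement visible.

```typescript
// Hypothetical stand-in for a deduplicating store, for illustration only.
interface Entry {
  content: string;
  hits: number; // invented counter: how many times this content was written
}

class DedupStoreSketch {
  private readonly items: Entry[] = [];

  // Upsert: a repeat write bumps the counter instead of appending a duplicate.
  add(content: string): void {
    for (const item of this.items) {
      if (item.content === content) {
        item.hits += 1;
        return;
      }
    }
    this.items.push({ content, hits: 1 });
  }

  entries(): string[] {
    return this.items.map((item) => item.content);
  }

  hitsFor(content: string): number {
    for (const item of this.items) {
      if (item.content === content) return item.hits;
    }
    return 0;
  }
}

const store = new DedupStoreSketch();
store.add('Build runs offline by default'); // first pass
store.add('Build runs offline by default'); // second pass, same proposal

console.log(store.entries().length); // 1
console.log(store.hitsFor('Build runs offline by default')); // 2
```

Because dedup lives in the store, the driver needs no dedup logic of its own, so every caller of the store gets the same guarantee.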

Stretch goal. Write a gate that rejects any proposal whose op.content contains a banned word (a crude "no secrets in memory" policy), returning ok({approved:false, reason:'contains banned term'}). Confirm the banned proposal lands in rejected[] with your reason, and a clean one still applies. You've just shown the gate is the place to enforce any emission policy — the Validator is only the richest example.
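
One possible starting point for the stretch goal, offered as a sketch rather than the expected solution. The Decision and ContentProposal shapes are assumptions, the approval reason text is invented, and the ok(...) wrapper that the real ReviewGate returns is omitted; add it back when adapting this to the actual port.

```typescript
// Assumed shapes for illustration; check them against the real ReviewGate types.
interface Decision {
  readonly approved: boolean;
  readonly reason: string;
}
interface ContentProposal {
  readonly op: { readonly content: string };
}

// Hypothetical policy gate: reject any proposal whose content contains a banned word.
const bannedWordGateSketch =
  (banned: readonly string[]) =>
  (p: ContentProposal): Decision => {
    const text = p.op.content.toLowerCase();
    // Case-insensitive substring match; crude on purpose.
    const hit = banned.some((word) => text.indexOf(word.toLowerCase()) !== -1);
    return hit
      ? { approved: false, reason: 'contains banned term' }
      : { approved: true, reason: 'no banned term' };
  };

const noSecrets = bannedWordGateSketch(['password', 'api_key']);
console.log(noSecrets({ op: { content: 'The API_KEY is abc123' } }).reason); // contains banned term
console.log(noSecrets({ op: { content: 'Build runs offline by default' } }).approved); // true
```

A substring match will also flag harmless text that merely mentions a banned word, so treat this as a demonstration of where policy lives, not as a secrets scanner.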