Lesson 08 · Deep dive · subsystem 2 of 7

reviewAndLearn — the gated learning pass

How a finished turn teaches the next one. A reviewer proposes durable memory writes; a Validator gate disposes; approved writes sediment into the MemoryStore. The keystone constraint — drawn straight from the fusion matrix and ADR-0018 — is that learning is gated, not auto-apply. This is an ADAPT of Hermes' background-review fork into the engine's ports style: no daemon thread, just three injected ports.

The shape: propose → gate → apply

The driver is small and total. An empty summary or zero proposals short-circuits to an empty outcome (a no-op pass is valid); a proposer or gate error fails the whole pass closed:

// packages/hermes/src/learning/review.ts:54-70
export const reviewAndLearn = async (
  summary: string, deps: ReviewDeps,
): Promise<Result<LearnOutcome, Error>> => {
  if (summary.trim().length === 0) return ok(emptyOutcome());
  const proposed = await deps.proposer(summary);
  if (!proposed.ok) return proposed;                // proposer error ⇒ fail closed
  if (proposed.value.length === 0) return ok(emptyOutcome());
  const acc: OutcomeAcc = { applied: [], rejected: [], failed: [] };
  for (const raw of proposed.value) {
    const stepErr = await processOne(raw, deps, acc);
    if (stepErr) return stepErr;                    // gate error / bad shape ⇒ fail closed
  }
  return ok({ applied: acc.applied, rejected: acc.rejected, failed: acc.failed });
};
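
The guard clauses in the driver can be exercised in isolation. The sketch below is not the repo's code: Result, ok, err, Outcome, Proposer, and the guards helper are simplified, hypothetical stand-ins, and the per-proposal loop is deliberately left out. It only illustrates that a blank summary never reaches the proposer, and that a proposer error is returned unchanged.

```typescript
// Hypothetical stand-ins for illustration; not the engine's real types.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

interface Outcome { applied: string[]; rejected: string[]; failed: string[] }
const emptyOutcome = (): Outcome => ({ applied: [], rejected: [], failed: [] });
type Proposer = (summary: string) => Promise<Result<string[], Error>>;

// Only the guards of the driver. "continue" marks the point where the
// real pass would start looping over proposals.
const guards = async (
  summary: string, proposer: Proposer,
): Promise<Result<Outcome, Error> | "continue"> => {
  if (summary.trim().length === 0) return ok(emptyOutcome()); // no-op pass
  const proposed = await proposer(summary);
  if (!proposed.ok) return proposed;                          // fail closed
  if (proposed.value.length === 0) return ok(emptyOutcome()); // nothing to learn
  return "continue";
};

// A proposer that records what it was asked, and one that always errors.
const seen: string[] = [];
const recordingProposer: Proposer = async (summary) => {
  seen.push(summary);
  return ok([]);
};
const brokenProposer: Proposer = async () => err(new Error("model timeout"));
```

Calling guards("   ", recordingProposer) leaves seen empty, while guards("fixed the parser", brokenProposer) hands back the proposer's Err as the result of the pass.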

The three buckets — and why `failed` ≠ `rejected`

The outcome separates three fates. The distinction between rejected (the gate said no) and failed (the gate said yes but the store couldn't write) is load-bearing for observability — an over-budget write must never be mistaken for a policy refusal:

// packages/hermes/src/learning/review.ts:78-103 — processOne
const parsed = reviewProposalSchema.safeParse(raw);   // untrusted model output
if (!parsed.success) return err(new Error(`Invalid review proposal: …`));
const proposal = parsed.data;
const verdict = await deps.gate(proposal);
if (!verdict.ok) return verdict;                      // gate ERROR ⇒ fail the pass
if (!verdict.value.approved) {
  acc.rejected.push({ proposal, reason: verdict.value.reason }); // gate said NO
  return undefined;                                 // continue the pass
}
const written = await deps.memory.apply(proposal.target, proposal.op);
if (!written.ok) {
  acc.failed.push({ proposal, reason: written.error.message }); // store said NO
  return undefined;                                 // still continue
}
acc.applied.push(proposal);

Read the return type. processOne returns Result<never, Error> | undefined. undefined means "this proposal is handled; continue the pass." An Err means "abort the pass and fail closed." So a rejected or store-failed proposal does not abort the batch; only an infrastructure error (a malformed proposal or a gate error) does. The test "records a store-rejected approved write in failed" proves that the first write survives while the overflow lands in failed.
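
To see the three buckets fill up in one pass, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the Result helpers, the Proposal shape with a string op, the capacity-limited makeStore, and the inline gate are stand-ins, not the repo's ports. Schema parsing is omitted to keep it short.

```typescript
// Illustrative stand-ins only; they approximate the lesson's ports.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

interface Proposal { target: string; op: string; score: number }
interface Verdict { approved: boolean; reason: string }
interface Refusal { proposal: Proposal; reason: string }
interface Outcome { applied: Proposal[]; rejected: Refusal[]; failed: Refusal[] }
interface Deps {
  gate: (p: Proposal) => Promise<Result<Verdict, Error>>;
  memory: { apply: (target: string, op: string) => Promise<Result<void, Error>> };
}

// undefined = proposal handled, keep going; an Err = abort the whole pass.
const step = async (
  p: Proposal, deps: Deps, acc: Outcome,
): Promise<Result<never, Error> | undefined> => {
  const verdict = await deps.gate(p);
  if (!verdict.ok) return verdict;                    // gate machinery broke
  if (!verdict.value.approved) {
    acc.rejected.push({ proposal: p, reason: verdict.value.reason });
    return undefined;                                 // policy said no
  }
  const written = await deps.memory.apply(p.target, p.op);
  if (!written.ok) {
    acc.failed.push({ proposal: p, reason: written.error.message });
    return undefined;                                 // store said no
  }
  acc.applied.push(p);
  return undefined;
};

const runPass = async (proposals: Proposal[], deps: Deps): Promise<Result<Outcome, Error>> => {
  const acc: Outcome = { applied: [], rejected: [], failed: [] };
  for (const p of proposals) {
    const stepErr = await step(p, deps, acc);
    if (stepErr) return stepErr;
  }
  return ok(acc);
};

// Hypothetical store that refuses writes once it holds `capacity` entries.
const makeStore = (capacity: number) => {
  const entries: string[] = [];
  return {
    apply: async (target: string, op: string): Promise<Result<void, Error>> => {
      if (entries.length >= capacity) return err(new Error("over budget"));
      entries.push(`${target}:${op}`);
      return ok(undefined);
    },
  };
};

const demo = () => {
  const gate: Deps["gate"] = async (p) =>
    ok({ approved: p.score >= 0.7, reason: p.score >= 0.7 ? "validated" : "too weak" });
  return runPass(
    [
      { target: "memory", op: "add A", score: 0.9 }, // approved and written
      { target: "memory", op: "add B", score: 0.4 }, // refused by the gate
      { target: "memory", op: "add C", score: 0.8 }, // approved, store is full
    ],
    { gate, memory: makeStore(1) },
  );
};
```

With a store capacity of one, demo() resolves to an ok outcome in which A is applied, B is rejected with the gate's reason, and C is failed with the store's error message. The pass itself still succeeds.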

The default gate: learn only from validated wins

Until the real @alembic/coda Validator Gate (ADR-0006) wires in its own ReviewGate, the conservative default approves a proposal iff score ≥ 0.7 — the mechanical encoding of "learn only from validated wins" carried from the hermes-mini-loop:

// packages/hermes/src/learning/gate.ts:24-36
export const scoreThresholdGate = (
  min: number = DEFAULT_REVIEW_SCORE_THRESHOLD,   // 0.7
): ReviewGate => {
  return async (proposal) => {
    const approved = proposal.score >= min;        // boundary is INCLUSIVE
    const reason = approved
      ? `score ${proposal.score} ≥ threshold ${min}`
      : `score ${proposal.score} < threshold ${min} (learn only from validated wins)`;
    return ok({ approved, reason });               // pure & total: always ok(verdict)
  };
};

A subtlety worth internalizing — the verdict is data, not the Result

The gate returns ok({approved:false, …}) for a rejection — not err. The Result wrapper signals whether the gate functioned; verdict.approved carries the decision. A gate that errored (e.g. the Validator service is down) returns err and stops the pass. This separation is exactly why reviewAndLearn can tell "the policy declined this" apart from "the policy machinery broke." The test "rejects 0.69 and approves 0.70 at the default 0.7 floor" pins the inclusive boundary.
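
The split between "the gate worked" and "the gate approved" is easy to check with a toy caller. This is a sketch under stated assumptions: the Gate type takes a bare score rather than a full proposal, and unreachableGate, Fate, and classify are hypothetical names introduced here, not part of the package.

```typescript
// Illustrative stand-ins; the lesson's ReviewGate takes a whole proposal,
// this simplified Gate takes only the score.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

interface Verdict { approved: boolean; reason: string }
type Gate = (score: number) => Promise<Result<Verdict, Error>>;

// Same inclusive floor as the default gate: a refusal is still ok(...).
const thresholdGate = (min = 0.7): Gate => async (score) => {
  const approved = score >= min;
  return ok({ approved, reason: approved ? "at or above threshold" : "below threshold" });
};

// Hypothetical broken gate, e.g. a validator service that cannot be reached.
const unreachableGate: Gate = async () => err(new Error("validator unavailable"));

type Fate = "approved" | "rejected" | "pass aborted";

// The Result says whether the gate functioned; the verdict carries the decision.
const classify = async (gate: Gate, score: number): Promise<Fate> => {
  const result = await gate(score);
  if (!result.ok) return "pass aborted";
  return result.value.approved ? "approved" : "rejected";
};
```

Under these assumptions, classify(thresholdGate(), 0.69) resolves to "rejected", classify(thresholdGate(), 0.7) to "approved", and classify(unreachableGate, 1) to "pass aborted" regardless of the score.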

Reinforce, don't duplicate

The loop adds no dedup logic of its own. Approved writes flow through the MemoryStore's existing dedup — re-proposing an entry that's already there is a no-op success (so it's counted as applied, but the store stays at one entry). That mirrors the mini-loop's ON CONFLICT DO UPDATE intent: reinforce, don't pile up duplicates. The proposal schema (learning/types.ts:62-71) validates the model's output at the boundary, including score bounded to [0,1] — a score: 1.5 from a misbehaving model is rejected before any write.
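
A rough sketch of both ideas, with loud caveats: makeDedupStore is a hypothetical in-memory store keyed on a Set, not the real MemoryStore, and isScoreInRange is a hand-rolled stand-in for the score bound that reviewProposalSchema enforces. Neither reflects the package's actual implementation.

```typescript
// Illustrative stand-ins; not the engine's MemoryStore or schema.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });

// Hypothetical store: re-adding an existing entry is a successful no-op,
// so the caller needs no dedup logic of its own.
const makeDedupStore = () => {
  const entries = new Set<string>();
  return {
    size: (): number => entries.size,
    apply: async (target: string, content: string): Promise<Result<void, Error>> => {
      entries.add(JSON.stringify([target, content])); // Set ignores duplicates
      return ok(undefined);
    },
  };
};

// Stand-in for the schema's bound: a score must be a finite number in [0, 1].
const isScoreInRange = (score: unknown): score is number =>
  typeof score === "number" && Number.isFinite(score) && score >= 0 && score <= 1;

// Apply the same entry twice: both writes succeed, the store holds one entry.
const reinforceTwice = async (): Promise<{ successes: number; stored: number }> => {
  const store = makeDedupStore();
  let successes = 0;
  for (let i = 0; i < 2; i += 1) {
    const written = await store.apply("memory", "prefers small diffs");
    if (written.ok) successes += 1;
  }
  return { successes, stored: store.size() };
};
```

Here reinforceTwice() counts two successful writes against a single stored entry, and isScoreInRange(1.5) is false, so an out-of-range score would be turned away before any write is attempted.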

1. The gate returns ok({approved:false, reason:'too weak'}) for a proposal. Where does it land?

Correct: c. A gate refusal is ok({approved:false}) → rejected[]. failed[] is reserved for proposals the gate approved but the store couldn't write. And only a gate error (err) would abort the pass.

2. A proposal scores 0.69 against the default scoreThresholdGate(). The result?

Correct: b. approved = score >= min with min = 0.7. 0.69 is below the inclusive floor, so it's rejected with the "learn only from validated wins" reason. 0.70 exactly would approve.

3. Why does the learning loop reuse the MemoryStore's dedup instead of adding its own?

Correct: d. Routing approved writes through the store's existing dedup keeps a single dedup policy. Re-proposing an existing entry succeeds (counted applied) but doesn't grow the store — proven by the "reinforce, do not duplicate" test.

Common confusions

"It auto-applies what the model proposes." The opposite — it is Validator-gated by design (ADR-0018). The model only proposes; a gate decides. The default gate is conservative (score ≥ 0.7), and the real coda Validator can replace it by injection without touching this kernel.

"No daemon means it's a different mechanism." It's an ADAPT, not a literal port: Hermes forks a background thread; Alembic has no Python AIAgent runtime, so the same propose→gate→apply shape is reshaped into injected ports (ReviewProposer, ReviewGate, MemoryStore). The discipline is identical; the plumbing fits the engine.

Sources (all in the repo, read verbatim):
· packages/hermes/src/learning/review.ts — reviewAndLearn (54–70), processOne with the three buckets (78–103), emptyOutcome (106).
· packages/hermes/src/learning/gate.ts — scoreThresholdGate inclusive floor (24–36).
· packages/hermes/src/learning/types.ts — DEFAULT_REVIEW_SCORE_THRESHOLD = 0.7 (41), reviewProposalSchema with score∈[0,1] (62–71), LearnOutcome applied/rejected/failed (99–106), ReviewProposer/ReviewGate ports (118–130).
· packages/hermes/src/learning/review.test.ts — 14 cases incl. 0.69/0.70 boundary (181–202), failed-vs-rejected (321–341), reinforce-not-duplicate (218–246), fail-closed on proposer/gate errors (280–319).
· CLONE/ADAPT provenance: docs/alembic-hermes-fusion-matrix.md §3 (the keystone), docs/hermes-complete-map.md §1.10/§5.1, ADR-0018 (gated, not auto-apply); ADR-0006 (Validator Gate).