Lesson 10 · Deep dive · subsystem 4 of 7

ClarifyGateway — the T4 human-gate

When the agent needs a human decision before proceeding, it raises a structured pause: it poses a question (multiple-choice or open-ended) and blocks on the answer. In Alembic terms this is the T4 human-gate surface (ADR-0005). Python blocks a thread on a threading.Event; Node has no blocking thread, so the faithful equivalent is a promise, a resolver registry, and a timeout. The module is a CLONE of Hermes' clarify_tool.py and clarify_gateway.py.

The mechanism: ask registers, resolve settles

// packages/hermes/src/clarify/gateway.ts:82-110 — ask (condensed)
const parsed = clarifyQuestionSchema.safeParse(question);
if (!parsed.success) return err(new Error(`Invalid clarify question: …`)); // fails closed SYNCHRONOUSLY
const id = this.mintId();
return new Promise((resolvePromise) => {
  const timer = setTimeout(() => {
    this.entries.delete(id);                       // drop the entry…
    resolvePromise(err(new Error('clarify timed out')));  // …never hang
  }, timeoutMs);
  timer.unref?.();                                  // let the process exit while pending
  const settle = (result) => { clearTimeout(timer); this.entries.delete(id); resolvePromise(result); };
  this.entries.set(id, { id, question: parsed.data, settle, timer }); // parsed.data is the validated question
});

Two details that matter. (1) An invalid question fails closed synchronously — before any entry is registered, so a malformed question never leaves a dangling pending request (test: "returns err for a >MAX_CHOICES question and registers nothing"). (2) timer.unref() lets the Node process exit even while a clarify is pending — a hanging human prompt won't keep the runtime alive forever.
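The register-then-settle flow described above can be sketched without any dependencies. This is a minimal illustration, not the repo's code: MiniGateway, Result, ok and err are assumed names, the empty-prompt check stands in for the real schema validation, and wrapping the early failure in Promise.resolve is an assumption about how the synchronous err is surfaced.

```typescript
// Illustrative sketch only: MiniGateway, Result, ok and err are assumed names,
// not the repo's actual API.
type Result<T> = { ok: true; value: T } | { ok: false; error: Error };
const ok = <T>(value: T): Result<T> => ({ ok: true, value });
const err = <T = never>(error: Error): Result<T> => ({ ok: false, error });

class MiniGateway {
  // Resolver registry: pending id -> function that settles the awaiting promise.
  private readonly entries = new Map<string, (result: Result<string>) => void>();
  private counter = 0;

  ask(prompt: string, timeoutMs: number): Promise<Result<string>> {
    if (prompt.trim() === '') {
      // Fail closed before anything is registered (stand-in for schema validation).
      return Promise.resolve<Result<string>>(err(new Error('Invalid clarify question: empty prompt')));
    }
    this.counter += 1;
    const id = `clarify-${this.counter}`;
    return new Promise<Result<string>>((resolvePromise) => {
      const timer = setTimeout(() => {
        this.entries.delete(id); // drop the entry so nothing leaks
        resolvePromise(err(new Error('clarify timed out')));
      }, timeoutMs);
      // unref exists on Node timers only; the cast keeps this compiling under DOM typings too.
      (timer as unknown as { unref?: () => void }).unref?.();
      this.entries.set(id, (result) => {
        clearTimeout(timer);
        this.entries.delete(id);
        resolvePromise(result);
      });
    });
  }

  // Ids of questions still waiting for an answer.
  pending(): string[] {
    return Array.from(this.entries.keys());
  }

  // Settles the promise registered under id, if it is still pending.
  resolve(id: string, answer: string): Result<string> {
    const settle = this.entries.get(id);
    if (settle === undefined) return err(new Error('Unknown or already-resolved clarify id'));
    settle(ok(answer));
    return ok(id);
  }
}

const gateway = new MiniGateway();
const answer = gateway.ask('Deploy to production?', 60_000);
const firstId = gateway.pending()[0] ?? '';
gateway.resolve(firstId, 'yes');
answer.then((result) => console.log(result.ok ? result.value : result.error.message)); // logs "yes"
```

The caller only ever holds the promise; whoever delivers the human's answer only needs the id. That split is what lets a platform callback settle a question it did not create.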

The data contract: choice or open, capped at 4

A question is a discriminated union on kind. A choice question carries 1..MAX_CHOICES options; MAX_CHOICES = 4 is the data cap (the UI's 5th "Other" option is a presentation concern, not modelled here):

// packages/hermes/src/clarify/types.ts:48-61
export const clarifyQuestionSchema = z.discriminatedUnion('kind', [
  z.object({
    kind: z.literal('choice'),
    prompt: z.string().min(1, 'prompt cannot be empty'),
    choices: z.array(z.string().min(1, …))
      .min(1, 'choice question needs at least one choice')
      .max(MAX_CHOICES, `choice question allows at most ${MAX_CHOICES} choices`),
  }),
  z.object({ kind: z.literal('open'), prompt: z.string().min(1, …) }),
]);

A subtle robustness helper rides along: coerceChoices, a clone of the source's _flatten_choice. LLMs sometimes emit dict-shaped choices ([{description:'…'}]) instead of bare strings; this unwraps them by canonical label keys in priority order:

// packages/hermes/src/clarify/types.ts:104-121 — flattenChoice
const CHOICE_LABEL_KEYS = ['label', 'description', 'text', 'title'] as const; // name/value EXCLUDED
// string ⇒ trimmed; dict ⇒ first non-empty canonical key; else ⇒ '' (dropped)

name/value are deliberately excluded — they carry raw enum values/identifiers, not human labels, and a garbage label is worse than no choice at all (the dict collapses to '' and is dropped). Note coerceChoices does not cap to 4 — capping is the schema's job, so an over-long list fails closed at validation rather than being silently truncated.
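The flattening rules above can be sketched as follows. This is an assumption-laden reconstruction from the one-line summary, not the repo's implementation: the handling of non-array input and of non-string values under a canonical key are guesses, and the function bodies are written from scratch.

```typescript
const MAX_CHOICES = 4;
const CHOICE_LABEL_KEYS = ['label', 'description', 'text', 'title'] as const;

// string => trimmed; object => first non-empty canonical key; anything else => ''.
function flattenChoice(raw: unknown): string {
  if (typeof raw === 'string') return raw.trim();
  if (raw !== null && typeof raw === 'object' && !Array.isArray(raw)) {
    const record = raw as Record<string, unknown>;
    for (const key of CHOICE_LABEL_KEYS) {
      const candidate = record[key];
      if (typeof candidate === 'string' && candidate.trim() !== '') return candidate.trim();
    }
  }
  return ''; // includes objects that only carry name/value
}

// Flattens every entry and drops the empties. It deliberately does NOT cap the length.
function coerceChoices(raw: unknown): string[] {
  if (!Array.isArray(raw)) return []; // assumption: non-array input yields no choices
  return raw.map((item) => flattenChoice(item)).filter((label) => label !== '');
}

const cleaned = coerceChoices([' Ship it ', { description: 'Hold' }, { name: 'ENUM_X', value: 3 }, 42]);
console.log(cleaned); // ['Ship it', 'Hold']

// Five valid labels survive coercion untouched; rejecting them is left to the schema.
console.log(coerceChoices(['a', 'b', 'c', 'd', 'e']).length > MAX_CHOICES); // true
```

Keeping the cap out of the coercion step means the two failure modes stay distinguishable: a garbage entry disappears quietly, while an over-long list is reported as an error.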

Cross-field validation: the answer must fit the question

Zod can check a response's shape, but not whether it fits the live question. resolve enforces the cross-field invariants — kind match and index-in-range — that the schema cannot:

// packages/hermes/src/clarify/gateway.ts:143-168 — validateResponse
if (value.kind !== question.kind)
  return err(new Error(`Response kind '${value.kind}' does not match question kind '${question.kind}'.`));
if (value.kind === 'choice' && question.kind === 'choice') {
  if (value.index >= question.choices.length)
    return err(new Error(`Choice index ${value.index} out of range …`));
}

The "leave it pending" rule — a thoughtful failure mode

When resolve gets an invalid response (wrong kind, out-of-range index), it returns err but leaves the entry pending — so a corrected response can still arrive and settle the same promise. Only a valid response settles (and removes) the entry. The test proves it: a kind-mismatch resolve fails yet pending() still lists the id; a follow-up correct resolve then settles it. By contrast, an unknown or already-settled id is a plain err ("Unknown or already-resolved"), and a double-resolve fails because the first one already removed the entry.
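A stripped-down sketch of the leave-pending rule, with the promise machinery removed so the registry behaviour is easy to see. The names pendingQuestions, ClarifyAnswer and the exact error strings are illustrative assumptions; the negative-index guard is added here only because this sketch has no schema in front of it.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; error: Error };
const ok = <T>(value: T): Result<T> => ({ ok: true, value });
const err = <T = never>(error: Error): Result<T> => ({ ok: false, error });

type Question =
  | { kind: 'choice'; prompt: string; choices: string[] }
  | { kind: 'open'; prompt: string };
type ClarifyAnswer = { kind: 'choice'; index: number } | { kind: 'open'; text: string };

// Hypothetical registry: pending id -> the live question.
const pendingQuestions = new Map<string, Question>();

function validateResponse(question: Question, value: ClarifyAnswer): Result<ClarifyAnswer> {
  if (value.kind !== question.kind) {
    return err(new Error(`Response kind '${value.kind}' does not match question kind '${question.kind}'.`));
  }
  if (value.kind === 'choice' && question.kind === 'choice') {
    if (value.index < 0 || value.index >= question.choices.length) {
      return err(new Error(`Choice index ${value.index} out of range.`));
    }
  }
  return ok(value);
}

function resolve(id: string, value: ClarifyAnswer): Result<ClarifyAnswer> {
  const question = pendingQuestions.get(id);
  if (question === undefined) return err(new Error(`Unknown or already-resolved clarify id '${id}'.`));
  const checked = validateResponse(question, value);
  if (!checked.ok) return checked; // invalid: report it, but leave the entry pending
  pendingQuestions.delete(id);     // valid: settle and remove
  return checked;
}

pendingQuestions.set('q-1', { kind: 'choice', prompt: 'Which env?', choices: ['staging', 'prod'] });

console.log(resolve('q-1', { kind: 'open', text: 'prod' }).ok, pendingQuestions.has('q-1')); // false true
console.log(resolve('q-1', { kind: 'choice', index: 2 }).ok, pendingQuestions.has('q-1'));   // false true
console.log(resolve('q-1', { kind: 'choice', index: 1 }).ok, pendingQuestions.has('q-1'));   // true false
console.log(resolve('q-1', { kind: 'choice', index: 1 }).ok);                                // false
```

The only line that mutates the registry sits after the validation check, which is the whole rule: a bad answer can never consume the question.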

Determinism: monotonic ids, no Math.random()

Ids come from an injectable factory, defaulting to a monotonic counter — never Math.random()/Date.now(), which the engine's plan VM rejects and which would break replay:

// packages/hermes/src/clarify/gateway.ts:176-182
export const monotonicIdFactory = (prefix = 'clarify'): (() => ClarifyId) => {
  let n = 0;
  return () => { n += 1; return `${prefix}-${n}`; };
};
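A short usage sketch shows why the counter makes ids reproducible. The factory is repeated from the excerpt above so the block runs on its own; the ClarifyId alias is assumed to be a plain string here, which may not match the repo's definition.

```typescript
type ClarifyId = string; // assumption: the real ClarifyId may be a branded type

const monotonicIdFactory = (prefix = 'clarify'): (() => ClarifyId) => {
  let n = 0;
  return () => { n += 1; return `${prefix}-${n}`; };
};

// Each factory owns its own counter, so a fresh run replays the same sequence.
const firstRun = monotonicIdFactory('q');
const replay = monotonicIdFactory('q');

console.log(firstRun(), firstRun()); // q-1 q-2
console.log(replay(), replay());     // q-1 q-2
console.log(monotonicIdFactory()()); // clarify-1
```

Because the sequence depends only on how many ids have been minted, a test or a replayed plan can predict every id in advance.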

The default timeout is DEFAULT_CLARIFY_TIMEOUT_MS = 600_000 (10 minutes), mirroring the source's 600s. Tests drive it with vitest fake timers, advancing past the deadline to prove the entry is dropped and the promise resolves to err — no leak, no hang.

1. resolve is called with an open-text answer for a choice question. What happens?

Answer: validateResponse rejects the kind mismatch with err, and the entry stays pending so a corrected response can still settle it. Only a valid response settles the entry. The test confirms pending() still lists the id afterward.

2. A clarify question arrives with 5 choices. When is it rejected?

Answer: at validation inside ask. Capping is the schema's job, not coerceChoices'. ask validates the question first and returns err synchronously, registering nothing, so it fails closed rather than silently truncating untrusted input.

3. Why does ClarifyGateway mint ids via an injected monotonicIdFactory instead of a random id?

Answer: determinism. It is the same discipline as the curator's injected Clock: replace a non-deterministic global with an injected seam. Tests inject monotonicIdFactory('q') so they can assert on q-1, q-2.

Common confusions

"It uses real threads to block." No. Node runs JavaScript on a single event-loop thread, so a blocking wait would stall every other task. The "block" is a returned Promise the caller awaits; a resolver registry (Map<id, pending>) lets a platform callback settle it by id, and a setTimeout guarantees it never hangs. This is the faithful equivalent of Python's threading.Event in an async runtime.

"An invalid answer cancels the question." No — an invalid answer is rejected with err while the question stays pending for a corrected response. Only a valid answer (or the timeout) removes the entry.


Sources (all in the repo, read verbatim):
· packages/hermes/src/clarify/gateway.ts — ask sync-fail + timer.unref + resolver registry (82–110), resolve leave-pending rule (119–129), validateResponse cross-field (143–168), monotonicIdFactory (176–182), DEFAULT_CLARIFY_TIMEOUT_MS = 600_000 (43).
· packages/hermes/src/clarify/types.ts — MAX_CHOICES = 4 (39), clarifyQuestionSchema/clarifyResponseSchema discriminated unions (48–78), PendingClarify (92–99), coerceChoices/flattenChoice + excluded name/value (104–131).
· packages/hermes/src/clarify/clarify.test.ts — 19 cases incl. sync-fail-registers-nothing (113–121), kind-mismatch-leaves-pending (124–135), out-of-range (137–144), double-resolve (161–169), timeout drops entry (177–190), default-timeout boundary (205–218).
· CLONE provenance: docs/hermes-complete-map.md §3.7; ADR-0005 (the human gate at ship / irreversible spend).
