Lesson 07 · Deep dive · subsystem 1 of 7

The frozen-snapshot MemoryStore

A bounded, file-backed memory the agent carries across sessions — two stores (MEMORY.md for the agent's own notes, USER.md for what it knows about you), edited by a single memory operation that locates entries by a short unique substring. Its defining trick is the frozen snapshot: what the system prompt sees is captured once at load and never moves, even as mid-session writes hit disk. This is a faithful CLONE of Hermes' tools/memory_tool.py (1089 LOC).

The public API

The whole subsystem is one class and a few constants, all named exports from packages/hermes/src/index.ts:

Export                      Role
MemoryStore                 the class: load() / renderSnapshot() / entries() / apply()
ENTRY_DELIMITER             '\n§\n', the section-sign delimiter between entries
DEFAULT_MEMORY_CHAR_LIMIT   2200, the char cap for MEMORY.md
DEFAULT_USER_CHAR_LIMIT     1375, the char cap for USER.md
MemoryOp                    the discriminated op union: add / replace / remove
MemoryOpOutcome             terminal payload: target, message, entryCount, usedChars, charLimit

The operation itself is a compile-time discriminated union — there is no string-typed "params" bag:

// packages/hermes/src/memory/memory-store.ts:85-88
export type MemoryOp =
  | { readonly action: 'add'; readonly content: string }
  | { readonly action: 'replace'; readonly oldText: string; readonly content: string }
  | { readonly action: 'remove'; readonly oldText: string };

Only target and action, the two values that cross an untyped boundary in the Python source, are Zod-validated at runtime (memory/schema.ts: z.enum(['memory','user']) and z.enum(['add','replace','remove'])). The op payload is a compile-time union, so an illegal shape is rejected by the type checker instead of surfacing at runtime.
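To make the compile-time guarantee concrete, here is a minimal sketch of exhaustive narrowing over the union. The MemoryOp type is copied from the listing above; the describeOp helper is hypothetical and is not claimed to exist in the package.

```typescript
// MemoryOp is copied from the lesson's listing; describeOp is a hypothetical helper.
type MemoryOp =
  | { readonly action: 'add'; readonly content: string }
  | { readonly action: 'replace'; readonly oldText: string; readonly content: string }
  | { readonly action: 'remove'; readonly oldText: string };

function describeOp(op: MemoryOp): string {
  switch (op.action) {
    case 'add':
      // Narrowed to the 'add' variant: reading op.oldText here would not compile.
      return `add "${op.content}"`;
    case 'replace':
      return `replace entry containing "${op.oldText}" with "${op.content}"`;
    case 'remove':
      return `remove entry containing "${op.oldText}"`;
    default: {
      // Exhaustiveness check: a fourth action would make this assignment a type error.
      const unreachable: never = op;
      return unreachable;
    }
  }
}

// const bad: MemoryOp = { action: 'remove', content: 'x' }; // rejected by the type checker
```

The commented-out last line is the point of the design: a 'remove' op that carries content and no oldText never reaches the store, because it does not type-check.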

The defining invariant: snapshot vs live state

The store holds two parallel realities. load() reads disk, dedupes, and freezes a snapshot; apply() mutates live entries and persists them — but never touches the snapshot:

[Diagram: session timeline. At session start, load() reads, dedupes, and freezes the snapshot (renderSnapshot); the snapshot is embedded in the system prompt and never mutated mid-session, so it stays stable for the whole session and the prompt-prefix cache holds. apply() #1 and apply() #2 each write to disk and are durable immediately. Live entries advance with each apply(); the frozen snapshot does not. The two reconcile only on the next load().]
// packages/hermes/src/memory/memory-store.ts:119-147 (condensed)
async load(): Promise<Result<void, Error>> {
  // … read MEMORY.md + USER.md via FsPort …
  this.memoryEntries = dedupe(memory.value);   // Python dict.fromkeys parity
  this.userEntries   = dedupe(user.value);
  this.snapshot = {                            // frozen ONCE, here
    memory: this.renderBlock('memory', this.memoryEntries),
    user:   this.renderBlock('user',   this.userEntries),
  };
  return ok(undefined);
}
renderSnapshot(target: MemoryTarget): string | undefined {
  const block = this.snapshot[target];   // load-time state, not live
  return block.length > 0 ? block : undefined;
}
Why freeze? Frontier models cache the prompt prefix. If the memory block embedded in the system prompt changed every time the agent wrote a note, the prefix would change and the cache would be invalidated for the rest of the session — slower and costlier. So writes are durable immediately (disk) but the in-prompt snapshot stays put until the next session's load(). The test "snapshot reflects load-time disk state, not mid-session writes" proves it: after an apply(), renderSnapshot() is byte-identical to before, yet disk already contains the new entry.
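The split between frozen and live state can be sketched with a small in-memory stand-in. Everything below is an illustrative assumption: SnapshotSketch, its add() and persisted() methods, and the array that plays the role of the disk are hypothetical, and the snapshot is simplified to the joined entry text instead of whatever renderBlock really produces.

```typescript
// Hypothetical in-memory stand-in for MemoryStore; `disk` is a plain array, not a file.
class SnapshotSketch {
  private disk: string[];
  private live: string[] = [];
  private snapshot = '';

  constructor(disk: string[]) {
    this.disk = disk;
  }

  load(): void {
    // A Set keeps first-seen order, mirroring Python's dict.fromkeys dedupe.
    this.live = Array.from(new Set(this.disk));
    this.snapshot = this.live.join('\n§\n'); // frozen once, here
  }

  add(content: string): void {
    if (this.live.indexOf(content) === -1) this.live.push(content);
    this.disk = this.live.slice(); // durable immediately
    // this.snapshot is deliberately left untouched.
  }

  renderSnapshot(): string | undefined {
    return this.snapshot.length > 0 ? this.snapshot : undefined;
  }

  entries(): readonly string[] {
    return this.live;
  }

  persisted(): readonly string[] {
    return this.disk;
  }
}

const store = new SnapshotSketch(['likes tea', 'likes tea']);
store.load();
const before = store.renderSnapshot();
store.add('prefers dark mode');
const after = store.renderSnapshot();
```

In this sketch, before and after are the same string, while entries() and persisted() already include the new note. A fresh load() would be the only way to bring the snapshot up to date.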

Locate by unique substring — no IDs

replace and remove don't take an index or an id. They take a short substring and the store finds the one entry that contains it. The matcher fails closed on ambiguity:

// packages/hermes/src/memory/memory-store.ts:369-392 — locateUnique
const matches = entries.flatMap((entry, index) =>
  entry.includes(needle) ? [{ index, entry }] : [],
);
if (matches.length === 0) return err(new Error(`No entry matched '${needle}'.`));
if (matches.length > 1) {
  const distinct = new Set(matches.map((m) => m.entry));
  if (distinct.size > 1) {
    return err(new Error(`Multiple entries matched '${needle}'. Be more specific.`));
  }
  // All identical — safe to operate on the first.
}

The subtlety: multiple matches are only an error when they are distinct entries. If every match is the exact same text (true duplicates), acting on the first is safe — the source's fidelity rule. Duplicates can't arrive via add() (it dedupes), so the test seeds them on disk and loads to exercise that branch.
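The three outcomes (no match, ambiguous match, safe match) can be exercised with a standalone version of the matcher. This is a sketch modelled on the excerpt above, not the package's code: the Located result type is hypothetical and replaces the ok/err helpers, whose definitions the lesson does not show.

```typescript
// Hypothetical result type standing in for the package's ok()/err() helpers.
type Located =
  | { readonly ok: true; readonly index: number; readonly entry: string }
  | { readonly ok: false; readonly error: string };

function locateUnique(entries: readonly string[], needle: string): Located {
  const matches: { index: number; entry: string }[] = [];
  entries.forEach((entry, index) => {
    if (entry.indexOf(needle) !== -1) matches.push({ index, entry });
  });

  const first = matches[0];
  if (first === undefined) {
    return { ok: false, error: `No entry matched '${needle}'.` };
  }

  // Several matches are only a problem when their text differs.
  const distinct = new Set(matches.map((m) => m.entry));
  if (distinct.size > 1) {
    return { ok: false, error: `Multiple entries matched '${needle}'. Be more specific.` };
  }

  // One match, or several identical ones: act on the first.
  return { ok: true, index: first.index, entry: first.entry };
}

const ambiguous = locateUnique(['task: ship docs', 'task: fix bug'], 'task:');
const duplicate = locateUnique(['task: ship docs', 'task: ship docs'], 'task:');
const single = locateUnique(['task: ship docs', 'note: tea'], 'note');
const missing = locateUnique(['task: ship docs'], 'zzz');
```

Here ambiguous and missing are failures, while duplicate and single succeed (at index 0 and index 1 respectively).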

The char budget — and a deliberate asymmetry

Limits are counted in characters, not tokens, because char counts are model-independent. Crucially, the ENTRY_DELIMITER counts against the budget — the store measures the joined length, exactly what lands on disk:

// packages/hermes/src/memory/memory-store.ts:344-346
const joinedLength = (entries: readonly string[]): number =>
  entries.length === 0 ? 0 : entries.join(ENTRY_DELIMITER).length;

The test "counts the delimiter against the budget" pins it: 'aaa' + '\n§\n' (3 chars) + 'bbb' = 9 chars, so a limit of 9 passes and 8 fails. When an add would overflow, the error isn't a bare failure — it tells the agent to consolidate ("use 'replace' to merge … or 'remove' stale entries, then retry"), turning a hard cap into a curation prompt.
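The arithmetic in that test can be reproduced in a few lines. joinedLength is taken from the excerpt above; fitsBudget is a hypothetical helper added for illustration, and it assumes the limit is inclusive, which is what "9 passes and 8 fails" implies.

```typescript
const ENTRY_DELIMITER = '\n§\n'; // 3 characters: newline, section sign, newline

const joinedLength = (entries: readonly string[]): number =>
  entries.length === 0 ? 0 : entries.join(ENTRY_DELIMITER).length;

// Hypothetical helper: would appending `content` keep the store within `charLimit`?
function fitsBudget(entries: readonly string[], content: string, charLimit: number): boolean {
  return joinedLength(entries.concat(content)) <= charLimit;
}

const total = joinedLength(['aaa', 'bbb']);       // 3 + 3 + 3 = 9
const withNine = fitsBudget(['aaa'], 'bbb', 9);   // true
const withEight = fitsBudget(['aaa'], 'bbb', 8);  // false
```

A single entry pays no delimiter cost; each additional entry costs its own length plus 3.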

The read/write asymmetry

A corrupt or missing file on read returns ok([]) (an empty store) — a telemetry-like hot path must never break the host. But a failed write returns err — the caller explicitly chose to persist, so a silent loss would be a lie. The same asymmetry recurs in the curator's usage store; it is a deliberate, repeated design stance, not an accident.
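A minimal sketch of that stance follows, under loudly stated assumptions: the lesson names FsPort but does not show its shape, so the synchronous two-method interface, the Result type, and the readEntries / writeEntries helpers are all hypothetical. The sketch also treats any thrown read error as "corrupt or missing".

```typescript
// All names below are hypothetical; the real FsPort and Result types are not shown in the lesson.
type Result<T> =
  | { readonly ok: true; readonly value: T }
  | { readonly ok: false; readonly error: Error };

interface FsPort {
  read(path: string): string;              // may throw
  write(path: string, data: string): void; // may throw
}

const ENTRY_DELIMITER = '\n§\n';

function readEntries(fs: FsPort, path: string): Result<string[]> {
  try {
    const raw = fs.read(path);
    return { ok: true, value: raw.length === 0 ? [] : raw.split(ENTRY_DELIMITER) };
  } catch {
    // Missing or unreadable file: degrade to an empty store instead of failing the host.
    return { ok: true, value: [] };
  }
}

function writeEntries(fs: FsPort, path: string, entries: readonly string[]): Result<void> {
  try {
    fs.write(path, entries.join(ENTRY_DELIMITER));
    return { ok: true, value: undefined };
  } catch (cause) {
    // The caller asked to persist, so the failure is surfaced.
    return { ok: false, error: cause instanceof Error ? cause : new Error(String(cause)) };
  }
}

// A port where every call fails, to exercise both paths.
const brokenFs: FsPort = {
  read: () => { throw new Error('ENOENT'); },
  write: () => { throw new Error('EROFS'); },
};
const readResult = readEntries(brokenFs, 'MEMORY.md');
const writeResult = writeEntries(brokenFs, 'MEMORY.md', ['note']);
```

With the same failing port, the read comes back as an ok result holding an empty list, and the write comes back as an error the caller has to handle.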

1. An agent calls apply('memory', {action:'add', …}) mid-session. What does renderSnapshot('memory') return immediately after?
Answer: the same block it returned before the write. The snapshot is captured once in load() and never mutated mid-session, which preserves the prompt-prefix cache. The write is durable on disk and visible via entries(), but the in-prompt snapshot only refreshes on the next session's load().
2. replace is called with oldText: 'task:' and two distinct entries contain 'task:'. What happens?
Answer: an error, with nothing changed. locateUnique returns err when the matches are distinct, and the store leaves every entry untouched. (If the matches were exact duplicates of one another, acting on the first would be safe under the source's fidelity rule, but distinct matches always fail closed.)
3. Why are the limits measured in characters rather than tokens, and why include the delimiter?
Answer: tokenization differs per model, while character counts are stable. And since entries are persisted joined by '\n§\n', the budget includes those 3 chars per gap; the test shows that 9 passes and 8 fails for two 3-char entries.

Common confusions

"Frozen snapshot means writes are lost." No — writes hit disk immediately and are reflected in entries(). Only the in-prompt snapshot is frozen, and only until the next load(). Durability and prefix-cache stability are independent concerns the design keeps separate.
"§ in my note will corrupt the file." No — the store splits on the full '\n§\n' sequence, not a bare §. An entry whose body contains a lone § round-trips as one entry; there's a dedicated test for exactly that.
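The round-trip claim can be checked with a short sketch. The serialize and parse helpers are hypothetical names for "join on the delimiter" and "split on the delimiter"; the real store's parsing may do more, such as trimming or dedupe.

```typescript
const ENTRY_DELIMITER = '\n§\n';

// Hypothetical helpers: join on the full delimiter, split on the full delimiter.
const serialize = (entries: readonly string[]): string => entries.join(ENTRY_DELIMITER);
const parse = (raw: string): string[] =>
  raw.length === 0 ? [] : raw.split(ENTRY_DELIMITER);

// Both entries contain a bare section sign, but never the newline-wrapped sequence.
const original = ['see §4.2 of the spec', 'price: 3 § 4'];
const roundTripped = parse(serialize(original));
```

Because split() matches only the whole three-character sequence, roundTripped holds the same two entries as original. In this sketch an entry whose body contained the full newline, section sign, newline sequence would be split in two; the lesson does not say how the real store treats that case.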