Lesson 26 · Advanced · security posture

Provenance & security: fail-closed by default

A system that ingests private chat logs, scrapes the web, and writes files to paths chosen by an agent must be paranoid by construction. ADR-0011 sets four standing constraints: fail-closed everywhere, PII redaction before a byte leaves the machine, isolation of a leaked prompt corpus, and a clean-room rule. They are not slogans; they show up as real guards in the code you've already met. This lesson connects the policy to the implementation: the Zod boundary, the realpath/path-traversal defense in the SkillStore, the fail-closed DEFAULT_TIER = T4, and PII redaction before the model call.

The one principle. "The unknown case denies, never allows" (ADR-0011 §1). Fail-closed is the default posture for everything security-relevant — and it's the same idea as DEFAULT_TIER = T4 (unclassified work parks) and the never-throws contract (an unhandled failure becomes a typed denial, not a silent pass).

The four standing constraints

#	Constraint (ADR-0011)	Where it lives in code
1	Fail-closed everywhere security-relevant — `realpath` guards, HMAC webhooks, constant-time compares, Zod at every boundary	SkillStore path-safety; `DEFAULT_TIER = T4`; every subsystem's `safeParse`
2	PII redaction before egress — before the model call, not merely before emit	The funnel redacts private-channel signals pre-call (map §3)
3	CL4R1T4S isolation — the leaked vendor-prompt corpus is data to analyze, never a command to follow	Excluded from ingestion-as-instruction; treated as inert data
4	tac clean-room — patterns reimplemented from scratch, zero verbatim code/prompts, source never published	The whole fusion is from-scratch TS, not copied source

Constraint 1 in code: the path-traversal guard

You saw the SkillStore in Lesson 12. Its security spine is validateSupportPath — a pure function that refuses any relative path that could escape the skill's directory. It mirrors Hermes' has_traversal_component + _resolve_skill_target and is the textbook fail-closed validator: it lists what it allows and denies everything else.

// packages/hermes/src/skills/skill-store.ts:404-433 (condensed)
const validateSupportPath = (relPath: string): Result<string, Error> => {
  if (relPath.length === 0) return err(new Error('file path is required.'));
  if (relPath.includes('\\')) return err(…'use forward slashes.');   // no backslash tricks
  if (relPath.startsWith('/')) return err(…'must be relative.');     // no absolute paths

  const segments = relPath.split('/').filter((s) => s.length > 0);
  for (const segment of segments) {
    if (segment === '..' || segment === '.')                         // no traversal segments
      return err(…'path traversal is not allowed.');
  }
  const first = segments[0];
  if (first === undefined || !isSupportDir(first))                  // must be an ALLOWED subdir
    return err(…'first segment must be one of …');
  return ok(normalized);   // only now: a vetted, confined relative path (normalized is built in lines this condensed listing omits)
};

Notice the shape: every branch is a denial except the final ok. An attacker-controlled ../../etc/passwd is rejected at the .. check; a sneaky references/../../secret is rejected too. The function is pure and never throws, so it composes into the Result world cleanly — security failures surface as typed, fail-closed errors (ADR-0011 §1, "not silent passes").
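
To try the guard in isolation, here is a minimal self-contained sketch of the same shape. The Result helpers, the SUPPORT_DIRS allow-list, the function name validateSupportPathSketch, and the full error messages are assumptions for illustration; the real allow-list and wording live in skill-store.ts. The returned path is simply the vetted segments rejoined, which may differ from how the real code normalizes.

```typescript
// Minimal Result type (hypothetical stand-in for the project's own).
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// ASSUMPTION: illustrative allow-list; the real one is defined in skill-store.ts.
const SUPPORT_DIRS: readonly string[] = ['references', 'scripts', 'assets'];
const isSupportDir = (name: string): boolean => SUPPORT_DIRS.includes(name);

const validateSupportPathSketch = (relPath: string): Result<string, Error> => {
  if (relPath.length === 0) return err(new Error('file path is required.'));
  if (relPath.includes('\\')) return err(new Error('use forward slashes.'));
  if (relPath.startsWith('/')) return err(new Error('must be relative.'));

  // Drop empty segments so "a//b" and "a/b/" collapse to "a/b".
  const segments = relPath.split('/').filter((s) => s.length > 0);
  for (const segment of segments) {
    if (segment === '..' || segment === '.') {
      return err(new Error('path traversal is not allowed.'));
    }
  }

  const first = segments[0];
  if (first === undefined || !isSupportDir(first)) {
    return err(new Error('first segment must be an allowed support dir.'));
  }
  // Only reachable when every check above has passed.
  return ok(segments.join('/'));
};
```

With this sketch, references/guide.md comes back as ok, while ../../etc/passwd, references/../../secret, /etc/passwd, and secrets/x all come back as err. Note that a lexical check like this one does not resolve symlinks; that is the job of a separate realpath guard.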

Constraint 1, again: fail-closed by default tier

The deepest expression of fail-closed isn't a guard you call — it's the default. DEFAULT_TIER = T4 means any work that isn't explicitly classified as autonomous is parked, waiting for a human (Lesson 24, ADR-0005). The ADR draws the line itself: "this is also why DEFAULT_TIER = T4" — the unknown case denies. Unclassified autonomy is impossible by construction, not by remembering to check.
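
The default-denies idea can be sketched in a few lines. This is an illustration under stated assumptions, not the contents of tier.ts: the text only confirms that T4 is the parked, human-gated default, so the other tier names, the CLASSIFIED table, and the tierFor and isParked helpers are hypothetical.

```typescript
// ASSUMPTION: tier names other than T4 are illustrative.
type Tier = 'T1' | 'T2' | 'T3' | 'T4';

// Matches the lesson: unclassified work falls to T4 and parks.
const DEFAULT_TIER: Tier = 'T4';

// Hypothetical explicit classifications; anything absent is unknown.
const CLASSIFIED: { [kind: string]: Tier | undefined } = {
  'format-markdown': 'T1',
  'summarize-thread': 'T2',
};

// Look up a tier; the fallback is the restrictive tier, never a permissive one.
const tierFor = (kind: string): Tier => {
  // Own-property check so inherited keys such as "toString" stay unclassified.
  const known = Object.prototype.hasOwnProperty.call(CLASSIFIED, kind)
    ? CLASSIFIED[kind]
    : undefined;
  return known ?? DEFAULT_TIER;
};

// Work is parked for a human whenever it resolves to T4.
const isParked = (kind: string): boolean => tierFor(kind) === 'T4';
```

The design point is that no code path has to remember to deny: a work kind nobody classified, a typo in a kind name, or a key that only exists on the object prototype all resolve to the same parked tier.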

Constraint 2: PII before egress, not before emit

The subtle word is egress. It would be easy to redact PII just before showing a result to a user. ADR-0011 demands more: a Signal derived from a private channel (WhatsApp, Discord, Skool, Circle) is "PII-redacted before it leaves the local machine — before the model call, not merely before emit." The threat model assumes the model endpoint itself is outside the trust boundary, so raw private data must never be in a request payload at all.
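
A minimal sketch of what "before the model call" means in practice, assuming a simplified Signal shape, a hypothetical PUBLIC_CHANNELS allow-list, and two illustrative regex patterns. Real PII detection needs far broader coverage (names, handles, addresses), and none of these names come from the funnel's actual code.

```typescript
// ASSUMPTION: a simplified Signal shape; the real type is not shown in this lesson.
interface Signal {
  channel: string;
  text: string;
}

// Hypothetical allow-list: only channels listed here skip redaction.
const PUBLIC_CHANNELS: readonly string[] = ['web'];

// Illustrative patterns only.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE = /\+?\d[\d\s().-]{7,}\d/g;

const redact = (text: string): string =>
  text.replace(EMAIL, '[EMAIL]').replace(PHONE, '[PHONE]');

// Build the outbound payload. Redaction happens here, at the egress point,
// so raw private text never appears in anything that is serialized and sent.
const buildModelRequest = (signal: Signal): { prompt: string } => {
  const isPublic = PUBLIC_CHANNELS.includes(signal.channel);
  // Fail-closed: an unknown channel is treated as private and redacted.
  return { prompt: isPublic ? signal.text : redact(signal.text) };
};

const request = buildModelRequest({
  channel: 'whatsapp',
  text: 'Ping ana@example.com or +55 11 91234-5678 today',
});
// request.prompt is 'Ping [EMAIL] or [PHONE] today'
```

Because the redacted string is the only thing placed in the payload, there is no later step that could forget to redact; redacting at emit time would leave the raw text in the request that already went over the wire.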

Constraints 3 & 4: don't follow data, don't copy source

The last two are about discipline, not runtime checks.

CL4R1T4S isolation: a leaked corpus of vendor prompts (and its injection-payload README) "is isolated and never ingested as instruction; it is data to analyze, never a command to follow." This is prompt-injection defense at the ingestion layer: the corpus is inert text, never executed.

tac clean-room: tac is an educational-license blueprint, so "its patterns are reimplemented from scratch, with zero verbatim code or prompts, and its source is never published." The entire @alembic/hermes fusion is a from-scratch TypeScript reimplementation precisely because of this rule, which is also why the lessons cite Alembic's own source, never Hermes' Python.

Provenance ties it together

The CLAUDE.md orchestration rule "SEMPRE cite a fonte" ("ALWAYS cite the source") and the content-addressed stores (SHA-256 over canonical JSON, Lesson 28) mean every ingested fact carries a source, a date, and a hash. Provenance isn't a separate feature; it's what lets the system know whether a piece of data is trusted (a vetted Learning) or suspect (a CL4R1T4S payload). Fail-closed and provenance are the same posture from two angles: deny the unknown, and always know where a thing came from.
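
Content addressing can be sketched as follows. This is a hedged illustration, not the store's real code: "canonical JSON" is taken here to mean object keys sorted recursively with no insignificant whitespace, which is one common convention and may differ from the project's exact rule. The canonicalize, contentHash, and withProvenance names are hypothetical, and the hash uses Node's built-in node:crypto module.

```typescript
import { createHash } from 'node:crypto';

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// ASSUMPTION: canonical form = keys sorted recursively, no extra whitespace.
const canonicalize = (value: Json): string => {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) {
    return '[' + value.map((item) => canonicalize(item)).join(',') + ']';
  }
  const keys = Object.keys(value).sort();
  const body = keys.map((k) => JSON.stringify(k) + ':' + canonicalize(value[k]));
  return '{' + body.join(',') + '}';
};

// SHA-256 over the canonical text, hex-encoded (64 characters).
const contentHash = (value: Json): string =>
  createHash('sha256').update(canonicalize(value), 'utf8').digest('hex');

// Hypothetical record: a fact plus where and when it was ingested.
const withProvenance = (fact: Json, source: string, date: string) => ({
  fact,
  source,
  date,
  hash: contentHash({ fact, source, date }),
});
```

Sorting keys before hashing is what makes the address stable: two objects with the same content but different key order produce the same hash, while any change to the fact, its source, or its date produces a different one.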

1. ADR-0011 requires PII redaction "before egress … before the model call, not merely before emit." Why the emphasis on before the model call?

Correct: b. Redacting only before emit would still send raw PII over the wire to the model. The ADR pushes redaction to the egress point — the moment data would leave the local machine — so the endpoint never receives unredacted private-channel content.

2. validateSupportPath rejects .., absolute paths, and backslashes, allowing only paths under an approved subdir. What design pattern is this?

Correct: d. The function enumerates what's permitted (a relative path under an allowed support dir, no traversal segments) and denies everything else, returning typed err values. That's allow-listing / fail-closed — the same posture as DEFAULT_TIER = T4.

3. Why is the CL4R1T4S corpus "never ingested as instruction"?

Correct: c. ADR-0011 §3 isolates it as inert data. If the system executed it as instruction, the injection payload could hijack the agent. The rule — "data to analyze, never a command to follow" — is the defense.

Common confusions

"Fail-closed means the system crashes on bad input." The opposite — it returns a typed, fail-closed error (ADR-0011 §1, "not silent passes") and never throws (ADR-0009). The work is denied or parked, cleanly; nothing crashes and nothing silently succeeds.

"The clean-room rule is just legal caution." It's also why this course is trustworthy: every lesson cites Alembic's own source because a verbatim copy of Hermes would be a violation. Reimplementing from scratch is both the legal posture and the reason the code is genuinely understood, not pasted.


Sources (read verbatim):
· docs/adr/0011-security-and-provenance.md — the four constraints (fail-closed §1, PII-before-egress §2, CL4R1T4S §3, tac clean-room §4), "unknown case denies," "why DEFAULT_TIER = T4," typed fail-closed errors not silent passes.
· packages/hermes/src/skills/skill-store.ts — validateSupportPath (404–433): rejects empty/absolute/backslash/../. and non-allowed first segment; has_traversal_component provenance note (21–23, 397–402).
· docs/alembic-complete-map.md §3 — PII redaction before egress in the funnel; tier.ts DEFAULT_TIER = T4 fail-closed (§7.1).
· ADR-0009 (never-throws, the substrate for typed denials).