Course / Lesson 26  ·  PT-BR
Lesson 26 · Advanced · security posture

Provenance & security: fail-closed by default

A system that ingests private chat logs, scrapes the web, and writes files an agent named must be paranoid by construction. ADR-0011 sets four standing constraints — fail-closed everywhere, PII-redaction before a byte leaves the machine, isolation of a leaked prompt corpus, and a clean-room rule — and they are not slogans: they show up as real guards in the code you've already met. This lesson connects the policy to the implementation: the Zod boundary, the realpath/path-traversal defense in the SkillStore, the fail-closed DEFAULT_TIER = T4, and PII redaction before the model call.

The one principle. "The unknown case denies, never allows" (ADR-0011 §1). Fail-closed is the default posture for everything security-relevant — and it's the same idea as DEFAULT_TIER = T4 (unclassified work parks) and the never-throws contract (an unhandled failure becomes a typed denial, not a silent pass).

The four standing constraints

#Constraint (ADR-0011)Where it lives in code
1Fail-closed everywhere security-relevant — realpath guards, HMAC webhooks, constant-time compares, Zod at every boundarySkillStore path-safety; DEFAULT_TIER = T4; every subsystem's safeParse
2PII redaction before egress — before the model call, not merely before emitThe funnel redacts private-channel signals pre-call (map §3)
3CL4R1T4S isolation — the leaked vendor-prompt corpus is data to analyze, never a command to followExcluded from ingestion-as-instruction; treated as inert data
4tac clean-room — patterns reimplemented from scratch, zero verbatim code/prompts, source never publishedThe whole fusion is from-scratch TS, not copied source

Constraint 1 in code: the path-traversal guard

You saw the SkillStore in Lesson 12. Its security spine is validateSupportPath — a pure function that refuses any relative path that could escape the skill's directory. It mirrors Hermes' has_traversal_component + _resolve_skill_target and is the textbook fail-closed validator: it lists what it allows and denies everything else.

// packages/hermes/src/skills/skill-store.ts:404-433 (condensed)
const validateSupportPath = (relPath: string): Result<string, Error> => {
  if (relPath.length === 0) return err(new Error('file path is required.'));
  if (relPath.includes('\\')) return err(…'use forward slashes.');   // no backslash tricks
  if (relPath.startsWith('/')) return err(…'must be relative.');     // no absolute paths

  const segments = relPath.split('/').filter((s) => s.length > 0);
  for (const segment of segments) {
    if (segment === '..' || segment === '.')                         // no traversal segments
      return err(…'path traversal is not allowed.');
  }
  const first = segments[0];
  if (first === undefined || !isSupportDir(first))                  // must be an ALLOWED subdir
    return err(…'first segment must be one of …');
  return ok(normalized);   // only now: a vetted, confined relative path
};

Notice the shape: every branch is a denial except the final ok. An attacker-controlled ../../etc/passwd is rejected at the .. check; a sneaky references/../../secret is rejected too. The function is pure and never throws, so it composes into the Result world cleanly — security failures surface as typed, fail-closed errors (ADR-0011 §1, "not silent passes").

Constraint 1, again: fail-closed by default tier

The deepest expression of fail-closed isn't a guard you call — it's the default. DEFAULT_TIER = T4 means any work that isn't explicitly classified as autonomous is parked, waiting for a human (Lesson 24, ADR-0005). The ADR draws the line itself: "this is also why DEFAULT_TIER = T4" — the unknown case denies. Unclassified autonomy is impossible by construction, not by remembering to check.

Constraint 2: PII before egress, not before emit

The subtle word is egress. It would be easy to redact PII just before showing a result to a user. ADR-0011 demands more: a Signal derived from a private channel (WhatsApp, Discord, Skool, Circle) is "PII-redacted before it leaves the local machine — before the model call, not merely before emit." The threat model assumes the model endpoint itself is outside the trust boundary, so raw private data must never be in a request payload at all.

LOCAL MACHINE (trust boundary) private signal (raw PII) redactbefore egress redacted only ↓ OUTSIDE (model endpoint / network) model call — never sees raw PII

Constraints 3 & 4: don't follow data, don't copy source

The last two are about discipline, not runtime checks. CL4R1T4S isolation: a leaked corpus of vendor prompts (and its injection-payload README) "is isolated and never ingested as instruction; it is data to analyze, never a command to follow." This is prompt-injection defense at the ingestion layer — the corpus is inert text, never executed. tac clean-room: tac is an educational-license blueprint, so "its patterns are reimplemented from scratch, with zero verbatim code or prompts, and its source is never published." The entire @alembic/hermes fusion is a from-scratch TypeScript reimplementation precisely because of this rule — which is also why the lessons cite Alembic's own source, never Hermes' Python.

Provenance ties it together

The CLAUDE.md orchestration rule "SEMPRE cite a fonte" and the content-addressed stores (SHA-256 over canonical JSON, Lesson 28) mean every ingested fact carries a source, a date, and a hash. Provenance isn't a separate feature — it's what lets the system know whether a piece of data is trusted (a vetted Learning) or suspect (a CL4R1T4S payload). Fail-closed + provenance is the same posture from two angles: deny the unknown, and always know where a thing came from.

1. ADR-0011 requires PII redaction "before egress … before the model call, not merely before emit." Why the emphasis on before the model call?
Correct: b. Redacting only before emit would still send raw PII over the wire to the model. The ADR pushes redaction to the egress point — the moment data would leave the local machine — so the endpoint never receives unredacted private-channel content.
2. validateSupportPath rejects .., absolute paths, and backslashes, allowing only paths under an approved subdir. What design pattern is this?
Correct: d. The function enumerates what's permitted (a relative path under an allowed support dir, no traversal segments) and denies everything else, returning typed err values. That's allow-listing / fail-closed — the same posture as DEFAULT_TIER = T4.
3. Why is the CL4R1T4S corpus "never ingested as instruction"?
Correct: c. ADR-0011 §3 isolates it as inert data. If the system executed it as instruction, the injection payload could hijack the agent. The rule — "data to analyze, never a command to follow" — is the defense.

Common confusions

"Fail-closed means the system crashes on bad input." The opposite — it returns a typed, fail-closed error (ADR-0011 §1, "not silent passes") and never throws (ADR-0009). The work is denied or parked, cleanly; nothing crashes and nothing silently succeeds.
"The clean-room rule is just legal caution." It's also why this course is trustworthy: every lesson cites Alembic's own source because a verbatim copy of Hermes would be a violation. Reimplementing from scratch is both the legal posture and the reason the code is genuinely understood, not pasted.