A system that ingests private chat logs, scrapes the web, and writes files named by an agent must be paranoid by construction. ADR-0011 sets four standing constraints: fail-closed everywhere, PII redaction before a byte leaves the machine, isolation of a leaked prompt corpus, and a clean-room rule. They are not slogans; they show up as real guards in the code you've already met. This lesson connects the policy to the implementation: the Zod boundary, the realpath/path-traversal defense in the SkillStore, the fail-closed DEFAULT_TIER = T4, and PII redaction before the model call.
Fail-closed in practice: DEFAULT_TIER = T4 (unclassified work parks) and the never-throws contract (an unhandled failure becomes a typed denial, not a silent pass).

| # | Constraint (ADR-0011) | Where it lives in code |
|---|---|---|
| 1 | Fail-closed everywhere security-relevant — realpath guards, HMAC webhooks, constant-time compares, Zod at every boundary | SkillStore path-safety; DEFAULT_TIER = T4; every subsystem's safeParse |
| 2 | PII redaction before egress — before the model call, not merely before emit | The funnel redacts private-channel signals pre-call (map §3) |
| 3 | CL4R1T4S isolation — the leaked vendor-prompt corpus is data to analyze, never a command to follow | Excluded from ingestion-as-instruction; treated as inert data |
| 4 | tac clean-room — patterns reimplemented from scratch, zero verbatim code/prompts, source never published | The whole fusion is from-scratch TS, not copied source |
You saw the SkillStore in Lesson 12. Its security spine is validateSupportPath — a pure function that refuses any relative path that could escape the skill's directory. It mirrors Hermes' has_traversal_component + _resolve_skill_target and is the textbook fail-closed validator: it lists what it allows and denies everything else.
```ts
// packages/hermes/src/skills/skill-store.ts:404-433 (condensed)
const validateSupportPath = (relPath: string): Result<string, Error> => {
  if (relPath.length === 0) return err(new Error('file path is required.'));
  if (relPath.includes('\\')) return err(…'use forward slashes.');  // no backslash tricks
  if (relPath.startsWith('/')) return err(…'must be relative.');    // no absolute paths
  const segments = relPath.split('/').filter((s) => s.length > 0);
  for (const segment of segments) {
    if (segment === '..' || segment === '.')                        // no traversal segments
      return err(…'path traversal is not allowed.');
  }
  const first = segments[0];
  if (first === undefined || !isSupportDir(first))                  // must be an ALLOWED subdir
    return err(…'first segment must be one of …');
  return ok(normalized);  // only now: a vetted, confined relative path
};
```
Notice the shape: every branch is a denial except the final ok. An attacker-controlled ../../etc/passwd is rejected at the .. check; a sneaky references/../../secret is rejected too. The function is pure and never throws, so it composes into the Result world cleanly — security failures surface as typed, fail-closed errors (ADR-0011 §1, "not silent passes").
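To make the deny-by-default shape concrete, here is a minimal, self-contained sketch in the spirit of validateSupportPath. It is not the project's code: the Result helpers, the name validateSupportPathSketch, the SUPPORT_DIRS allow-list, and the error wording are all assumptions made for illustration.

```typescript
// Minimal Result type for this sketch; the real project presumably has its own.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Hypothetical allow-list; the lesson does not show the real set of support dirs.
const SUPPORT_DIRS: ReadonlySet<string> = new Set([
  "references",
  "templates",
  "scripts",
  "assets",
]);

const validateSupportPathSketch = (relPath: string): Result<string, Error> => {
  if (relPath.length === 0) return err(new Error("file path is required."));
  if (relPath.includes("\\")) return err(new Error("use forward slashes."));
  if (relPath.startsWith("/")) return err(new Error("must be relative."));
  const segments = relPath.split("/").filter((s) => s.length > 0);
  for (const segment of segments) {
    // Any "." or ".." segment is refused, wherever it appears in the path.
    if (segment === ".." || segment === ".") {
      return err(new Error("path traversal is not allowed."));
    }
  }
  const first = segments[0];
  if (first === undefined || !SUPPORT_DIRS.has(first)) {
    return err(new Error("first segment must be an allowed subdirectory."));
  }
  // Only now: a confined relative path, rebuilt from the vetted segments.
  return ok(segments.join("/"));
};

// Every input is denied unless it passes all checks.
const attempts = [
  "references/api.md",       // allowed subdir, no traversal
  "../../etc/passwd",        // ".." segment
  "references/../../secret", // ".." hidden behind an allowed prefix
  "/etc/passwd",             // absolute path
  "notes/todo.md",           // first segment not on the allow-list
];
const verdicts = attempts.map((p) => validateSupportPathSketch(p).ok);
```

In this sketch only the first attempt yields ok. The other four each hit a different denial branch, and none of them throws.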
The deepest expression of fail-closed isn't a guard you call — it's the default. DEFAULT_TIER = T4 means any work that isn't explicitly classified as autonomous is parked, waiting for a human (Lesson 24, ADR-0005). The ADR draws the line itself: "this is also why DEFAULT_TIER = T4" — the unknown case denies. Unclassified autonomy is impossible by construction, not by remembering to check.
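A fail-closed default can be sketched as a lookup whose fallback is the most restrictive tier. Apart from DEFAULT_TIER and T4, everything here is hypothetical: the other tier names, the TIER_BY_KIND table, the work-kind strings, and the tierFor and isParked helpers are illustrative assumptions, not the project's API.

```typescript
// Only T4 is named in the lesson; T1..T3 are assumed labels for this sketch.
type Tier = "T1" | "T2" | "T3" | "T4";
const DEFAULT_TIER: Tier = "T4";

// Hypothetical classification table: only kinds listed here get another tier.
const TIER_BY_KIND: ReadonlyMap<string, Tier> = new Map<string, Tier>([
  ["summarize-public-page", "T1"],
  ["update-skill-notes", "T2"],
]);

// Unknown or misspelled kinds fall through to DEFAULT_TIER; there is no "allow" branch.
const tierFor = (kind: string): Tier => TIER_BY_KIND.get(kind) ?? DEFAULT_TIER;

// In this sketch, T4 work parks and waits for a human.
const isParked = (kind: string): boolean => tierFor(kind) === "T4";
```

A Map is used rather than a plain object so that strings such as "constructor" cannot accidentally match an inherited property. A lookup that misses always lands on DEFAULT_TIER.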
The subtle word is egress. It would be easy to redact PII just before showing a result to a user. ADR-0011 demands more: a Signal derived from a private channel (WhatsApp, Discord, Skool, Circle) is "PII-redacted before it leaves the local machine — before the model call, not merely before emit." The threat model assumes the model endpoint itself is outside the trust boundary, so raw private data must never be in a request payload at all.
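The ordering can be sketched as follows: redaction happens inside the function that builds the request body, so no unredacted payload ever exists. The Signal shape, the "web" channel, the two regexes, and the names redact and buildModelPayload are assumptions for illustration. Real PII detection needs far more than two patterns.

```typescript
// Channel names follow the lesson's examples; "web" is an assumed public channel.
type Channel = "whatsapp" | "discord" | "skool" | "circle" | "web";
interface Signal {
  channel: Channel;
  text: string;
}

const PRIVATE_CHANNELS: ReadonlySet<Channel> = new Set<Channel>([
  "whatsapp",
  "discord",
  "skool",
  "circle",
]);

// Illustrative patterns only: an email-like token and a phone-like digit run.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE = /\+?\d[\d\s().-]{7,}\d/g;

const redact = (text: string): string =>
  text.replace(EMAIL, "[EMAIL]").replace(PHONE, "[PHONE]");

// Builds the serialized request body. Private-channel text is redacted here,
// before serialization, so the raw text never appears in an outgoing payload.
const buildModelPayload = (signals: readonly Signal[]): string =>
  JSON.stringify({
    inputs: signals.map((s) =>
      PRIVATE_CHANNELS.has(s.channel) ? redact(s.text) : s.text,
    ),
  });

const payload = buildModelPayload([
  { channel: "whatsapp", text: "Ping ana@example.com or +55 11 91234-5678 tomorrow" },
  { channel: "web", text: "Release 2.0 is out" },
]);
```

The design point is where redact is called. Because it sits inside the only function that produces a payload, no code path between ingestion and the model call can skip it.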
The last two are about discipline, not runtime checks. CL4R1T4S isolation: a leaked corpus of vendor prompts (and its injection-payload README) "is isolated and never ingested as instruction; it is data to analyze, never a command to follow." This is prompt-injection defense at the ingestion layer — the corpus is inert text, never executed. tac clean-room: tac is an educational-license blueprint, so "its patterns are reimplemented from scratch, with zero verbatim code or prompts, and its source is never published." The entire @alembic/hermes fusion is a from-scratch TypeScript reimplementation precisely because of this rule — which is also why the lessons cite Alembic's own source, never Hermes' Python.
The CLAUDE.md orchestration rule "SEMPRE cite a fonte" ("ALWAYS cite the source") and the content-addressed stores (SHA-256 over canonical JSON, Lesson 28) mean every ingested fact carries a source, a date, and a hash. Provenance isn't a separate feature; it's what lets the system know whether a piece of data is trusted (a vetted Learning) or suspect (a CL4R1T4S payload). Fail-closed and provenance are the same posture from two angles: deny the unknown, and always know where a thing came from.
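As a rough illustration of "SHA-256 over canonical JSON", the sketch below sorts object keys recursively before hashing, so two structurally equal values get the same address. The canonicalization convention, the Sourced record shape, and the names canonicalize, contentAddress, and ingest are assumptions. The project's actual scheme is covered in Lesson 28 and may differ.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// One simple canonical form: keys sorted recursively, no extra whitespace.
const canonicalize = (value: Json): string => {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(",")}]`;
  const body = Object.entries(value)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`)
    .join(",");
  return `{${body}}`;
};

// Content address: hex SHA-256 of the canonical string.
const contentAddress = (value: Json): string =>
  createHash("sha256").update(canonicalize(value), "utf8").digest("hex");

// Hypothetical record shape: every fact carries a source, a date, and a hash.
interface Sourced {
  content: Json;
  source: string;
  date: string;
  hash: string;
}

const ingest = (content: Json, source: string, date: string): Sourced => ({
  content,
  source,
  date,
  hash: contentAddress(content),
});

// Key order does not change the address.
const a = contentAddress({ a: 1, b: [2, 3] });
const b = contentAddress({ b: [2, 3], a: 1 });
const sameAddress = a === b;
```

Because the hash depends only on canonical content, a fact that arrives again from a different source keeps the same address, while the source and date fields record where each copy came from.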
validateSupportPath rejects .., absolute paths, and backslashes, allowing only paths under an approved subdir. What design pattern is this?

It lists what it allows; everything else comes back as err values. That's allow-listing / fail-closed, the same posture as DEFAULT_TIER = T4.