feat: skills pinning — supply chain & update drift defense (AST 02 + AST 07)#89
feat: skills pinning — supply chain & update drift defense (AST 02 + AST 07)#89Xaik89 wants to merge 3 commits intonode9-ai:mainfrom
Conversation
…AST 07) Extends the MCP tool pinning primitive (v1.5.0 / PR node9-ai#81) to agent skill repositories. On the first tool call of a session, SHA-256 hashes are recorded for every skill file in known roots (~/.claude/skills/, ~/.claude/CLAUDE.md, ~/.claude/rules/, project .claude/CLAUDE.md, .claude/CLAUDE.local.md, .claude/rules/, .cursor/rules/, AGENTS.md, CLAUDE.md). On subsequent sessions the hook re-verifies; any drift quarantines the session and blocks every tool call until a human reviews via `node9 skill pin update <rootKey>`. One feature, two threats covered in a single primitive — AST 02 Supply Chain Compromise and AST 07 Update Drift — at the skills layer that no competitor currently defends at runtime. Security properties: - Fail-closed on corrupt pin file - Symlink-safe (never follows symlinks out of the tree) - Size-capped at 5000 files / 50 MB per root - Path-traversal-safe session IDs (/^[A-Za-z0-9_-]{1,128}$/) - Atomic writes, mode 0o600 for ~/.node9/skill-pins.json and flags CLI: - `node9 skill pin list` — show pinned roots, hashes, file counts - `node9 skill pin update <rootKey> [--yes]` — diff + re-pin - `node9 skill pin reset` — clear pins AND wipe session flags Config: - `policy.skillRoots: string[]` extends the default root set (absolute, `~/`-prefixed, or cwd-relative; relative paths require absolute cwd). Tests (TDD, all green): - src/__tests__/skill-pin.unit.test.ts (36 tests) - src/__tests__/skill-pin-cli.integration.test.ts (7 tests, spawnSync) - src/__tests__/check-skill-pin.integration.test.ts (8 tests, spawnSync) - src/__tests__/skill-roots-config.spec.ts (4 tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Feedback: ~2k lines was too much. Trim to the security-essential surface while preserving the AST 02 + AST 07 coverage. Dropped (not essential to the rug-pull defense): - computePinDiff + fileManifest tracking in pin entries - @inquirer/prompts dependency (no interactive review prompt) - per-file diff display in `skill pin update` - `update` now mirrors `mcp pin update` exactly: just removes the pin so the next session re-pins with current state - GC loop for session flags older than 7 days (not security-essential) - 4 integration tests covering short-circuit / unchanged-session / quarantine-persists / relative-cwd edge cases — the 4 load-bearing scenarios remain: first-call pins, drift blocks, corrupt fails closed, missing session_id skips Net: -945 / +258 lines. PR now ~1150 lines (down from ~1835). All security properties preserved: fail-closed on corrupt pin file, symlink-safe, size-capped, path-traversal-safe session IDs, atomic writes with mode 0o600, per-session memoisation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Feedback: section was too long. Collapse to one paragraph that cross-references the MCP pinning section above (same pattern, same defense, different target) instead of restating the full rationale. 14 lines → 3. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Really solid engineering on this — the security properties (symlink-safe, atomic writes, path-traversal-safe session IDs, fail-closed on corrupt state, session memoization) are all correct, and the test coverage is thorough. You clearly read the MCP pinning code carefully and mirrored the pattern well. That said, I'm going to close this for now because of a fundamental design issue that I don't think can be fixed with small tweaks. The quarantine-on-change model doesn't fit skill files. MCP tool pinning works because you didn't write those tool definitions — a third-party server did. When they change without your knowledge, that's suspicious by definition. Skill files are different: ~/.claude/CLAUDE.md, roject/.claude/CLAUDE.md, .claude/rules/ are files you edit constantly as part of normal workflow. The feature as designed would quarantine your next Claude Code session every time you update your own CLAUDE.md. That trains users to either disable it or treat every alert as noise — which defeats the The threat model is also narrower than it looks. The realistic attack here is a compromised package or CI script silently modifying your CLAUDE.md. That's a real threat, but it's rare. The false positive rate (you edited your own file) would overwhelm real signals in practice. What I'd want to see instead:
If you want to rework it along those lines, I'd be happy to look at a v2. The core hashing and session memoization logic is genuinely good work and worth reusing. Help me with could agent see this |
Addresses PR node9-ai#89 review feedback: - enabled: false by default (opt-in only) - mode 'warn' (default): /dev/tty notification on drift, tool allowed (exit 0) - mode 'block': hard quarantine on drift, tool blocked (exit 2) — for installed/registry skills where changes are genuinely suspicious Config: policy.skillPinning: { enabled, mode: 'warn'|'block', roots } replaces the flat policy.skillRoots field. Session flags now have three states: 'verified', 'warned', 'quarantined'. Warn mode never writes 'quarantined' — warn + allow + memoize so the /dev/tty notification shows once per session, not on every tool call. Core hashing, CLI, and security properties are unchanged from v1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
~/.claude/skills/,~/.claude/CLAUDE.md,~/.claude/rules/, project.claude/CLAUDE.md,.claude/CLAUDE.local.md,.claude/rules/,.cursor/rules/,AGENTS.md,CLAUDE.md. On the first tool call of a session the hook records SHA-256 hashes of every skill file; subsequent sessions re-hash and compare. Any drift quarantines the session and blocks every tool call until a human reviews withnode9 skill pin update <rootKey>.src/mcp-pin.ts1:1; no new dependencies.(Re-opened from personal fork; supersedes closed #88.)
What's new
src/skill-pin.ts(~300 lines) — SHA-256 content hashing (files or recursive directories, symlink-safe, 5000 files / 50 MB caps), atomic writes to~/.node9/skill-pins.json(mode 0o600), per-rootexistsflag so "appeared" and "vanished" both count as drift, batchedverifyAndPinRoots().src/cli/commands/skill-pin.ts(~100 lines) —node9 skill pin list | update <rootKey> | reset, mirroringnode9 mcp pin.updateremoves the named pin so the next session re-pins;resetclears all pins and wipes session verification flags.src/cli/commands/check.ts(+117 lines) — first tool call of a session verifies; result memoised in~/.node9/skill-sessions/<session_id>.jsonso hashing runs once per session, not per tool call. Session IDs restricted to/^[A-Za-z0-9_-]{1,128}$/to defeat path traversal. Corrupt pin file → fail-closed. Unexpected errors → fail-open (debug-logged) so a bug here cannot brick Claude Code.policy.skillRoots: string[]config field — user-extensible list of extra skill paths.Security properties
skill-pins.json(recovery:node9 skill pin reset)lstat+ explicitisSymbolicLink(); never follows out of the tree)Test plan
skill-pin.unit.test.ts(16): hash contract, order-invariance, add/remove/modify sensitivity, symlink skip, pin round-trip, exists-flip detection, mode 0o600, fail-closedskill-pin-cli.integration.test.ts(6,spawnSync): list empty/populated/corrupt, update unknown/known, reset clears + wipes session flagscheck-skill-pin.integration.test.ts(4,spawnSync): first-call pins + allows, new-session drift blocks with JSON-RPC payload + quarantines, corrupt fails closed, missingsession_idskipsskill-roots-config.spec.ts(4): default empty, merge, dedup, defensive filternpm test→ 1170 / 1171 pass (the one failure is a pre-existing environmental flake inhud.spec.ts— fails whenever~/.claude/CLAUDE.mdexists on the test machine; passes under isolated HOME)npm run typecheck,npm run lint,npm run format:check,npm run build— all cleanEnd-to-end verified against a realistic scenario
Simulated a developer's real setup (global
~/.claude/CLAUDE.md+ skill, projectCLAUDE.md+.cursor/rules/) across two days of Claude Code sessions:my-project/CLAUDE.mdwith a prompt-injection backdoordecision: "block", reason names the exactnode9 skill pin update b73aad6c0b08ef42recovery command, session flag persists as{state: "quarantined", detail: "changed: .../my-project/CLAUDE.md"}✅skill pin update <key>→ fresh session allows again ✅skill pin reset→ clears all pins + wipesskill-sessions/✅🤖 Generated with Claude Code