Dev #4

Merged
node9ai merged 12 commits into main from dev
Mar 10, 2026

node9ai commented Mar 10, 2026

Summary

What does this PR do? Reference any related issues (e.g. Closes #123).

Type of change

  • Bug fix
  • New feature
  • Refactor / code cleanup
  • Documentation
  • Tests

Checklist

  • npm test passes
  • npm run typecheck passes
  • npm run lint passes
  • Tests added/updated for new behavior
  • CHANGELOG.md updated (for user-facing changes)

nadavis and others added 12 commits March 8, 2026 15:47
…er than just monitoring the Server's (responses). Dangerous actions are now caught _before_ they reach the target server.
…nit crash

- Race condition: autoStartDaemonAndWait now verifies HTTP readiness via
  GET /settings before returning true, preventing stale-PID false positives
- Race condition: openBrowserLocal() called immediately after daemon is
  HTTP-ready so browser starts loading before POST /check fires, ensuring
  the SSE 'add' event is delivered to an already-connected client
- Race condition: daemon skips openBrowser() when autoStarted=true to
  avoid duplicate tabs (CLI already opened the browser)
- Race condition: 'Abandoned' browser racer result now resolves the race
  as denied instead of being silently swallowed (caused CLI to hang)
- Race condition: SSE reconnect abandon timer raised 2s→10s so a page
  reload doesn't abandon pending requests before the browser reconnects
- Bug fix: cloudBadge null check in SSE 'init' handler — missing DOM
  element crashed the handler before addCard() ran, causing approval
  requests to never appear when browser was cold-started
- Undo engine: moved snapshot trigger from PostToolUse (log) to
  PreToolUse (check) so snapshot captures state before AI change, not
  after (previous timing made undo a no-op)
- Undo engine: applyUndo now deletes files created after the snapshot
  that git restore alone does not remove
- Undo engine: expanded STATE_CHANGING_TOOLS list to include
  str_replace_based_edit_tool and create_file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously only tracked files (git ls-files) were checked for deletion,
so files created after the snapshot but never committed (e.g. test.txt)
survived the undo. Now also queries git ls-files --others --exclude-standard
to catch untracked non-ignored files — the same set git add -A captures
when building the snapshot tree.
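
The selection described above can be sketched as a set difference. This is a minimal illustration, not the project's applyUndo code: `filesCreatedAfterSnapshot` is a hypothetical helper, and its two array parameters stand in for the output of `git ls-files` and `git ls-files --others --exclude-standard`.

```typescript
// Hypothetical helper: pick the working-tree files an undo should delete.
// snapshotFiles: paths recorded in the snapshot tree.
// tracked:       stand-in for the output of `git ls-files`.
// untracked:     stand-in for `git ls-files --others --exclude-standard`.
function filesCreatedAfterSnapshot(
  snapshotFiles: ReadonlySet<string>,
  tracked: readonly string[],
  untracked: readonly string[],
): string[] {
  // Union of tracked and untracked non-ignored files, de-duplicated.
  const candidates = new Set<string>(tracked.concat(untracked));
  // Anything present now but absent from the snapshot was created afterwards.
  return Array.from(candidates)
    .filter((file) => !snapshotFiles.has(file))
    .sort();
}
```

With a snapshot containing only `a.ts`, a tracked `b.ts` and an untracked `test.txt` are both selected for deletion, which is the case the earlier tracked-only check missed.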

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a local channel (native popup, browser dashboard, terminal) wins
the approval race while cloud is also enforced, the pending SaaS request
was never resolved — leaving Mission Control stuck on PENDING forever.

Now finish() calls resolveNode9SaaS() (PATCH /intercept/requests/:id)
whenever checkedBy !== 'cloud' and a cloudRequestId exists, closing the
request immediately with the correct APPROVED/DENIED status.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
core.ts:
- Fire-and-forget POST /intercept/audit for all local fast-path allows
  (ignoredTools, sandboxPaths, local-policy, trust) — gives org admins
  full visibility of calls that never reached the cloud
- Fixed config merge: sandboxPaths and ignoredTools now concatenate across
  layers (global → project → local); dangerousWords replaces (higher wins)
- agentVersion context now sent as context.agent so backend can store AI
  client type (Claude Code, Gemini CLI, Terminal) separately from machine identity
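
A rough sketch of the merge rule in the second bullet, assuming a config layer carries only the three fields named there. `PolicyLayer` and `mergeLayers` are hypothetical names for illustration, not identifiers from core.ts.

```typescript
// Hypothetical layer shape: only the fields named in the commit message.
interface PolicyLayer {
  sandboxPaths?: string[];
  ignoredTools?: string[];
  dangerousWords?: string[];
}

interface MergedPolicy {
  sandboxPaths: string[];
  ignoredTools: string[];
  dangerousWords: string[];
}

// Layers are passed lowest precedence first: global, project, local.
function mergeLayers(layers: PolicyLayer[]): MergedPolicy {
  const merged: MergedPolicy = { sandboxPaths: [], ignoredTools: [], dangerousWords: [] };
  for (const layer of layers) {
    // Concatenating fields accumulate entries from every layer.
    if (layer.sandboxPaths) merged.sandboxPaths = merged.sandboxPaths.concat(layer.sandboxPaths);
    if (layer.ignoredTools) merged.ignoredTools = merged.ignoredTools.concat(layer.ignoredTools);
    // dangerousWords is replaced: the highest layer that sets it wins.
    if (layer.dangerousWords) merged.dangerousWords = layer.dangerousWords.slice();
  }
  return merged;
}
```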

cli.ts:
- Updated context payload to include agent type metadata for accurate
  per-client breakdown in Mission Control Agents tab

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- feat: → minor bump (0.2.x → 0.3.0)
- fix:/perf:/refactor: → patch bump (0.2.1 → 0.2.2)
- BREAKING CHANGE → major bump (0.x → 1.0.0)
- docs/chore/test/style: → no release
Releases trigger on merge to main only.
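
The mapping above, written out as a small function for illustration. This is not semantic-release's implementation or its configuration format; `bumpFor` is a hypothetical helper that only restates the four rules listed.

```typescript
type Bump = "major" | "minor" | "patch" | null;

// Hypothetical helper restating the release rules listed above.
function bumpFor(commitMessage: string): Bump {
  // A BREAKING CHANGE footer forces a major bump regardless of type.
  if (commitMessage.indexOf("BREAKING CHANGE") !== -1) return "major";
  // Extract the type from "type: subject" or "type(scope): subject".
  const match = /^(\w+)(\([^)]*\))?:/.exec(commitMessage);
  const type = match ? match[1] : "";
  if (type === "feat") return "minor";
  if (type === "fix" || type === "perf" || type === "refactor") return "patch";
  // docs, chore, test, style and anything unrecognised: no release.
  return null;
}
```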

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
node9ai merged commit 5ae7474 into main Mar 10, 2026
3 checks passed
node9ai deleted the dev branch March 10, 2026 11:59
node9ai restored the dev branch March 10, 2026 12:11

node9ai commented Mar 10, 2026

🎉 This PR is included in version 1.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

node9ai added a commit that referenced this pull request Mar 28, 2026
…, new tests

- Filter all NODE9_* env vars in runGateway so local developer config
  (NODE9_MODE, NODE9_API_KEY, NODE9_PAUSED) cannot pollute test isolation
- Fix status===null guard: no longer requires result.signal to also be set;
  a killed/timed-out process always throws, preventing silent false passes
- Extract parseResponses() helper to eliminate repeated JSON.parse cast pattern
- Reduce default runGateway timeout from 8s → 5s for fast-path tests
- Add test: malformed (non-JSON) input is forwarded to upstream unchanged
- Add test: tools/call notification (no id) is forwarded without gateway response
- Add comments: mock upstream uses unbuffered stdout.write; /tmp/test.txt path
  is safe because mock never reads disk; NODE9_TESTING only disables UI approvers
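
The status===null guard in the second bullet might look roughly like this. `SpawnResultLike` mirrors a subset of what `child_process.spawnSync` returns; `stdoutOrThrow` is a hypothetical name, not the helper used in the test file.

```typescript
// Subset of the object returned by child_process.spawnSync.
interface SpawnResultLike {
  status: number | null;
  signal: string | null;
  stdout: string;
}

// Hypothetical helper: refuse to treat a killed or timed-out run as a pass.
function stdoutOrThrow(result: SpawnResultLike): string {
  // status is null when the child was terminated by a signal (including a
  // timeout kill). Throw whether or not result.signal is also populated.
  if (result.status === null) {
    const signal = result.signal === null ? "none" : result.signal;
    throw new Error("process did not exit normally (signal: " + signal + ")");
  }
  return result.stdout;
}
```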

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
node9ai added a commit that referenced this pull request Mar 28, 2026
* fix: address code review — Slack regex bound, remove redundant parser, notMatchesGlob consistency, applyUndo empty-set guard

- dlp: cap Slack token regex at {1,100} to prevent unbounded scan on crafted input
- core: remove 40-line manual paren/bracket parser from validateRegex; it is redundant
  with the final new RegExp() compile check, which catches the same errors more cleanly
- core: fix notMatchesGlob — absent field returns true (vacuously not matching),
  consistent with notContains; missing cond.value still fails closed
- undo: guard applyUndo against ls-tree failure returning empty set, which would
  cause every file in the working tree to be deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — compile-before-saferegex, Slack lower bound, ls-tree guard logging, missing tests

- core: move new RegExp() compile check BEFORE safe-regex2 so structurally invalid
  patterns (unbalanced parens/brackets) are rejected before reaching NFA analysis
- dlp: tighten Slack token lower bound from {1,100} to {20,100} to reduce false
  negatives on truncated tokens
- undo: add NODE9_DEBUG log before early return in applyUndo ls-tree guard for
  observability into silent failures
- test(core): add 'structurally malformed patterns still rejected' regression test
  confirming compile-check order after manual parser removal
- test(core): add notMatchesGlob absent-field test with security comment documenting
  the vacuous-true behaviour and how to guard against it
- test(undo): add applyUndo ls-tree non-zero exit test confirming no files deleted
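
A minimal sketch of the compile-before-analysis ordering from the first bullet. The `isSafe` callback is a stand-in for the safe-regex2 check (its real API is not reproduced here), and the return shape is an assumption for illustration.

```typescript
interface RegexCheck {
  ok: boolean;
  reason?: string;
}

// isSafe is a placeholder for the ReDoS analysis step (safe-regex2 in the PR).
function validateRegex(pattern: string, isSafe: (p: string) => boolean): RegexCheck {
  // Step 1: compile first, so structurally invalid patterns are rejected
  // before they ever reach the analysis step.
  try {
    new RegExp(pattern);
  } catch {
    return { ok: false, reason: "invalid syntax" };
  }
  // Step 2: only patterns that compile are checked for catastrophic backtracking.
  if (!isSafe(pattern)) return { ok: false, reason: "potential ReDoS" };
  return { ok: true };
}
```

A pattern such as `(a+)+` compiles in step 1 and must still be rejected in step 2, which is what the later ReDoS test in this PR asserts.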

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): swap spawnResult args order — stdout first, status second

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in undo.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: revert notMatchesGlob to fail-closed, warn on ls-tree failure, document empty-stdout gap

- core: revert notMatchesGlob absent-field to fail-closed (false) — an attacker
  omitting a field must not satisfy a notMatchesGlob allow rule; rule authors
  needing pass-when-absent must pair with an explicit 'notExists' condition
- undo: log ls-tree failure unconditionally to stderr (not just NODE9_DEBUG) since
  this is an unexpected git error, not normal flow — silent false is undebuggable
- dlp: add comment on Slack token bound rationale (real tokens ~50–80 chars)
- test(core): fix notMatchesGlob fail-closed test — use delete_file (dangerous word)
  so the allow rule actually matters; write was allowed by default regardless
- test(undo): add test documenting the known gap where ls-tree exits 0 with empty
  stdout still produces an empty snapshotFiles set
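
The fail-closed behaviour from the first bullet, sketched under simplifying assumptions: the glob matcher below supports only `*` and is not the project's matcher, and the function signature is hypothetical.

```typescript
// Simplified glob support for illustration: only "*" is treated as a wildcard.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp("^" + escaped + "$");
}

// Hypothetical signature: args is the tool-call argument object.
function notMatchesGlob(
  args: Record<string, unknown>,
  field: string,
  glob: string | undefined,
): boolean {
  const value = args[field];
  // Fail closed: a missing glob, or an absent or non-string field, never
  // satisfies the condition, so omitting a field cannot unlock an allow rule.
  if (glob === undefined || typeof value !== "string") return false;
  return !globToRegExp(glob).test(value);
}
```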

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(undo): guard against ls-tree status-0 empty-stdout mass-delete footgun

Add snapshotFiles.size === 0 check after the non-zero exit guard. When ls-tree
exits 0 but produces no output, snapshotFiles would be empty and every tracked
file in the working tree would be deleted. Abort and warn unconditionally instead.

Also convert the 'known gap' documentation test into a real regression test that
asserts false return and no unlinkSync calls.
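
Both guards together, as a sketch. `snapshotFilesOrAbort` and the `warn` callback are hypothetical; the real applyUndo writes its warning to stderr and returns false.

```typescript
interface LsTreeResult {
  status: number | null;
  stdout: string;
}

// Returns the snapshot file set, or null when the undo must abort.
function snapshotFilesOrAbort(
  lsTree: LsTreeResult,
  warn: (message: string) => void,
): Set<string> | null {
  // Guard 1: git ls-tree failed outright.
  if (lsTree.status !== 0) {
    warn("ls-tree failed; aborting undo");
    return null;
  }
  const files = new Set<string>(lsTree.stdout.split("\n").filter((line) => line.length > 0));
  // Guard 2: exit 0 with no output. An empty set would mark every file in
  // the working tree as "not in snapshot" and delete it.
  if (files.size === 0) {
    warn("ls-tree returned no files; aborting undo");
    return null;
  }
  return files;
}
```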

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(undo): assert stderr warning in ls-tree failure tests

Add vi.spyOn(process.stderr, 'write') assertions to both new applyUndo tests
to verify the observability messages are actually emitted on failure paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: banner to stderr for MCP stdio compat; log command cwd handling and error visibility

Two bugs from issue #33:

1. runProxy banner went to stdout via console.log, corrupting the JSON-RPC stream
   for stdio-based MCP servers. Fixed: console.error so stdout stays clean.

2. 'node9 log' PostToolUse hook was silently swallowing all errors (catch {})
   and not changing to payload.cwd before getConfig() — unlike the 'check'
   command which does both. If getConfig() loaded the wrong project config,
   shouldSnapshot() could throw on a missing snapshot policy key, silently
   killing the audit.log write with no diagnostic output.
   Fixed: add cwd + _resetConfigCache() mirroring 'check'; surface errors to
   hook-debug.log when enableHookLogDebug is set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate process.chdir race condition in hook commands

Pass payload.cwd directly to getConfig(cwd?) instead of calling
process.chdir() which mutates process-global state and would race
with concurrent hook invocations.

- getConfig() gains optional cwd param: bypasses cache read/write
  when an explicit project dir is provided, so per-project config
  lookups don't pollute the ambient interactive-CLI cache
- check and log commands: remove process.chdir + _resetConfigCache
  blocks; pass payload.cwd directly to getConfig()
- log command catch block: remove getConfig() re-call (could re-throw
  if getConfig() was the original error source); use NODE9_DEBUG only
- Remove now-unused _resetConfigCache import from cli.ts
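
The cache-bypass behaviour of getConfig(cwd?) could be sketched like this. The `Config` shape, `loadConfigFrom`, and `AMBIENT_DIR` are placeholders for illustration; the real loader reads config files from disk.

```typescript
interface Config {
  source: string;
}

// Placeholder for the directory the interactive CLI was started in.
const AMBIENT_DIR = "<ambient>";

// Placeholder loader standing in for reading config files from disk.
function loadConfigFrom(dir: string): Config {
  return { source: dir };
}

let cachedConfig: Config | null = null;

function getConfig(cwd?: string): Config {
  if (cwd) {
    // Explicit project dir: skip the cache on both read and write, so a
    // hook's per-project lookup never pollutes the ambient cache and no
    // process.chdir() is needed.
    return loadConfigFrom(cwd);
  }
  if (cachedConfig === null) cachedConfig = loadConfigFrom(AMBIENT_DIR);
  return cachedConfig;
}
```

Because nothing process-global is mutated, two hook invocations with different `payload.cwd` values cannot race with each other.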

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always write LOG_ERROR to hook-debug.log; clarify ReDoS test intent

- log catch block: remove NODE9_DEBUG guard — this catch guards the
  audit trail so errors must always be written to hook-debug.log,
  not only when NODE9_DEBUG=1
- validateRegex test: rename and expand the safe-regex2 NFA test to
  explicitly assert that (a+)+ compiles successfully (passes the
  compile-first step) yet is still rejected by safe-regex2, confirming
  the reorder did not break ReDoS protection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(mcp): integration tests for #33 regression coverage

Add mcp.integration.test.ts with 4 tests covering both bugs from #33:

1. Proxy stdout cleanliness (2 tests):
   - banner goes to stderr; stdout contains only child process output
   - stdout stays valid JSON when child writes JSON-RPC — banner does not corrupt stream

2. Log command cross-cwd audit write (2 tests):
   - writes to audit.log when payload.cwd differs from process.cwd() (the actual #33 bug)
   - writes to audit.log when no cwd in payload (backward compat)

These tests would have caught both regressions at PR time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — cwd guard, test assertions, exit-0 comment

- getConfig(payload.cwd || undefined): use || instead of ?? to also
  guard against empty string "" which path.join would silently treat
  as relative-to-cwd (same behaviour as the fallback, but explicit)
- log catch block: add comment documenting the intentional exit(0)-on-
  audit-failure tradeoff — non-zero would incorrectly signal tool failure
  to Claude/Gemini since the tool already executed
- mcp.integration.test.ts: assert result.error and result.status on
  every spawnSync call so spawn failures surface loudly instead of
  silently matching stdout === '' checks
- mcp.integration.test.ts: add expect(result.stdout.trim()).toBeTruthy()
  before JSON.parse for clearer diagnostic on stdout-empty failures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add CLAUDE.md rules and pre-commit enforcement hook

CLAUDE.md: documents PR checklist, test rules, and code rules that
Claude Code reads automatically at the start of every session:
- PR checklist (tests, typecheck, format, no console.log in hooks)
- Integration test requirements for subprocess/stdio/filesystem code
- Architecture notes (getConfig(cwd?), audit trail, DLP, fail-closed)

.git/hooks/pre-commit: enforces the checklist on every commit:
- Blocks console.log in src/cli, src/core, src/daemon
- Runs npm run typecheck
- Runs npm run format:check
- Runs npm test when src/ implementation files are changed
- Emergency bypass: git commit --no-verify

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — test isolation, stderr on audit gap, nonexistent cwd

- mcp.integration.test.ts: replace module-scoped tempDirs with per-describe
  beforeEach/afterEach and try/finally — eliminates shared-array interleave
  risk if tests ever run with parallelism
- mcp.integration.test.ts: add test for nonexistent payload.cwd — verifies
  getConfig falls back to global config gracefully instead of throwing
- cli.ts log catch: emit [Node9] audit log error to stderr so audit gaps
  surface in the tool output stream without requiring hook-debug.log checks
- core.ts getConfig: add comment documenting intentional nonexistent-cwd
  fallback behavior (tryLoadConfig returns null → global config used)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: audit write before config load; validate cwd; test corrupt-config gap

Two blocking issues from review:

1. getConfig() was called BEFORE appendFileSync — a config load failure
   (corrupt JSON, permissions error) would throw and skip the audit write,
   reintroducing the original silent audit gap. Fixed by moving the audit
   write unconditionally before the config load.

2. payload.cwd was passed to getConfig() unsanitized — a crafted hook
   payload with a relative or traversal path could influence which
   node9.config.json gets loaded. Fixed with path.isAbsolute() guard;
   non-absolute cwd falls back to ambient process.cwd().

Also:
- Add integration test proving audit.log is written even when global
  config.json is corrupt JSON (regression test for the ordering fix)
- Add comment on echo tests noting Linux/macOS assumption
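
The path guard from item 2 can be sketched in a few lines. `safeCwd` is a hypothetical helper name; only `path.isAbsolute()` comes from the commit message.

```typescript
import * as path from "node:path";

// Hypothetical helper: accept payload.cwd only when it is an absolute path.
// Anything else (relative path, empty string, non-string) yields undefined,
// which lets the caller fall back to the ambient process.cwd().
function safeCwd(cwd: unknown): string | undefined {
  return typeof cwd === "string" && path.isAbsolute(cwd) ? cwd : undefined;
}
```

The result would then be passed straight through, for example `getConfig(safeCwd(payload.cwd))`, after the audit line has already been appended.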

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): add audit-write ordering and path validation rules

* test: skip echo proxy tests on Windows; clarify exit-0 contract

- itUnix = it.skipIf(process.platform === 'win32') applied to both proxy
  echo tests — Windows echo is a shell builtin and cannot be spawned
  directly, so these tests would fail with a spawn error instead of
  skipping cleanly
- corrupt-config test: add comment documenting that exit(0) is the
  correct exit code even on config error — the log command always exits 0
  so Claude/Gemini do not treat an already-completed tool call as failed;
  the audit write precedes getConfig() so it succeeds regardless

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: CODEOWNERS for CLAUDE.md; parseAuditLog helper; getConfig unit tests

- .github/CODEOWNERS: require @node9-ai/maintainers review on CLAUDE.md
  and security-critical source files — prevents untrusted PRs from
  silently weakening AI instruction rules or security invariants
- mcp.integration.test.ts: replace inline JSON.parse().map() with
  parseAuditLog() helper that throws a descriptive error when a log line
  is not valid JSON (e.g. a debug line or partial write), instead of an
  opaque SyntaxError with no context
- mcp.integration.test.ts: itUnix declaration moved after imports for
  correct ordering
- core.test.ts: add getConfig unit tests verifying that a nonexistent
  explicit cwd does not throw (tryLoadConfig fallback), and that
  getConfig(cwd) does not pollute the ambient no-arg cache

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): add npm run lint to PR checklist and pre-commit hook

Adds ESLint step to CLAUDE.md checklist and .git/hooks/pre-commit so
require()-style imports and other lint errors are caught before push.
Also fixes the require('path')/require('os') inline calls in core.test.ts
that triggered @typescript-eslint/no-require-imports in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: emit shields-status on SSE connect — dashboard no longer stuck on Loading

The shields-status event was only broadcast on toggle (POST /shields/toggle).
A freshly connected dashboard never received the current shields state and
displayed "Loading…" indefinitely.

Fix: send shields-status in the GET /events initial payload alongside init
and decisions, using the same payload shape as the toggle handler.

Regression test: daemon.integration.test.ts starts a real daemon with an
isolated HOME, connects to /events, and asserts shields-status is present
with the correct active state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — shared SSE snapshot, ctx.skip() for visible skips

- Capture SSE stream once in beforeAll and share across all three tests
  instead of opening 3 separate 1.5s connections (~4.5s → ~1.5s wall time)
- Replace early return with ctx.skip() so port-conflict skips are visible
  in the Vitest report rather than silently passing
- Add comment explaining why it.skipIf cannot be used here (condition
  depends on async beforeAll, evaluated after test collection)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — bump SSE timeout, guard payload undefined, structural shield check

- Bump readSseStream timeout 1500ms → 3000ms for slow CI headroom
- Assert payload defined before accessing .shields — gives a clear failure
  message if shields-status is absent rather than a TypeError on .shields
- Replace hardcoded postgres check with structural loop over all shields
  so the test survives adding or renaming shields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: log last waitForDaemon error to stderr for CI diagnostics

Silent catch{} meant a crashed daemon (e.g. EACCES on port) produced only
"did not start within 6s" with no hint of the root cause. Now the last
caught error is written to stderr so CI logs show the actual failure reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass flags through to wrapped command — prevent Commander from consuming -y, --config etc.

Commander parsed flags like -y and --config as node9 options and errored
with "unknown option" before the proxy action handler ran. This broke all
MCP server configurations that pass flags to the wrapped binary (npx -y,
binaries with --nexus-url, etc.).

Fix: before program.parse(), detect proxy mode (first arg is not a known
node9 subcommand and doesn't start with '-') and inject '--' into process.argv.
This causes Commander to stop option-parsing and pass everything — including
flags — through to the variadic [command...] action handler intact.

The user-visible '--' workaround still works and is now redundant but harmless.

Regression tests: two new itUnix cases in mcp.integration.test.ts verify
that -n is not consumed as a node9 flag, and that --version reaches the
wrapped command rather than printing node9's own version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: derive proxy subcommand set from program.commands; harden test assertions

- Replace hand-maintained KNOWN_SUBCOMMANDS allowlist with a set derived
  from program.commands.map(c => c.name()) — stays in sync automatically
  when new subcommands are added, eliminating the latent sync bug
- Remove fragile echo stdout assertion in flag pass-through test — echo -n
  and echo --version behaviour varies across platforms (GNU vs macOS);
  the regression being tested is node9's parser, not echo's output
- Add try/finally in daemon.integration.test.ts beforeAll so tmpHome is
  always cleaned up even if daemon startup throws

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard against double '--' injection; strengthen --version test assertion

- Skip '--' injection if process.argv[2] is already '--' to avoid
  producing ['--', '--', ...] when user explicitly passes the separator
- Add toBeTruthy() assertion on stdout in --version test so the check
  fails if echo exits non-zero with empty output rather than silently passing
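
The proxy-mode detection and '--' injection described in these commits, sketched as a pure function. `injectSeparator` is a hypothetical name, and the subcommand set is passed in rather than derived from `program.commands` so the example stays self-contained.

```typescript
// argv follows the process.argv layout: [node, script, ...userArgs].
function injectSeparator(argv: readonly string[], subcommands: ReadonlySet<string>): string[] {
  const first = argv[2];
  const isProxyMode =
    first !== undefined &&
    first !== "--" &&            // user already passed the separator
    first.charAt(0) !== "-" &&   // a node9 option such as --help
    !subcommands.has(first);     // a known node9 subcommand
  if (!isProxyMode) return argv.slice();
  // Insert "--" so the option parser stops and forwards everything after it.
  return argv.slice(0, 2).concat("--", argv.slice(2));
}
```

With `npx -y some-server` as the wrapped command, `-y` ends up after the separator and reaches the wrapped binary instead of being rejected as an unknown node9 option.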

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — alias gap comment, res error-after-destroy guard, echo comment

- cli.ts: document alias gap (no aliases currently, but note how to extend)
- daemon.integration.test.ts: settled flag prevents res 'error' firing reject
  after Promise already resolved via req.destroy() timeout path
- mcp.integration.test.ts: fix comment — /bin/echo handles --version, not GNU echo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent daemon crash on unhandled rejection — node9 tail disconnect with 2 agents

Two concurrent Claude instances fire overlapping hook calls. Any unhandled
rejection in the async request handler crashes the daemon (Node 15+ default),
which closes all SSE connections and exits node9 tail with "Daemon disconnected".

- Add process.on('unhandledRejection') so a single bad request never kills the daemon
- Wrap GET /settings and GET /slack-status getGlobalSettings() calls in try/catch
  (were the only routes missing error guards in the async handler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — return in catch blocks, log errors, guard unhandledRejection registration

- GET /settings and /slack-status catch blocks now return after writeHead(500)
  to prevent fall-through to subsequent route handlers (write-after-end risk)
- Log the actual error to stderr in both catch blocks — silent swallow is
  dangerous in a security daemon
- Guard unhandledRejection registration with listenerCount === 0 to prevent
  double-registration if startDaemon() is called more than once (tests/restarts)
- Move handler registration before server.listen() for clearer startup ordering

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): require manual diff review before every commit

Automated checks (lint, typecheck, tests) don't catch logical correctness
issues like missing return after res.end(), silent catch blocks, or
double event-listener registration. Explicitly require git diff review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — 500 responses, module-level rejection flag, override cli.ts exit handler

- Separate res.writeHead(500) and res.end() calls (non-idiomatic chaining)
- Add Content-Type: application/json and JSON body to 500 responses
- Replace listenerCount guard with module-level boolean flag (race-safe)
- Call process.removeAllListeners('unhandledRejection') before registering
  daemon handler — cli.ts registers a handler that calls process.exit(1),
  which was the actual crash source; this overrides it for the daemon process
- Document that critical approval path (POST /check) has its own try/catch
  and is not relying on this safety net

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove removeAllListeners — use isDaemon guard in cli.ts handler instead

removeAllListeners('unhandledRejection') was a blunt instrument that could
strip handlers registered by third-party deps. The correct fix:
- cli.ts handler now returns early (no-op) when process.argv[2] === 'daemon',
  leaving the rejection to the daemon's own keep-alive handler
- daemon/index.ts no longer needs removeAllListeners
- daemon handler now logs stack trace so systematic failures are visible

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify unhandledRejection listener interaction — both handlers fire independently

The previous comment implied listener-chain semantics (one handler deferring
to the next). Node.js fires all registered listeners independently. The
isDaemon no-op return in cli.ts is what prevents process.exit(1), not any
chain mechanism. Clarify this so future maintainers don't break it by
restructuring the handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate unhandledRejection ordering dependency — skip cli.ts handler for daemon mode

Instead of relying on listener registration order (fragile), skip registering
the cli.ts exit-on-rejection handler entirely when process.argv[2] === 'daemon'.
The daemon's own keep-alive handler in startDaemon() is then the only handler
in the process — no ordering dependency, no removeAllListeners needed.
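
A sketch of the resulting arrangement, assuming an emitter interface narrow enough to test without touching the real `process`. `registerKeepAlive` and `shouldRegisterExitHandler` are hypothetical names; in the daemon the emitter would be `process` and the log sink would be stderr.

```typescript
interface RejectionEmitter {
  on(event: "unhandledRejection", handler: (reason: unknown) => void): unknown;
}

// Module-level flag: registration happens at most once per module load.
let keepAliveRegistered = false;

// Daemon side: log the rejection and keep running.
function registerKeepAlive(emitter: RejectionEmitter, log: (line: string) => void): void {
  if (keepAliveRegistered) return;
  keepAliveRegistered = true;
  emitter.on("unhandledRejection", (reason) => {
    const detail = reason instanceof Error && reason.stack ? reason.stack : String(reason);
    log("[daemon] unhandled rejection: " + detail);
  });
}

// CLI side: decided once at module load; daemon mode never installs the
// exit-on-rejection handler, so there is no ordering dependency.
function shouldRegisterExitHandler(argv: readonly string[]): boolean {
  return argv[2] !== "daemon";
}
```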

Also update stale comment in daemon/index.ts that still described the old
"we must replace the cli.ts handler" approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: address review comments — argv load-time note, hung-connection limit, stack trace caveat

- cli.ts: note that process.argv[2] check fires at module load time intentionally
- daemon/index.ts: document hung-connection limitation of last-resort rejection handler
- daemon/index.ts: note stack trace may include user input fragments (acceptable
  for localhost-only stderr logging)
- daemon/index.ts: clarify jest.resetModules() behavior with the module-level flag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Safe by Default — advisory SQL rules block destructive ops without config

Adds review-drop-table-sql, review-truncate-sql, and review-drop-column-sql
to ADVISORY_SMART_RULES so DROP TABLE, TRUNCATE TABLE, and DROP COLUMN in
the `sql` field are gated by human approval out-of-the-box, with no shield
or config required. The postgres shield correctly upgrades these from review
→ block since shield rules are inserted before advisory rules in getConfig().

Includes 7 new tests: 4 verifying advisory review fires with no config, 3
verifying the postgres shield overrides to block.
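
How rule ordering produces the review → block upgrade, as a sketch. The rule names come from the commit message; the regular expressions, the `SqlRule` shape, and `evaluateSql` are assumptions for illustration, not the project's rule definitions.

```typescript
type Verdict = "allow" | "review" | "block";

interface SqlRule {
  name: string;
  pattern: RegExp;
  verdict: Verdict;
}

// Assumed patterns; case-insensitive so "drop TABLE" also matches.
const advisoryRules: SqlRule[] = [
  { name: "review-drop-table-sql", pattern: /\bDROP\s+TABLE\b/i, verdict: "review" },
  { name: "review-truncate-sql", pattern: /\bTRUNCATE\s+TABLE\b/i, verdict: "review" },
  { name: "review-drop-column-sql", pattern: /\bDROP\s+COLUMN\b/i, verdict: "review" },
];

const postgresShieldRules: SqlRule[] = [
  { name: "shield:postgres:block-drop-table", pattern: /\bDROP\s+TABLE\b/i, verdict: "block" },
];

// Shield rules are placed before advisory rules, and the first match wins.
function evaluateSql(sql: string, shieldRules: SqlRule[]): Verdict {
  const ordered = shieldRules.concat(advisoryRules);
  for (const rule of ordered) {
    if (rule.pattern.test(sql)) return rule.verdict;
  }
  return "allow";
}
```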

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: shield set/unset — per-rule verdict overrides + config show

- `node9 shield set <shield> <rule> <verdict>` — override any shield rule's
  verdict without touching config.json. Stored in shields.json under an
  `overrides` key, applied at runtime in getConfig(). Accepts full rule
  name, short name, or operation name (e.g. "drop-table" resolves to
  "shield:postgres:block-drop-table").

- `node9 shield unset <shield> <rule>` — remove an override, restoring
  the shield default.

- `node9 shield status` — now shows each rule's verdict individually,
  with override annotations ("← overridden (was: block)").

- `node9 config show` — new command: full effective runtime config
  including active shields with per-rule verdicts, built-in rules,
  advisory rules, and dangerous words.
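
Rule-name resolution for `shield set` could work along these lines. This is a guess at the shape, not the project's resolveShieldRule: it assumes "short name" means the part after the `shield:<name>:` prefix and "operation name" means the short name without its leading verdict word.

```typescript
// Hypothetical resolver: returns the full rule name, or undefined if none match.
function resolveShieldRule(
  shield: string,
  ruleNames: readonly string[],
  query: string,
): string | undefined {
  const prefix = "shield:" + shield + ":";
  for (const full of ruleNames) {
    // "shield:postgres:block-drop-table" -> "block-drop-table"
    const short = full.indexOf(prefix) === 0 ? full.slice(prefix.length) : full;
    // "block-drop-table" -> "drop-table"
    const operation = short.replace(/^[a-z]+-/, "");
    // First match wins, so two rules sharing an operation suffix are ambiguous.
    if (query === full || query === short || query === operation) return full;
  }
  return undefined;
}
```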

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — allow verdict guard, null assertion, test reliability

- shield set allow now requires --force to prevent silent rule silencing;
  exits 1 with a clear warning and the exact re-run command otherwise
- Remove getShield(name)! non-null assertion in error branch
- Fix mockReturnValue → mockReturnValueOnce to prevent test state leak
- Add missing tests: shield set allow guard (integration), unset no-op,
  mixed-case SQL matching (DROP table, drop TABLE, TRUNCATE table)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — shield override security hardening

- Add isShieldVerdict() type guard; replace manual triple-comparison in
  CLI set command and remove unsafe `verdict as ShieldVerdict` cast
- Add validateOverrides() to sanitize shields.json on read — tampered
  disk content with non-ShieldVerdict values is silently dropped before
  reaching the policy engine
- Fix clearShieldOverride() to be a true no-op (skip disk write) when
  the rule has no existing override
- Add comment to resolveShieldRule() documenting first-match behavior
  for operation-suffix lookup to warn against future naming conflicts
- Tests: fix no-op assertion (assert not written), add isShieldVerdict
  suite, add schema validation tests for tampered overrides, add
  authorizeHeadless test for shield-overridden allow verdict
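
The type guard and the sanitising read, sketched together. The three verdict values and the nested `shield -> rule -> verdict` layout are assumptions inferred from the surrounding commits, not the actual shields.json schema.

```typescript
type ShieldVerdict = "allow" | "review" | "block";

function isShieldVerdict(value: unknown): value is ShieldVerdict {
  return value === "allow" || value === "review" || value === "block";
}

type Overrides = Record<string, Record<string, ShieldVerdict>>;

// Drops anything that is not a well-formed shield -> rule -> verdict entry.
function validateOverrides(raw: unknown): Overrides {
  const clean: Overrides = {};
  if (typeof raw !== "object" || raw === null) return clean;
  const shields = raw as Record<string, unknown>;
  for (const shield of Object.keys(shields)) {
    const rules = shields[shield];
    if (typeof rules !== "object" || rules === null) continue;
    const ruleMap = rules as Record<string, unknown>;
    for (const rule of Object.keys(ruleMap)) {
      const verdict = ruleMap[rule];
      // Tampered or unknown verdicts never reach the policy engine.
      if (!isShieldVerdict(verdict)) continue;
      const bucket = clean[shield] || {};
      bucket[rule] = verdict;
      clean[shield] = bucket;
    }
  }
  return clean;
}
```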

Note: issue #5 (shield status stdout vs stderr) cannot be fixed here —
the pre-commit hook enforces no new console.log in cli.ts to keep stdout
clean for the JSON-RPC/MCP hook code paths in the same file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — audit trail, tamper warning, trust boundary

- Export appendConfigAudit() from core.ts; call it from CLI when an allow
  override is written with --force so silenced rules appear in audit.log
- validateOverrides() now emits a stderr warning (with shield/rule detail)
  when an invalid verdict is dropped, making tampering visible to the user
- Add JSDoc to writeShieldOverride() documenting the trust boundary: it is
  a raw storage primitive with no allow guard; callers outside the CLI must
  validate rule names via resolveShieldRule() first; daemon does not expose
  this endpoint
- Tests: add stderr-warning test for tampered verdicts; add cache-
  invalidation test verifying _resetConfigCache() causes allow overrides
  to be re-read from disk (mock) on the next evaluatePolicy() call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: close remaining review gaps — first-match, allow-no-guard, TOCTOU

- Issue 5: add test proving resolveShieldRule first-match-wins behavior
  when two rules share an operation suffix; uses a temporary SHIELDS
  mutation (restored in finally) to simulate the ambiguous catalog case
- Issue 6: add explicit test documenting that writeShieldOverride accepts
  allow verdict without any guard — storage primitive contract, CLI is
  the gatekeeper
- Issue 8: add TOCTOU characterization test showing that concurrent
  writeShieldOverride calls with a stale read lose the first write; makes
  the known file-lock limitation explicit and regression-testable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: spawn daemon via process.execPath to fix ENOENT on Windows (#41)

spawn('node9', ...) fails on Windows because npm installs a .cmd shim,
not a bare executable. Node.js child_process.spawn without { shell: true }
cannot resolve .cmd/.ps1 wrappers.

Replace all three bare spawn('node9', ['daemon'], ...) call sites in
cli.ts with spawn(process.execPath, [process.argv[1], 'daemon'], ...),
consistent with the pattern already used in src/tui/tail.ts:
  - autoStartDaemonAndWait()
  - daemon --openui handler
  - daemon --background handler
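
The replacement call shape, reduced to a pure helper so it can be checked without spawning anything. `daemonSpawnCommand` is a hypothetical name; in cli.ts the two inputs would be `process.execPath` and `process.argv[1]`.

```typescript
interface SpawnCommand {
  command: string;
  args: string[];
}

// Build the command for re-launching the CLI in daemon mode.
// execPath:  absolute path of the running Node binary (process.execPath).
// cliScript: path of the CLI entry script (process.argv[1]).
function daemonSpawnCommand(execPath: string, cliScript: string): SpawnCommand {
  // Invoking the Node binary directly avoids resolving the npm .cmd shim,
  // which spawn() cannot do on Windows without { shell: true }.
  return { command: execPath, args: [cliScript, "daemon"] };
}
```

The result would be handed to `spawn(command, args, options)` in place of the bare `spawn('node9', ['daemon'], ...)` call.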

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(ci): regression guard + Windows CI for spawn fix (#41)

- Add spawn-windows.test.ts: two static source-guard tests that read
  cli.ts and assert (a) no bare spawn('node9'...) pattern exists and
  (b) exactly 3 spawn(process.execPath, ...) daemon call sites exist.
  Prevents the ENOENT regression from silently reappearing.

- Add .github/workflows/ci.yml: runs typecheck, lint, and npm test on
  both ubuntu-latest and windows-latest on every push/PR to main and dev.
  The Windows runner will catch any spawn('node9'...) regression
  immediately since it would throw ENOENT in integration tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step before tests — integration tests require dist/cli.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): remove NODE_ENV=test prefix from npm scripts — Windows compat

'NODE_ENV=test cmd' syntax is Unix-only and fails on Windows with
'not recognized as an internal or external command'.

Vitest sets NODE_ENV=test automatically when it is not already set
(and also sets process.env.VITEST), making the prefix redundant. Remove it
from test, test:watch, and test:ui scripts so they work on all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): use cross-platform path assertions in undo.test.ts

Replace hardcoded Unix path separators with path.join() and regex
/[/\\]\.git[/\\]/ so assertions pass on Windows CI.
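
Sketch (illustrative): the separator-agnostic regex from the message, wrapped in a hypothetical `containsGitSegment` helper to show which paths it accepts.

```typescript
// Matches a ".git" path segment delimited by either "/" or "\".
const GIT_SEGMENT = /[/\\]\.git[/\\]/;

// Hypothetical helper name; the tests in undo.test.ts may use the regex inline.
function containsGitSegment(p: string): boolean {
  return GIT_SEGMENT.test(p);
}
```

Because a separator is required on both sides, `.github` directories and names such as `x.git` do not match.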

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): cross-platform path and HOME fixes for Windows CI

setup.test.ts: replace hardcoded /mock/home/... constants with
path.join(os.homedir(), ...) so path comparisons match on Windows.
doctor.test.ts: set USERPROFILE=homeDir alongside HOME so
os.homedir() resolves the isolated test directory on Windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): Windows HOME/USERPROFILE and EBUSY fixes

mcp.integration.test.ts: add makeEnv() helper that sets both HOME
and USERPROFILE so spawned node9 processes resolve os.homedir() to
the isolated test directory on Windows. Add EBUSY guard in cleanupDir
for Windows temp file locking after spawnSync.

protect.test.ts: use path.join(os.homedir(), ...) for mock paths in
setPersistentDecision so existsSpy matches on Windows backslash paths.
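
Sketch (illustrative): a possible shape for a makeEnv() helper. The signature is an assumption; this version takes the base environment as a parameter so it stays pure, whereas the commit messages describe the real helpers as reading process.env.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical signature. Sets HOME and USERPROFILE to the same isolated
// directory: os.homedir() reads HOME on POSIX and USERPROFILE on Windows.
function makeEnv(base: Env, home: string, extra: Env = {}): Env {
  return { ...base, HOME: home, USERPROFILE: home, ...extra };
}
```

A call such as `makeEnv(process.env, tmpHome, { NODE9_TESTING: '1' })` would then be passed as the `env` option of spawnSync, so the child process resolves its home directory to the test directory on every platform.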

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): propagate HOME as USERPROFILE in check integration tests

runCheck/runCheckAsync now set USERPROFILE=HOME so spawned node9
processes resolve os.homedir() to the isolated test directory on
Windows. Apply the same fix to standalone spawnSync calls using
minimalEnv. Add EBUSY guard in cleanupHome for Windows temp locking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests,dlp): four Windows CI fixes

mcp.integration.test.ts: use list_directory instead of write_file for
the no-cwd backward-compat test, because write_file triggers git add -A on
os.tmpdir(), which can index thousands of files on Windows and time out
with ETIMEDOUT.

gemini_integration.test.ts: add path import; replace hardcoded
/mock/home/... paths with path.join(os.homedir(), ...) so existsSpy
matches on Windows backslash paths.

daemon.integration.test.ts: add USERPROFILE=tmpHome to daemon spawn
env so os.homedir() resolves to the isolated shields.json. Add EBUSY
guard in cleanupDir.

dlp.ts: broaden /etc/passwd|shadow|sudoers patterns to
^(?:[a-zA-Z]:)?\/etc\/... so they match Windows-normalized paths like
C:/etc/passwd in addition to Unix /etc/passwd.
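
Sketch (illustrative): how the optional drive-letter prefix behaves. The prefix `^(?:[a-zA-Z]:)?\/etc\/` comes from the message above; the tail `(?:passwd|shadow|sudoers)$` and the helper name are assumptions about the elided part, not the patterns in dlp.ts.

```typescript
// Assumed full pattern: optional "X:" drive prefix, then /etc/<sensitive file>.
const SENSITIVE_ETC = /^(?:[a-zA-Z]:)?\/etc\/(?:passwd|shadow|sudoers)$/;

// Expects a path already normalized to forward slashes.
function isSensitiveEtcPath(normalizedPath: string): boolean {
  return SENSITIVE_ETC.test(normalizedPath);
}
```

The anchor at the start keeps the broadened pattern from matching nested paths such as /home/etc/passwd.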

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): address code review findings

ci.yml: add format:check step and Node 22 to matrix (package.json
declares >=18 — both LTS versions should be covered).

check/mcp/daemon integration tests: add makeEnv() helpers for
consistent HOME+USERPROFILE isolation; add console.warn on EBUSY
so temp dir leaks are visible rather than silent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): enforce LF line endings so Prettier passes on Windows

Add endOfLine: lf to .prettierrc so Prettier always checks/writes LF
regardless of OS. Add .gitattributes with eol=lf so Git does not
convert line endings on Windows checkout. Without these, format:check
fails on every file on Windows CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): align makeEnv signatures and add dist verification

check.integration.test.ts: makeEnv now spreads process.env (same as
mcp and daemon helpers) so PATH, NODE_ENV=test (set by Vitest), and
other inherited vars reach spawned child processes. Standalone
spawnSync calls simplified to makeEnv(tmpHome, {NODE9_TESTING:'1'}).
Remove unused minimalEnv from shield describe block.

ci.yml: add Verify dist artifacts step after build to fail fast with
a clear message if dist/cli.js or dist/index.js are missing. Add
comment explaining NODE_ENV=test / NODE9_TESTING guard coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: interactive terminal approval via /dev/tty (SSE + [A]/[D])

Replaces the broken @inquirer/prompts stdin racer with a /dev/tty-based
approval prompt that works as a Claude Code PreToolUse subprocess:

- New src/ui/terminal-approval.ts: opens /dev/tty for raw keypress I/O,
  acquires CSRF token from daemon SSE, renders ANSI approval card, reads
  [A]/[D], posts decision via POST /decision/{id}. Handles abort (another
  racer won) with cursor/card cleanup and SIGTERM/exit guard.

- Daemon entry shared between browser (GET /wait) and terminal (POST /decision)
  racers: extract registerDaemonEntry() + waitForDaemonDecision() from the
  old askDaemon() so both racers operate on the same pending entry ID.

- POST /decision idempotency: first write wins; second call returns 409
  with the existing decision. Prevents race between browser and terminal
  racers from corrupting state.

- CSRF token emitted on every SSE connection (re-emit existing token, never
  regenerate). Terminal racer acquires it by opening /events and reading
  the first csrf event.

- approvalTimeoutSeconds user-facing config alias (converts to ms);
  raises default timeout from 30s to 120s. Daemon auto-deny timer and
  browser countdown now use the config value instead of a hardcoded constant.

- isTTYAvailable() probe: tries /dev/tty open(); disabled on Windows
  (native popup racer covers that path). NODE9_FORCE_TERMINAL_APPROVAL=1
  bypasses the probe for tmux/screen users.

- Integration tests: CSRF re-emit across two connections, POST /decision
  idempotency (both allow-first and deny-first cases).
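
Sketch (illustrative): an in-memory model of the first-write-wins rule for POST /decision. The types and the `recordDecision` name are hypothetical, and the 200 status for the first write is an assumption; only the 409 with the existing decision is stated above.

```typescript
type Decision = 'allow' | 'deny';

// Hypothetical minimal view of a pending entry.
interface PendingEntry {
  decision?: Decision;
}

interface DecisionResponse {
  status: 200 | 409;
  decision: Decision;
}

// First write wins: later writes get 409 plus the decision already stored.
function recordDecision(entry: PendingEntry, incoming: Decision): DecisionResponse {
  if (entry.decision !== undefined) {
    return { status: 409, decision: entry.decision };
  }
  entry.decision = incoming;
  return { status: 200, decision: incoming };
}
```

Returning the stored decision with the 409 lets the losing racer learn the outcome without a second request.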

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Smart Router — node9 tail as interactive approval terminal

Implements a multi-phase Smart Router architecture so `node9 tail` can
serve as a full approval channel alongside the browser dashboard and
native popup.

Phase 1 — Daemon capability tracking (daemon/index.ts):
- SseClient interface tracks { res, capabilities[] } per SSE connection
- /events parses ?capabilities=input from URL; stored on each client
- broadcast() updated to use client.res.write()
- hasInteractiveClient() exported — true when any tail session is live
- broadcast('add') now fires when terminal approver is enabled and an
  interactive client is connected, not only when browser is enabled
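
Sketch (illustrative): parsing the capabilities query parameter and checking for an interactive client. Comma-separated values, the placeholder base URL, and the names `parseCapabilities`, `SseClientInfo`, and `anyInteractiveClient` are assumptions; the real SseClient also holds the response object, and the exported hasInteractiveClient() reads daemon state rather than taking a list.

```typescript
// Parses ?capabilities=... from a request URL such as "/events?capabilities=input".
function parseCapabilities(requestUrl: string): string[] {
  // A base is required because request URLs are relative paths.
  const url = new URL(requestUrl, 'http://localhost');
  const raw = url.searchParams.get('capabilities');
  if (!raw) return [];
  return raw
    .split(',')
    .map((c) => c.trim())
    .filter((c) => c.length > 0);
}

// Simplified client record (the response stream is omitted).
interface SseClientInfo {
  capabilities: string[];
}

// True when at least one connected client declared the "input" capability.
function anyInteractiveClient(clients: SseClientInfo[]): boolean {
  return clients.some((c) => c.capabilities.indexOf('input') !== -1);
}
```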

Phase 2 — Interactive approvals in tail (tui/tail.ts):
- Connects with ?capabilities=input so daemon identifies it as interactive
- Captures CSRF token from the 'csrf' SSE event
- Handles init.requests (approvals pending before tail connected)
- Handles add/remove SSE events; maintains an approval queue
- Shows one ANSI card at a time ([A] Allow / [D] Deny) using
  tty.ReadStream raw-mode keypress on fd 0
- POSTs decisions via /decision/{id} with source:'terminal'; 409 is non-error
- Cards clear themselves; next queued request shown automatically

Phase 3 — Racer 3 widened (core.ts):
- Racer 3 guard changed from approvers.browser to
  (approvers.browser || approvers.terminal) so tail participates in the
  race via the same waitForDaemonDecision mechanism as the browser
- Guidance printed to stderr when browser is off:
  "Run `node9 tail` in another terminal to approve."

Phase 4 — node9 watch command (cli.ts):
- New `watch <command> [args...]` starts the daemon with NODE9_WATCH_MODE=1
  (no idle timeout), prints a tip about node9 tail, then runs the
  wrapped command via spawnSync

Decision source tracking (all layers):
- POST /decision now accepts optional source field ('browser'|'terminal')
- Daemon stores decisionSource on PendingEntry; GET /wait returns it
- waitForDaemonDecision returns { decision, source }
- Racer 3 label uses actual source instead of guessing from config:
  "User Decision (Terminal (node9 tail))" vs "User Decision (Browser Dashboard)"
- Browser UI sends source:'browser'; tail sends source:'terminal'

Tests:
- daemon.integration.test.ts: 3 new tests for source tracking round-trip
  (terminal, browser, and omitted source)
- spawn-windows.test.ts: updated count from 3 to 4 spawn call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: enable /dev/tty approval card in Claude terminal (hook path)

The check command was passing allowTerminalFallback=false to
authorizeHeadless, which disabled Racer 4 (/dev/tty) in the hook path.
This meant the approval card only appeared in the node9 tail terminal,
requiring the user to switch focus to respond.

Change both call sites (initial + daemon-retry) to true so Racer 4 runs
alongside Racer 3. The [A]/[D] card now appears in the Claude terminal
as well — the user can respond from either terminal, whichever has focus.
The 409 idempotency already handles the race correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent background authorizeHeadless from overwriting POST /decision

When POST /decision arrives before GET /wait connects, it sets
earlyDecision on the PendingEntry. The background authorizeHeadless
call (which runs concurrently) could then overwrite that decision in
its .then() handler — visible as the idempotency test getting
'allow' back instead of the posted 'deny'.

Guard: after noApprovalMechanism check, return early if earlyDecision
is already set. First write wins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route sendBlock terminal output to /dev/tty instead of stderr

Claude Code treats any stderr output from a PreToolUse hook as a hook
error and fails open — the tool proceeds even when the hook writes a
valid permissionDecision:deny JSON to stdout. This meant git push and
other blocked commands were silently allowed through.

Fix: replace all console.error calls in the block/deny path with
writes to /dev/tty, an out-of-band channel that bypasses Claude Code's
stderr pipe monitoring. /dev/tty failures are caught silently so CI
and non-interactive environments are unaffected.

Add a writeTty() helper in core.ts used for all status messages in
the hook execution path (cloud error, waiting-for-approval banners,
cloud result). Update two integration tests that previously asserted
block messages appeared on stderr — they now assert stderr is empty,
which is the regression guard for this bug.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: don't auto-resolve daemon entries in audit mode

In audit mode, background authorizeHeadless resolves immediately with
checkedBy:'audit'. The .then() handler was setting earlyDecision='allow'
before POST /decision could arrive from browser/tail, causing subsequent
POST /decision calls to get 409 and GET /wait to return 'allow' regardless
of what the user posted.

Audit mode means the hook auto-approves — it doesn't mean the daemon
dashboard should also auto-resolve. Leave the entry alive so browser/tail
can still interact with it (or the auto-deny timer fires).

Fixes source-tracking integration test failures on CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: close /dev/tty fd in finally block to prevent leak on write error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove Racer 4 (/dev/tty card in Claude terminal)

Racer 4 interrupted the AI's own terminal with an approval prompt,
which is wrong on multiple levels:
- The AI terminal belongs to the AI agent, not the human approver
- Different AI clients (Gemini CLI, Cursor, etc.) handle terminals
  differently — /dev/tty tricks are fragile across environments
- It created duplicate prompts when node9 tail was also running

Approval channels should all be out-of-band from the AI terminal:
  1. Cloud/SaaS (Slack, mission control)
  2. Native OS popup
  3. Browser dashboard
  4. node9 tail (dedicated approval terminal)

Remove: Racer 4 block in core.ts, allowTerminalFallback parameter
from authorizeHeadless/_authorizeHeadlessCore and all callers,
isTTYAvailable/askTerminalApproval imports, terminal-approval.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: make hook completely silent — remove all writeTty calls from core.ts

The hook must produce zero terminal output in the Claude terminal.
All writeTty status messages (shadow mode, cloud handshake failure,
waiting for approval, approved/denied via cloud) have been removed.
Also removed the now-unused chalk import and writeTty helper function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source allowlist, CSRF 403 tests, watch error handling

- daemon/index.ts: validate POST /decision source field against allowlist
  ('terminal' | 'browser' | 'native') — silently drop invalid values to
  prevent audit log injection
- daemon.integration.test.ts: add CSRF 403 test (missing token), CSRF 403
  test (wrong token), and invalid source value test — the three most
  important negative tests flagged by code review
- cli.ts: check result.error in node9 watch so ENOENT exits non-zero
  instead of silently exiting 0
- test helper: use fixed string 'echo register-label' instead of
  interpolated echo ${label} (shell injection hygiene in test code)
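
Sketch (illustrative): allowlist validation for the source field. The three allowed values come from the message above; the `sanitizeSource` name and the choice to return undefined for dropped values are assumptions.

```typescript
type DecisionSource = 'terminal' | 'browser' | 'native';

const VALID_SOURCES: readonly DecisionSource[] = ['terminal', 'browser', 'native'];

// Returns the source when it is allowlisted; anything else (wrong type,
// unknown string) is silently dropped by returning undefined.
function sanitizeSource(value: unknown): DecisionSource | undefined {
  if (typeof value !== 'string') return undefined;
  const index = VALID_SOURCES.indexOf(value as DecisionSource);
  return index === -1 ? undefined : VALID_SOURCES[index];
}
```

Returning the constant from the allowlist rather than the caller's string means no request-supplied text can reach the audit log.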

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove stderr write from askNativePopup; drop sendDesktopNotification

- native.ts: process.stderr.write in askNativePopup caused Claude Code to
  treat the hook as an error and fail open — removed entirely
- core.ts: sendDesktopNotification called notify-send which routes through
  Firefox on Linux (D-Bus handler), causing spurious browser popups —
  removed the audit-mode notification call and unused import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass cwd through hook→daemon so project config controls browser open

Root cause: daemon called getConfig() without cwd, reading the global
config. If ~/.node9/node9.config.json doesn't exist, approvers default
to true — so browser:false in a project config was silently ignored,
causing the daemon to open Firefox on every pending approval.

Fix:
- cli.ts: pass cwd from hook payload into authorizeHeadless options
- core.ts: propagate cwd through _authorizeHeadlessCore → registerDaemonEntry
  → POST /check body; use getConfig(options.cwd) so project config is read
- daemon/index.ts: extract cwd from POST /check, call getConfig(cwd)
  for browserEnabled/terminalEnabled checks
- native.ts: remove process.stderr.write from askNativePopup (fail-open bug)
- core.ts: remove sendDesktopNotification (notify-send routes through Firefox
  on Linux via D-Bus, causing spurious browser notifications)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always broadcast 'add' when terminalEnabled — restore tail visibility

After the cwd fix, browserEnabled correctly became false when browser:false
is set in project config. But the broadcast condition gated on
hasInteractiveClient(), which returns false if tail isn't connected at the
exact moment the check arrives — silently dropping entries from tail.

Fix: broadcast whenever browserEnabled OR terminalEnabled, regardless of
client connection state. Tail sees pending entries via the SSE stream's
initial state when it connects, so timing of connection doesn't matter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: hybrid security model — local UI always wins the race

- Remove isRemoteLocked: terminal/browser/native racers always participate
  even when approvers.cloud is enabled; cloud is now audit-only unless it
  responds first (headless VM fallback)
- Add decisionSource field to AuthResult so resolveNode9SaaS can report
  which channel decided (native/terminal/browser) as decidedBy in the PATCH
- Fix resolveNode9SaaS: log errors to hook-debug.log instead of silent catch
- Fix tail [A]/[D] keypresses: switch from raw 'data' buffer to readline
  emitKeypressEvents + 'keypress' events — fixes unresponsive cards
- Fix tail card clear: SAVE/RESTORE cursor instead of fragile MOVE_UP(n)
- Add cancelActiveCard so 'remove' SSE event properly dismisses active card
- Fix daemon duplicate browser tab: browserOpened flag + NODE9_BROWSER_OPENED
  env so auto-started daemon and node9 tail don't both open a tab
- Fix slackDelegated: skip background authorizeHeadless to prevent duplicate
  cloud request that never resolves in Mission Control
- Add interactive field to SSE 'add' event so browser-only configs don't
  render a terminal card
- Add dev:tail script that parses JSON PID file correctly
- Add req.on('close') cleanup for abandoned long-poll entries
- Add regression tests for all three bugs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear CI env var in test to unblock native racer on GitHub Actions

Also make the poll fetch mock respond to AbortSignal so the cloud poll
racer exits cleanly when native wins, preventing test timeout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source injection tests, dev:tail safety, CI env guard

- Add explicit source boundary tests: null/number/object are all rejected
  by the VALID_SOURCES allowlist (implementation was already correct)
- Replace kill $(...) shell expansion in dev:tail with process.kill() inside
  Node.js, removing the $() command-substitution risk if the pid file were crafted
- Add afterEach safety net in core.test.ts to restore VITEST/CI/NODE_ENV
  in case the test crashes before the try/finally block restores them
- Increase slackDelegated timing wait from 200ms to 500ms for slower CI
- Fix section numbering gap: 10 → 11 was left after removing a test (now 10)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use early return in it.each — Vitest does not pass context to it.each callbacks

Context injection via { skip } works in plain it() but not in it.each(),
where the third argument is undefined. Switch to early return, which is
equivalent since the entire describe block skips when portWasFree is false.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): correct YAML indentation in ci.yml — job properties were siblings not children

name/runs-on/strategy/steps were indented 2 spaces (sibling to `test:`)
instead of 4 spaces (properties of the `test:` job). GitHub Actions was
ignoring the custom name template, so checks were reported without the
Node version suffix and the required branch-protection check
"CI / Test (ubuntu-latest, Node 20)" was stuck as "Expected" forever.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add 15s timeout to daemon beforeAll hooks — prevents CI timeout

waitForDaemon(6s) + readSseStream(3s) = 9s minimum; the default Vitest
hookTimeout of 10s is too tight on slow CI runners (Ubuntu, Windows).
All three daemon describe-block beforeAll hooks now declare an explicit
15_000ms timeout to give CI sufficient headroom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add daemonProc.kill() fallback in afterAll cleanup

If `node9 daemon stop` fails or times out, the spawned daemon process
would leak. Added daemonProc?.kill() as a defensive fallback after
spawnSync in all three daemon describe-block afterAll hooks.

The CSRF 403 tests (missing/wrong token) already exist at lines 574-598
and were flagged as absent only because the bot's diff was truncated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): address code review — vi.stubEnv, runtime shape checks, abandon-timer comment

- core.test.ts: replace manual env save/delete/restore with vi.stubEnv +
  vi.unstubAllEnvs() in afterEach. Eliminates the fragile try/finally and
  the risk of coercing undefined to the string "undefined". Adds a KEEP IN
  SYNC comment so future isTestEnv additions are caught immediately.

- daemon.integration.test.ts: replace unchecked `as { ... }` casts in
  idempotency tests with `unknown` + toMatchObject — gives a clear failure
  message if the response shape is wrong instead of silently passing.

- daemon.integration.test.ts: add comment explaining why idempotency tests
  do not need a /wait consumer — the abandon timer only fires when an SSE
  connection closes with pending items; no SSE client connects during
  these tests so entries are safe from eviction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): guard daemonProc.kill() with exitCode check — avoid spurious SIGTERM

Calling daemonProc.kill() unconditionally after a successful `daemon stop`
sends SIGTERM to an already-dead process, which can produce a spurious error
log on some platforms. Only kill if exitCode === null (process still running).
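
Sketch (illustrative): the exitCode guard as a small helper. `KillableProcess` and `killIfRunning` are hypothetical names; the interface lists only the ChildProcess members the guard needs.

```typescript
// Minimal structural type for the parts of ChildProcess used here.
interface KillableProcess {
  exitCode: number | null;
  kill(): boolean;
}

// Sends a kill signal only when the process has not exited yet.
// Returns false when nothing was sent.
function killIfRunning(proc: KillableProcess | undefined): boolean {
  if (!proc || proc.exitCode !== null) return false;
  return proc.kill();
}
```

In an afterAll hook this would run after the `daemon stop` call, so the fallback only fires when the graceful stop did not take effect.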

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add @vitest/coverage-v8 — baseline coverage report (PR #0)

Installs @vitest/coverage-v8 and configures coverage in vitest.config.mts.
Adds `npm run test:coverage` script.

Baseline (instrumentable files only — cli.ts and daemon/index.ts are
subprocess-only and cannot be instrumented by v8):

  File     Stmts    Branches
  Overall  67.68%   58.74%
  core.ts  62.02%   54.13%   ← primary refactor target
  undo.ts  87.01%   80.00%
  shields  97.46%   94.64%
  dlp.ts   94.82%   92.85%
  setup    93.67%   80.92%

This baseline will be used to verify coverage improves (or holds) after
each incremental refactor PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr1): extract src/audit/ and src/config/ from core.ts

Move audit helpers (redactSecrets, appendToLog, appendHookDebug,
appendLocalAudit, appendConfigAudit) to src/audit/index.ts and
move all config types, constants, and loading logic (Config,
SmartRule, DANGEROUS_WORDS, DEFAULT_CONFIG, getConfig, getCredentials,
getGlobalSettings, hasSlack, listCredentialProfiles) to
src/config/index.ts.

core.ts kept as barrel re-exporting from the new modules so all
existing importers (cli.ts, daemon/index.ts, tests) are unchanged.
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: remove trailing blank lines in core.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr2): extract src/policy/ and src/utils/regex from core.ts

Move the entire policy engine to src/policy/index.ts:
  evaluatePolicy, explainPolicy, shouldSnapshot, evaluateSmartConditions,
  checkDangerousSql, isIgnoredTool, matchesPattern and all private
  helpers (tokenize, getNestedValue, extractShellCommand, analyzeShellCommand).

Move ReDoS-safe regex utilities to src/utils/regex.ts:
  validateRegex, getCompiledRegex — no deps on config or policy,
  consumed by both policy/ and cli.ts via core.ts barrel.

core.ts is now ~300 lines (auth + daemon I/O only).
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: restore dev branch to push trigger

CI should run on direct pushes to dev (merge commits, dependency
bumps, etc.), not just on PRs. Flagged by two independent code
review passes on the coverage PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add timeouts to execSync and spawnSync to prevent CI hangs

- doctor command: add timeout:3000 to execSync('which node9') and
  execSync('git --version') — on slow CI machines these can block
  indefinitely and cause the 5000ms vitest test timeout to fire
- runDoctor test helper: add timeout:15000 to spawnSync so the subprocess
  has enough headroom on slow CI without hitting the vitest timeout
- removefrom test loop: increase spawnSync timeout 5000→15000 and add
  result.error assertion for better failure diagnostics on CI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract src/auth/ from core.ts

Split the authorization race engine out of core.ts into 4 focused modules:
- auth/state.ts  — pause, trust sessions, persistent decisions
- auth/daemon.ts — daemon PID check, entry registration, long-polling
- auth/cloud.ts  — SaaS handshake, poller, resolver, local-allow audit
- auth/orchestrator.ts — multi-channel race engine (authorizeHeadless)

core.ts is now a 40-line backwards-compat barrel. 509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — coverage thresholds + undici vuln

- vitest.config.mts: add coverage thresholds at current baseline (68%
  stmts, 58% branches, 66% funcs, 70% lines) so CI blocks regressions.
  Add json-summary reporter for CI integration. Exclude core.ts (barrel,
  no executable code) and ui/native.ts (OS UI, untestable in CI).
- package.json: pin undici to ^7.24.0 via overrides to resolve 6 high
  severity vulnerabilities in dev deps (@semantic-release, @actions).
  Remaining 7 vulns are in npm-bundled packages (not fixable without
  upgrading npm itself) and dev-only tooling (eslint, handlebars).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: enforce coverage thresholds in CI pipeline

Add coverage step to CI workflow that runs vitest --coverage on
ubuntu/Node 22 only (avoids matrix cost duplication). Thresholds
configured in vitest.config.mts will fail the build if coverage drops
below baseline, closing the gap flagged in code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: fix double test run — merge coverage into single test step

Replace the two-step (npm test + npm run test:coverage) pattern with a
single conditional: ubuntu/Node 22 runs test:coverage (enforces
thresholds), all other matrix cells run npm test. No behaviour change,
half the execution time on the primary matrix cell.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract proxy, negotiation, and duration from cli.ts

- src/proxy/index.ts — runProxy() MCP/JSON-RPC stdio interception
- src/policy/negotiation.ts — buildNegotiationMessage() AI block messages
- src/utils/duration.ts — parseDuration() human duration string parser
- cli.ts: 2088 → 1870 lines, now imports from focused modules

509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract shield, check, log commands into focused modules

Moves registerShieldCommand, registerConfigShowCommand, registerCheckCommand,
and registerLogCommand into src/cli/commands/. Extracts autoStartDaemonAndWait
and openBrowserLocal into src/cli/daemon-starter.ts.

cli.ts drops from ~1870 to ~1120 lines. Unused imports removed. Spawn
Windows regression test updated to cover the moved autoStartDaemonAndWait
call site in daemon-starter.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): use string comparison for matrix.node and document coverage intent

GHA expression matrix values are strings; matrix.node == 22 (integer) silently
fails, so coverage never ran on any cell. Fixed to matrix.node == '22'.

Added comments to ci.yml explaining the intentional single-cell threshold
enforcement (branch protection must require the ubuntu/Node 22 job), and
to vitest.config.mts explaining the baseline date and target trajectory.

Also confirmed: npm ls undici shows 7.24.6 everywhere — no conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): replace fragile matrix ternary with dedicated coverage job

Removes the matrix.node == '22' ternary from the test matrix. Coverage now
runs in a standalone 'coverage' job (ubuntu/Node 22 only) that can be
required by name in branch protection — no risk of the job name drifting
or the selector silently failing.

Also adds a comment to tsup.config.ts documenting why devDependency coverage
tooling (@vitest/coverage-v8, @rolldown/*) cannot leak into the production
bundle (tree-shaking — nothing in src/ imports them).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step and NODE9_TESTING=1 to coverage job; bump to v1.2.0

Coverage job was missing npm run build, causing integration tests to fail
with "dist/cli.js not found". Also adds NODE9_TESTING=1 env var to prevent
native popup dialogs and daemon auto-start during coverage runs in CI.

Version bumped to 1.2.0 to reflect the completed modular refactor
(core.ts + cli.ts split into focused single-responsibility modules).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add npm audit --omit=dev --audit-level=high to test job

Audits production deps on every CI run. Scoped to --omit=dev because
known CVEs in flatted (eslint chain) and handlebars (semantic-release chain)
are devDep-only and never ship in the production bundle. Production tree
currently shows 0 vulnerabilities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract doctor, audit, status, daemon, watch, undo commands

Moves 6 remaining large commands into src/cli/commands/:
  doctor.ts    — health check (165 lines, owns pass/fail/warn helpers)
  audit.ts     — audit log viewer with formatRelativeTime
  status.ts    — current mode/policy/pause display
  daemon-cmd.ts — daemon start/stop/openui/background/watch
  watch.ts     — watch mode subprocess runner
  undo.ts      — snapshot diff + revert UI

cli.ts: 1,141 → 582 lines. Unused imports (execSync, spawnSync, undo funcs,
getCredentials, DAEMON_PORT/HOST) removed. spawn-windows regression test
updated to cover the new module locations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(daemon): split 1026-line index.ts into state, server, barrel

daemon/state.ts  (360 lines) — all shared mutable state, types, utility
  functions, SSE broadcast, Flight Recorder Unix socket, and the
  abandonPending / hadBrowserClient / abandonTimer accessors needed to
  avoid direct ES module let-export mutation across file boundaries.

daemon/server.ts (668 lines) — startDaemon() HTTP server and all route
  handlers (/check, /wait, /decision, /events, /settings, /shields, etc.).
  Imports everything it needs from state.ts; no circular dependencies.

daemon/index.ts  (58 lines) — thin barrel: re-exports public API
  (startDaemon, stopDaemon, daemonStatus, DAEMON_PORT, DAEMON_HOST,
  DAEMON_PID_FILE, DECISIONS_FILE, AUDIT_LOG_FILE, hasInteractiveClient).

Also changes two startup console.log calls to console.error (stdout must
stay clean for MCP/JSON-RPC per CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): handle ENOTEMPTY in cleanupDir on Windows CI

Windows creates system junctions (AppData\Local\Microsoft\Windows)
inside any directory set as USERPROFILE, making rmSync fail with
ENOTEMPTY even after recursive deletion. These junctions are harmless
to leak from a temp dir; treat them the same as EBUSY.
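
Sketch (illustrative): a cleanup helper that tolerates EBUSY and ENOTEMPTY. The helper shapes are assumptions; the removal and warning functions are injected so the sketch stays self-contained, where a real helper would call fs.rmSync and console.warn.

```typescript
// Error codes treated as harmless temp-dir leaks on Windows CI.
const IGNORABLE_CLEANUP_CODES = ['EBUSY', 'ENOTEMPTY'];

function isIgnorableCleanupError(err: unknown): boolean {
  if (typeof err !== 'object' || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === 'string' && IGNORABLE_CLEANUP_CODES.indexOf(code) !== -1;
}

// Hypothetical signature: `remove` and `warn` are parameters for testability.
function cleanupDir(
  dir: string,
  remove: (dir: string) => void,
  warn: (msg: string) => void
): void {
  try {
    remove(dir);
  } catch (err) {
    // Anything other than a known-harmless code is a real failure.
    if (!isIgnorableCleanupError(err)) throw err;
    warn(`cleanupDir: leaving ${dir} behind (${(err as { code: string }).code})`);
  }
}
```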

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add NODE9_TESTING=1 to test job for consistency with coverage

Without it, spawned child processes in the test matrix could trigger
native popups or daemon auto-start. Matches the coverage job which
already set this env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): tighten npm audit level from high to moderate

Node9 sits on the critical path of every agent tool call — a
moderate-severity prod vuln (e.g. regex DoS in a request parser)
is still exploitable in this context. 0 vulns at moderate level
confirmed before raising the bar.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add missing coverage for auth/state, timeout racer, and daemon unknown-ID

- auth-state.test.ts (new): 18 tests covering checkPause (all branches
  including expired file auto-delete and indefinite expiry), pauseNode9,
  resumeNode9, getActiveTrustSession (wildcard, prune, malformed JSON),
  writeTrustSession (create, replace, prune expired entries)
- core.test.ts: timeout racer test — approvalTimeoutMs:50 fires before any
  other channel, returns approved:false with blockedBy:'timeout'
- daemon.integration.test.ts: POST /decision with unknown UUID → 404
- vitest.config.mts: raise thresholds to match new baseline
  (statements 68→70, branches 58→60, functions 66→70, lines 70→71)

auth/state.ts coverage: 30% → 96% statements, 28% → 89% branches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin @vitest/coverage-v8 to exact version 4.1.2

RC transitive deps (@rolldown/binding-* at 1.0.0-rc.12) are pulled in
via coverage-v8. Pinning prevents silent drift to a newer RC that could
change instrumentation behaviour or introduce new RC-stage transitive deps.

Also verified: obug@2.1.1 is a legitimate MIT-licensed debug utility
from the @vitest/sxzz ecosystem — not a typosquat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add needs: [test] to coverage job

Prevents coverage from producing a misleading green check when the test
matrix fails. Coverage now only runs after all test jobs pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): raise Vitest timeouts for slow CI tests

- cloud denies: approvalTimeoutMs:3000 means the check process runs ~3s
  before the mock cloud responds; default 5s Vitest limit was too tight.
  Raised to 15s.
- doctor 'All checks passed': spawns a subprocess that runs `ss` for
  port detection — slow on CI runners. Raised to 20s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin vitest to 4.1.2 to match @vitest/coverage-v8

Both packages must stay in sync — a peer version mismatch causes silent
instrumentation failures. Pinning both to the same exact version prevents
drift when ^ would otherwise allow vitest to bump independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MCP gateway — transparent stdio proxy for any MCP server

Adds `node9 mcp-gateway --upstream <cmd>` which wraps any MCP server
as a transparent stdio proxy. Every tools/call is intercepted and run
through the full authorization engine (DLP, smart rules, shields,
human approval) before being forwarded to the upstream server.

Key implementation details:
- Deferred exit: authPending flag prevents process.exit() while auth
  is in flight, so blocked-tool responses are always flushed first
- Deferred stdin end: mirrors the same pattern for child.stdin.end()
  so approved messages are written before stdin is closed
- Approved writes happen inside the try block, before finally runs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address code review feedback

- Explicit ignoredTools in allowed-tool test (no implicit default dep)
- Assert result.status === 0 in all success-case tests (null = timeout)
- Throw result.error in runGateway helper so timeout-killed process fails
- Catch ENOTEMPTY in cleanupDir alongside EBUSY (Windows junctions)
- Document parseCommandString is shell-split only, not shell execution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): id validation, resilience tests, review fixes

- Validate JSON-RPC id is string|number|null; return -32600 for object/array ids
- Add resilience tests: invalid upstream JSON forwarded as-is, upstream crash
- Fix runGateway() to accept optional upstreamScript param
- Add status assertions to all blocked-tool tests
- Document parseCommandString safety in mcp-gateway source
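
The id rule above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the names `isValidJsonRpcId` and `rejectInvalidId` are hypothetical, and only the accepted types (string, number, null) and the -32600 error code come from this commit message.

```typescript
// Hypothetical helper names; only the accepted id types and the -32600 code
// are taken from the commit message.
type JsonRpcId = string | number | null;

function isValidJsonRpcId(id: unknown): id is JsonRpcId {
  return id === null || typeof id === "string" || typeof id === "number";
}

interface JsonRpcErrorResponse {
  jsonrpc: "2.0";
  id: null;
  error: { code: number; message: string };
}

// Returns an error response for an unusable id, or null when the message may
// proceed. A message without an id is a notification and is left alone.
function rejectInvalidId(message: { id?: unknown }): JsonRpcErrorResponse | null {
  if (!("id" in message) || isValidJsonRpcId(message.id)) return null;
  return {
    jsonrpc: "2.0",
    id: null, // the original id cannot be echoed back safely
    error: { code: -32600, message: "Invalid Request: id must be string, number, or null" },
  };
}
```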

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): fix shell tokenizer to handle quoted paths with spaces

Replace execa's parseCommandString (which did not handle shell quoting)
with a custom tokenizer that strips double-quotes and respects backslash
escapes. Adds 4 review-driven test improvements: mock upstream silently
drops notifications, runGateway guards killed-by-signal status, shadowed
variable renamed, DLP test builds credential at runtime, upstream path
with spaces test.
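
A tokenizer with the two properties named above (double quotes stripped, backslash escapes respected) might look roughly like this. It is a sketch under assumptions: the function name `tokenizeCommand` is hypothetical, and single quotes are deliberately left untreated because the commit message only mentions double quotes.

```typescript
// Splits a command string into argv-style tokens. No shell is involved:
// this only splits text, it never executes anything.
function tokenizeCommand(command: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inQuotes = false;
  let started = false; // true once the current token has begun (so "" is a token)

  for (let i = 0; i < command.length; i++) {
    const ch = command.charAt(i);
    if (ch === "\\" && i + 1 < command.length) {
      // Backslash escape: take the next character literally.
      current += command.charAt(i + 1);
      i++;
      started = true;
    } else if (ch === '"') {
      // Toggle quoted mode; the quote character itself is dropped.
      inQuotes = !inQuotes;
      started = true;
    } else if (!inQuotes && /\s/.test(ch)) {
      // Unquoted whitespace ends the current token, if there is one.
      if (started) {
        tokens.push(current);
        current = "";
        started = false;
      }
    } else {
      current += ch;
      started = true;
    }
  }
  if (started) tokens.push(current);
  return tokens;
}
```

With this sketch, `node "/path with spaces/server.js" --flag` splits into three tokens, the second being the unquoted path.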

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #4 — hermetic env, null-status guard, new tests

- Filter all NODE9_* env vars in runGateway so local developer config
  (NODE9_MODE, NODE9_API_KEY, NODE9_PAUSED) cannot pollute test isolation
- Fix status===null guard: no longer requires result.signal to also be set;
  a killed/timed-out process always throws, preventing silent false passes
- Extract parseResponses() helper to eliminate repeated JSON.parse cast pattern
- Reduce default runGateway timeout from 8s → 5s for fast-path tests
- Add test: malformed (non-JSON) input is forwarded to upstream unchanged
- Add test: tools/call notification (no id) is forwarded without gateway response
- Add comments: mock upstream uses unbuffered stdout.write; /tmp/test.txt path
  is safe because mock never reads disk; NODE9_TESTING only disables UI approvers
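
The NODE9_* filtering can be sketched as a small pure function. The name `hermeticEnv` and the `overrides` parameter are hypothetical; it is an assumption here that the test helper re-adds NODE9_TESTING after stripping the developer's own variables.

```typescript
// Builds a child-process environment with every NODE9_* variable removed,
// then applies explicit overrides chosen by the test.
function hermeticEnv(
  env: Record<string, string | undefined>,
  overrides: Record<string, string> = {}
): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    if (value === undefined) continue; // unset variables are skipped
    if (key.startsWith("NODE9_")) continue; // drop local developer config
    clean[key] = value;
  }
  return { ...clean, ...overrides };
}
```

A call such as `hermeticEnv(process.env, { NODE9_TESTING: "1" })` would then pass through PATH and friends while ignoring any NODE9_MODE or NODE9_API_KEY set in the developer's shell.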

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #5 — isolation, typing, timeouts, README note

- afterAll: log cleanup failures to stderr instead of silently swallowing them
- runGateway: document PATH is safe (all spawns use absolute paths); expand
  NODE9_TESTING comment to reference exact source location of what it suppresses
- Replace /tmp/test.txt with /nonexistent/node9-test-only so intent is unambiguous
- Tighten blocked-tool test timeout: 5000ms → 2000ms (approvalTimeoutMs=100ms,
  so a hung auth engine now surfaces as a clear failure rather than a late pass)
- GatewayResponse.result: add explicit tools/ok fields so Array.isArray assertion
  has accurate static type information
- README: add note clarifying --upstream takes a single command string (tokenizer
  splits it); explain double-quoted paths for paths with spaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #6 — diagnostics, type safety, error handling

- Timeout error now includes partial stdout/stderr so hung gateway failures
  are diagnosable instead of silently discarding the output buffer
- Mock upstream catch block writes to stderr instead of empty catch {} so
  JSON-RPC parse errors surface in test output rather than causing a hang
- parseResponses wraps JSON.parse in try/catch and rethrows with the
  offending line, replacing cryptic map-thrown errors with useful context
- GatewayResponse.result: replace redundant Record<string,unknown> & {.…
node9ai added a commit that referenced this pull request Mar 28, 2026
* fix: address code review — Slack regex bound, remove redundant parser, notMatchesGlob consistency, applyUndo empty-set guard

- dlp: cap Slack token regex at {1,100} to prevent unbounded scan on crafted input
- core: remove 40-line manual paren/bracket parser from validateRegex — redundant
  with the final new RegExp() compile check, which catches the same errors more cleanly
- core: fix notMatchesGlob — absent field returns true (vacuously not matching),
  consistent with notContains; missing cond.value still fails closed
- undo: guard applyUndo against ls-tree failure returning empty set, which would
  cause every file in the working tree to be deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — compile-before-saferegex, Slack lower bound, ls-tree guard logging, missing tests

- core: move new RegExp() compile check BEFORE safe-regex2 so structurally invalid
  patterns (unbalanced parens/brackets) are rejected before reaching NFA analysis
- dlp: tighten Slack token lower bound from {1,100} to {20,100} to reduce false
  positives on short, truncated token fragments
- undo: add NODE9_DEBUG log before early return in applyUndo ls-tree guard for
  observability into silent failures
- test(core): add 'structurally malformed patterns still rejected' regression test
  confirming compile-check order after manual parser removal
- test(core): add notMatchesGlob absent-field test with security comment documenting
  the vacuous-true behaviour and how to guard against it
- test(undo): add applyUndo ls-tree non-zero exit test confirming no files deleted
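
The compile-first ordering can be sketched as below. This is not the real validateRegex: the `isSafe` callback is a stand-in for the safe-regex2 check (its actual call signature is not shown in this log), and the returned error strings are invented for illustration.

```typescript
// Returns an error message, or null when the pattern is acceptable.
// `isSafe` stands in for the safe-regex2 analysis (assumption).
function validateRegex(pattern: string, isSafe: (pattern: string) => boolean): string | null {
  // Step 1: structural validity. Unbalanced parens or brackets fail here,
  // before the pattern is handed to the safety analysis.
  try {
    new RegExp(pattern);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return `invalid regex: ${reason}`;
  }
  // Step 2: a pattern can compile and still be dangerous, e.g. (a+)+.
  if (!isSafe(pattern)) {
    return "unsafe regex: possible catastrophic backtracking";
  }
  return null;
}
```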

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): swap spawnResult args order — stdout first, status second

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in undo.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: revert notMatchesGlob to fail-closed, warn on ls-tree failure, document empty-stdout gap

- core: revert notMatchesGlob absent-field to fail-closed (false) — an attacker
  omitting a field must not satisfy a notMatchesGlob allow rule; rule authors
  needing pass-when-absent must pair with an explicit 'notExists' condition
- undo: log ls-tree failure unconditionally to stderr (not just NODE9_DEBUG) since
  this is an unexpected git error, not normal flow — silent false is undebuggable
- dlp: add comment on Slack token bound rationale (real tokens ~50–80 chars)
- test(core): fix notMatchesGlob fail-closed test — use delete_file (dangerous word)
  so the allow rule actually matters; write was allowed by default regardless
- test(undo): add test documenting the known gap where ls-tree exits 0 with empty
  stdout still produces an empty snapshotFiles set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(undo): guard against ls-tree status-0 empty-stdout mass-delete footgun

Add snapshotFiles.size === 0 check after the non-zero exit guard. When ls-tree
exits 0 but produces no output, snapshotFiles would be empty and every tracked
file in the working tree would be deleted. Abort and warn unconditionally instead.

Also convert the 'known gap' documentation test into a real regression test that
asserts false return and no unlinkSync calls.
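
The two guards (non-zero exit, then empty set) can be sketched in isolation. Everything here is illustrative: the helper names are hypothetical, the warning text is invented, and the ls-tree output is assumed to be one path per line.

```typescript
interface LsTreeResult {
  status: number | null; // exit code of git ls-tree, null if the process was killed
  stdout: string; // assumed: one tracked path per line
}

// Returns the snapshot's file set, or null when continuing would be unsafe.
function readSnapshotFiles(
  result: LsTreeResult,
  warn: (message: string) => void
): Set<string> | null {
  if (result.status !== 0) {
    warn("[Node9] applyUndo: git ls-tree failed, aborting undo");
    return null;
  }
  const files = new Set(
    result.stdout
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
  );
  // Exit 0 with no output: an empty set would mark every file as
  // "created after the snapshot", so abort instead of deleting.
  if (files.size === 0) {
    warn("[Node9] applyUndo: snapshot lists no files, aborting undo");
    return null;
  }
  return files;
}

// Files present now but absent from the snapshot are the deletion candidates.
function filesCreatedAfterSnapshot(current: string[], snapshot: Set<string>): string[] {
  return current.filter((file) => !snapshot.has(file));
}
```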

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(undo): assert stderr warning in ls-tree failure tests

Add vi.spyOn(process.stderr, 'write') assertions to both new applyUndo tests
to verify the observability messages are actually emitted on failure paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: banner to stderr for MCP stdio compat; log command cwd handling and error visibility

Two bugs from issue #33:

1. runProxy banner went to stdout via console.log, corrupting the JSON-RPC stream
   for stdio-based MCP servers. Fixed: console.error so stdout stays clean.

2. 'node9 log' PostToolUse hook was silently swallowing all errors (catch {})
   and not changing to payload.cwd before getConfig() — unlike the 'check'
   command which does both. If getConfig() loaded the wrong project config,
   shouldSnapshot() could throw on a missing snapshot policy key, silently
   killing the audit.log write with no diagnostic output.
   Fixed: add cwd + _resetConfigCache() mirroring 'check'; surface errors to
   hook-debug.log when enableHookLogDebug is set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate process.chdir race condition in hook commands

Pass payload.cwd directly to getConfig(cwd?) instead of calling
process.chdir() which mutates process-global state and would race
with concurrent hook invocations.

- getConfig() gains optional cwd param: bypasses cache read/write
  when an explicit project dir is provided, so per-project config
  lookups don't pollute the ambient interactive-CLI cache
- check and log commands: remove process.chdir + _resetConfigCache
  blocks; pass payload.cwd directly to getConfig()
- log command catch block: remove getConfig() re-call (could re-throw
  if getConfig() was the original error source); use NODE9_DEBUG only
- Remove now-unused _resetConfigCache import from cli.ts
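
The cache-bypass behaviour of getConfig(cwd?) can be sketched with an injected loader. This is a simplified model, not the real core.ts: `makeGetConfig`, the `Config` shape, and the loader callback are all hypothetical.

```typescript
interface Config {
  source: string; // which directory the config was loaded from (illustrative)
}

// Builds a getConfig function around a loader. The loader and ambientDir
// parameters exist only to keep this sketch self-contained.
function makeGetConfig(load: (dir: string) => Config, ambientDir: string) {
  let cached: Config | null = null;

  return function getConfig(cwd?: string): Config {
    if (cwd) {
      // Explicit project dir: load fresh, neither reading nor writing the
      // cache, and without touching process-global state such as chdir.
      return load(cwd);
    }
    if (cached === null) {
      cached = load(ambientDir);
    }
    return cached;
  };
}
```

Because the explicit-cwd path never writes `cached`, a hook invocation for one project cannot change what a later no-argument call returns.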

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always write LOG_ERROR to hook-debug.log; clarify ReDoS test intent

- log catch block: remove NODE9_DEBUG guard — this catch guards the
  audit trail so errors must always be written to hook-debug.log,
  not only when NODE9_DEBUG=1
- validateRegex test: rename and expand the safe-regex2 NFA test to
  explicitly assert that (a+)+ compiles successfully (passes the
  compile-first step) yet is still rejected by safe-regex2, confirming
  the reorder did not break ReDoS protection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(mcp): integration tests for #33 regression coverage

Add mcp.integration.test.ts with 4 tests covering both bugs from #33:

1. Proxy stdout cleanliness (2 tests):
   - banner goes to stderr; stdout contains only child process output
   - stdout stays valid JSON when child writes JSON-RPC — banner does not corrupt stream

2. Log command cross-cwd audit write (2 tests):
   - writes to audit.log when payload.cwd differs from process.cwd() (the actual #33 bug)
   - writes to audit.log when no cwd in payload (backward compat)

These tests would have caught both regressions at PR time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — cwd guard, test assertions, exit-0 comment

- getConfig(payload.cwd || undefined): use || instead of ?? to also
  guard against empty string "" which path.join would silently treat
  as relative-to-cwd (same behaviour as the fallback, but explicit)
- log catch block: add comment documenting the intentional exit(0)-on-
  audit-failure tradeoff — non-zero would incorrectly signal tool failure
  to Claude/Gemini since the tool already executed
- mcp.integration.test.ts: assert result.error and result.status on
  every spawnSync call so spawn failures surface loudly instead of
  silently matching stdout === '' checks
- mcp.integration.test.ts: add expect(result.stdout.trim()).toBeTruthy()
  before JSON.parse for clearer diagnostic on stdout-empty failures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add CLAUDE.md rules and pre-commit enforcement hook

CLAUDE.md: documents PR checklist, test rules, and code rules that
Claude Code reads automatically at the start of every session:
- PR checklist (tests, typecheck, format, no console.log in hooks)
- Integration test requirements for subprocess/stdio/filesystem code
- Architecture notes (getConfig(cwd?), audit trail, DLP, fail-closed)

.git/hooks/pre-commit: enforces the checklist on every commit:
- Blocks console.log in src/cli, src/core, src/daemon
- Runs npm run typecheck
- Runs npm run format:check
- Runs npm test when src/ implementation files are changed
- Emergency bypass: git commit --no-verify

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — test isolation, stderr on audit gap, nonexistent cwd

- mcp.integration.test.ts: replace module-scoped tempDirs with per-describe
  beforeEach/afterEach and try/finally — eliminates shared-array interleave
  risk if tests ever run with parallelism
- mcp.integration.test.ts: add test for nonexistent payload.cwd — verifies
  getConfig falls back to global config gracefully instead of throwing
- cli.ts log catch: emit [Node9] audit log error to stderr so audit gaps
  surface in the tool output stream without requiring hook-debug.log checks
- core.ts getConfig: add comment documenting intentional nonexistent-cwd
  fallback behavior (tryLoadConfig returns null → global config used)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: audit write before config load; validate cwd; test corrupt-config gap

Two blocking issues from review:

1. getConfig() was called BEFORE appendFileSync — a config load failure
   (corrupt JSON, permissions error) would throw and skip the audit write,
   reintroducing the original silent audit gap. Fixed by moving the audit
   write unconditionally before the config load.

2. payload.cwd was passed to getConfig() unsanitized — a crafted hook
   payload with a relative or traversal path could influence which
   node9.config.json gets loaded. Fixed with path.isAbsolute() guard;
   non-absolute cwd falls back to ambient process.cwd().
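
The path guard in item 2 can be sketched as a small function. The name `resolveConfigDir` is hypothetical; returning undefined is assumed to make the caller fall back to the ambient process.cwd(), as described above.

```typescript
import * as path from "node:path";

// Returns a directory that is safe to pass to getConfig(), or undefined
// when the caller should fall back to the ambient working directory.
function resolveConfigDir(payloadCwd: unknown): string | undefined {
  // Missing, non-string, or empty values are treated as "no cwd given".
  if (typeof payloadCwd !== "string" || payloadCwd === "") return undefined;
  // Relative paths (including ../ traversal) are rejected outright.
  return path.isAbsolute(payloadCwd) ? payloadCwd : undefined;
}
```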

Also:
- Add integration test proving audit.log is written even when global
  config.json is corrupt JSON (regression test for the ordering fix)
- Add comment on echo tests noting Linux/macOS assumption

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): add audit-write ordering and path validation rules

* test: skip echo proxy tests on Windows; clarify exit-0 contract

- itUnix = it.skipIf(process.platform === 'win32') applied to both proxy
  echo tests — Windows echo is a shell builtin and cannot be spawned
  directly, so these tests would fail with a spawn error instead of
  skipping cleanly
- corrupt-config test: add comment documenting that exit(0) is the
  correct exit code even on config error — the log command always exits 0
  so Claude/Gemini do not treat an already-completed tool call as failed;
  the audit write precedes getConfig() so it succeeds regardless

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: CODEOWNERS for CLAUDE.md; parseAuditLog helper; getConfig unit tests

- .github/CODEOWNERS: require @node9-ai/maintainers review on CLAUDE.md
  and security-critical source files — prevents untrusted PRs from
  silently weakening AI instruction rules or security invariants
- mcp.integration.test.ts: replace inline JSON.parse().map() with
  parseAuditLog() helper that throws a descriptive error when a log line
  is not valid JSON (e.g. a debug line or partial write), instead of an
  opaque SyntaxError with no context
- mcp.integration.test.ts: itUnix declaration moved after imports for
  correct ordering
- core.test.ts: add getConfig unit tests verifying that a nonexistent
  explicit cwd does not throw (tryLoadConfig fallback), and that
  getConfig(cwd) does not pollute the ambient no-arg cache
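
A helper in the spirit of parseAuditLog() might look like this. It is a sketch only: the return type and the error wording are assumptions, and the real helper lives in the test file rather than in shipped code.

```typescript
// Parses a JSON-lines audit log. Blank lines are skipped; any other line
// that is not valid JSON produces an error that quotes the offending line.
function parseAuditLog(contents: string): Record<string, unknown>[] {
  return contents
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line, index) => {
      try {
        return JSON.parse(line) as Record<string, unknown>;
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        throw new Error(`audit.log entry ${index + 1} is not valid JSON: ${line} (${reason})`);
      }
    });
}
```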

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): add npm run lint to PR checklist and pre-commit hook

Adds ESLint step to CLAUDE.md checklist and .git/hooks/pre-commit so
require()-style imports and other lint errors are caught before push.
Also fixes the require('path')/require('os') inline calls in core.test.ts
that triggered @typescript-eslint/no-require-imports in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: emit shields-status on SSE connect — dashboard no longer stuck on Loading

The shields-status event was only broadcast on toggle (POST /shields/toggle).
A freshly connected dashboard never received the current shields state and
displayed "Loading…" indefinitely.

Fix: send shields-status in the GET /events initial payload alongside init
and decisions, using the same payload shape as the toggle handler.
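
The initial /events payload can be sketched as three SSE frames. The event names init, decisions, and shields-status come from this commit message; the payload field names, the frame order, and the helper names are assumptions made for illustration.

```typescript
interface ShieldStatus {
  name: string;
  active: boolean;
}

// Serialises one Server-Sent Events frame: an event name, one data line,
// and the blank line that terminates the frame.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Frames written to a client immediately after it connects, so the
// dashboard has the shield state without waiting for a toggle.
function initialSseFrames(
  pending: unknown[],
  decisions: unknown[],
  shields: ShieldStatus[]
): string {
  return (
    formatSseEvent("init", { requests: pending }) +
    formatSseEvent("decisions", decisions) +
    formatSseEvent("shields-status", { shields })
  );
}
```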

Regression test: daemon.integration.test.ts starts a real daemon with an
isolated HOME, connects to /events, and asserts shields-status is present
with the correct active state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — shared SSE snapshot, ctx.skip() for visible skips

- Capture SSE stream once in beforeAll and share across all three tests
  instead of opening 3 separate 1.5s connections (~4.5s → ~1.5s wall time)
- Replace early return with ctx.skip() so port-conflict skips are visible
  in the Vitest report rather than silently passing
- Add comment explaining why it.skipIf cannot be used here (condition
  depends on async beforeAll, evaluated after test collection)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — bump SSE timeout, guard payload undefined, structural shield check

- Bump readSseStream timeout 1500ms → 3000ms for slow CI headroom
- Assert payload defined before accessing .shields — gives a clear failure
  message if shields-status is absent rather than a TypeError on .shields
- Replace hardcoded postgres check with structural loop over all shields
  so the test survives adding or renaming shields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: log last waitForDaemon error to stderr for CI diagnostics

Silent catch{} meant a crashed daemon (e.g. EACCES on port) produced only
"did not start within 6s" with no hint of the root cause. Now the last
caught error is written to stderr so CI logs show the actual failure reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass flags through to wrapped command — prevent Commander from consuming -y, --config etc.

Commander parsed flags like -y and --config as node9 options and errored
with "unknown option" before the proxy action handler ran. This broke all
MCP server configurations that pass flags to the wrapped binary (npx -y,
binaries with --nexus-url, etc.).

Fix: before program.parse(), detect proxy mode (first arg is not a known
node9 subcommand and doesn't start with '-') and inject '--' into process.argv.
This causes Commander to stop option-parsing and pass everything — including
flags — through to the variadic [command...] action handler intact.
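
The argv rewrite can be sketched as a pure function. The name `injectProxySeparator` is hypothetical, and the check for an existing '--' reflects a guard added in a later commit in this log; in the real CLI the subcommand set is said to be derived from program.commands.

```typescript
// Returns argv with "--" inserted before the wrapped command when the CLI
// is being used in proxy mode; otherwise returns argv unchanged.
function injectProxySeparator(argv: string[], knownSubcommands: Set<string>): string[] {
  const first = argv[2]; // argv[0] is node, argv[1] is the CLI script
  if (first === undefined) return argv; // no arguments at all
  if (first === "--") return argv; // user already passed the separator
  if (first.startsWith("-")) return argv; // one of node9's own flags
  if (knownSubcommands.has(first)) return argv; // a real subcommand
  // Proxy mode: everything from here on belongs to the wrapped command.
  return [...argv.slice(0, 2), "--", ...argv.slice(2)];
}
```

For example, `node9 npx -y some-server` becomes `node9 -- npx -y some-server` before parsing, so `-y` is no longer read as a node9 option.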

The user-visible '--' workaround still works and is now redundant but harmless.

Regression tests: two new itUnix cases in mcp.integration.test.ts verify
that -n is not consumed as a node9 flag, and that --version reaches the
wrapped command rather than printing node9's own version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: derive proxy subcommand set from program.commands; harden test assertions

- Replace hand-maintained KNOWN_SUBCOMMANDS allowlist with a set derived
  from program.commands.map(c => c.name()) — stays in sync automatically
  when new subcommands are added, eliminating the latent sync bug
- Remove fragile echo stdout assertion in flag pass-through test — echo -n
  and echo --version behaviour varies across platforms (GNU vs macOS);
  the regression being tested is node9's parser, not echo's output
- Add try/finally in daemon.integration.test.ts beforeAll so tmpHome is
  always cleaned up even if daemon startup throws

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard against double '--' injection; strengthen --version test assertion

- Skip '--' injection if process.argv[2] is already '--' to avoid
  producing ['--', '--', ...] when user explicitly passes the separator
- Add toBeTruthy() assertion on stdout in --version test so the check
  fails if echo exits non-zero with empty output rather than silently passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — alias gap comment, res error-after-destroy guard, echo comment

- cli.ts: document alias gap (no aliases currently, but note how to extend)
- daemon.integration.test.ts: settled flag prevents res 'error' firing reject
  after Promise already resolved via req.destroy() timeout path
- mcp.integration.test.ts: fix comment — /bin/echo handles --version, not GNU echo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent daemon crash on unhandled rejection — node9 tail disconnect with 2 agents

Two concurrent Claude instances fire overlapping hook calls. Any unhandled
rejection in the async request handler crashes the daemon (Node 15+ default),
which closes all SSE connections and exits node9 tail with "Daemon disconnected".

- Add process.on('unhandledRejection') so a single bad request never kills the daemon
- Wrap GET /settings and GET /slack-status getGlobalSettings() calls in try/catch
  (were the only routes missing error guards in the async handler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — return in catch blocks, log errors, guard unhandledRejection registration

- GET /settings and /slack-status catch blocks now return after writeHead(500)
  to prevent fall-through to subsequent route handlers (write-after-end risk)
- Log the actual error to stderr in both catch blocks — silent swallow is
  dangerous in a security daemon
- Guard unhandledRejection registration with listenerCount === 0 to prevent
  double-registration if startDaemon() is called more than once (tests/restarts)
- Move handler registration before server.listen() for clearer startup ordering

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): require manual diff review before every commit

Automated checks (lint, typecheck, tests) don't catch logical correctness
issues like missing return after res.end(), silent catch blocks, or
double event-listener registration. Explicitly require git diff review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — 500 responses, module-level rejection flag, override cli.ts exit handler

- Separate res.writeHead(500) and res.end() calls (non-idiomatic chaining)
- Add Content-Type: application/json and JSON body to 500 responses
- Replace listenerCount guard with module-level boolean flag (race-safe)
- Call process.removeAllListeners('unhandledRejection') before registering
  daemon handler — cli.ts registers a handler that calls process.exit(1),
  which was the actual crash source; this overrides it for the daemon process
- Document that critical approval path (POST /check) has its own try/catch
  and is not relying on this safety net

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove removeAllListeners — use isDaemon guard in cli.ts handler instead

removeAllListeners('unhandledRejection') was a blunt instrument that could
strip handlers registered by third-party deps. The correct fix:
- cli.ts handler now returns early (no-op) when process.argv[2] === 'daemon',
  leaving the rejection to the daemon's own keep-alive handler
- daemon/index.ts no longer needs removeAllListeners
- daemon handler now logs stack trace so systematic failures are visible

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify unhandledRejection listener interaction — both handlers fire independently

The previous comment implied listener-chain semantics (one handler deferring
to the next). Node.js fires all registered listeners independently. The
isDaemon no-op return in cli.ts is what prevents process.exit(1), not any
chain mechanism. Clarify this so future maintainers don't break it by
restructuring the handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate unhandledRejection ordering dependency — skip cli.ts handler for daemon mode

Instead of relying on listener registration order (fragile), skip registering
the cli.ts exit-on-rejection handler entirely when process.argv[2] === 'daemon'.
The daemon's own keep-alive handler in startDaemon() is then the only handler
in the process — no ordering dependency, no removeAllListeners needed.

Also update stale comment in daemon/index.ts that still described the old
"we must replace the cli.ts handler" approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: address review comments — argv load-time note, hung-connection limit, stack trace caveat

- cli.ts: note that process.argv[2] check fires at module load time intentionally
- daemon/index.ts: document hung-connection limitation of last-resort rejection handler
- daemon/index.ts: note stack trace may include user input fragments (acceptable
  for localhost-only stderr logging)
- daemon/index.ts: clarify jest.resetModules() behavior with the module-level flag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Safe by Default — advisory SQL rules block destructive ops without config

Adds review-drop-table-sql, review-truncate-sql, and review-drop-column-sql
to ADVISORY_SMART_RULES so DROP TABLE, TRUNCATE TABLE, and DROP COLUMN in
the `sql` field are gated by human approval out-of-the-box, with no shield
or config required. The postgres shield correctly upgrades these from review
→ block since shield rules are inserted before advisory rules in getConfig().

Includes 7 new tests: 4 verifying advisory review fires with no config, 3
verifying the postgres shield overrides to block.
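
How an advisory rule might match the `sql` field can be sketched as follows. The three rule names come from this commit message; the regular expressions, the rule shape, and the `matchAdvisorySqlRule` helper are assumptions and may differ from the real ADVISORY_SMART_RULES entries.

```typescript
interface AdvisorySqlRule {
  name: string;
  pattern: RegExp; // assumed patterns, case-insensitive
}

const ADVISORY_SQL_RULES: AdvisorySqlRule[] = [
  { name: "review-drop-table-sql", pattern: /\bDROP\s+TABLE\b/i },
  { name: "review-truncate-sql", pattern: /\bTRUNCATE\s+TABLE\b/i },
  { name: "review-drop-column-sql", pattern: /\bDROP\s+COLUMN\b/i },
];

// Returns the name of the first advisory rule that matches the sql field,
// or null when the call carries no sql or nothing destructive is found.
function matchAdvisorySqlRule(args: { sql?: unknown }): string | null {
  if (typeof args.sql !== "string") return null;
  for (const rule of ADVISORY_SQL_RULES) {
    if (rule.pattern.test(args.sql)) return rule.name;
  }
  return null;
}
```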

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: shield set/unset — per-rule verdict overrides + config show

- `node9 shield set <shield> <rule> <verdict>` — override any shield rule's
  verdict without touching config.json. Stored in shields.json under an
  `overrides` key, applied at runtime in getConfig(). Accepts full rule
  name, short name, or operation name (e.g. "drop-table" resolves to
  "shield:postgres:block-drop-table").

- `node9 shield unset <shield> <rule>` — remove an override, restoring
  the shield default.

- `node9 shield status` — now shows each rule's verdict individually,
  with override annotations ("← overridden (was: block)").

- `node9 config show` — new command: full effective runtime config
  including active shields with per-rule verdicts, built-in rules,
  advisory rules, and dangerous words.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — allow verdict guard, null assertion, test reliability

- shield set allow now requires --force to prevent silent rule silencing;
  exits 1 with a clear warning and the exact re-run command otherwise
- Remove getShield(name)! non-null assertion in error branch
- Fix mockReturnValue → mockReturnValueOnce to prevent test state leak
- Add missing tests: shield set allow guard (integration), unset no-op,
  mixed-case SQL matching (DROP table, drop TABLE, TRUNCATE table)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — shield override security hardening

- Add isShieldVerdict() type guard; replace manual triple-comparison in
  CLI set command and remove unsafe `verdict as ShieldVerdict` cast
- Add validateOverrides() to sanitize shields.json on read — tampered
  disk content with non-ShieldVerdict values is silently dropped before
  reaching the policy engine
- Fix clearShieldOverride() to be a true no-op (skip disk write) when
  the rule has no existing override
- Add comment to resolveShieldRule() documenting first-match behavior
  for operation-suffix lookup to warn against future naming conflicts
- Tests: fix no-op assertion (assert not written), add isShieldVerdict
  suite, add schema validation tests for tampered overrides, add
  authorizeHeadless test for shield-overridden allow verdict
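
The type guard and the read-time sanitising step can be sketched together. The function names match those mentioned above, but the bodies are guesses: the three verdict values are inferred from this log, and the nested shield-to-rule shape of the `overrides` key and the warning text are assumptions.

```typescript
type ShieldVerdict = "allow" | "review" | "block";

function isShieldVerdict(value: unknown): value is ShieldVerdict {
  return value === "allow" || value === "review" || value === "block";
}

// Assumed on-disk shape: { [shieldName]: { [ruleName]: verdict } }
type ShieldOverrides = Record<string, Record<string, ShieldVerdict>>;

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Keeps only entries whose verdict is a known ShieldVerdict. Anything else
// (for example a hand-edited or tampered file) is dropped with a warning.
function validateOverrides(raw: unknown, warn: (message: string) => void): ShieldOverrides {
  const clean: ShieldOverrides = {};
  if (!isPlainObject(raw)) return clean;
  for (const [shield, rules] of Object.entries(raw)) {
    if (!isPlainObject(rules)) continue;
    for (const [rule, verdict] of Object.entries(rules)) {
      if (isShieldVerdict(verdict)) {
        const bucket = clean[shield] ?? {};
        bucket[rule] = verdict;
        clean[shield] = bucket;
      } else {
        warn(`[Node9] ignoring invalid override verdict for ${shield}/${rule}`);
      }
    }
  }
  return clean;
}
```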

Note: issue #5 (shield status stdout vs stderr) cannot be fixed here —
the pre-commit hook enforces no new console.log in cli.ts to keep stdout
clean for the JSON-RPC/MCP hook code paths in the same file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — audit trail, tamper warning, trust boundary

- Export appendConfigAudit() from core.ts; call it from CLI when an allow
  override is written with --force so silenced rules appear in audit.log
- validateOverrides() now emits a stderr warning (with shield/rule detail)
  when an invalid verdict is dropped, making tampering visible to the user
- Add JSDoc to writeShieldOverride() documenting the trust boundary: it is
  a raw storage primitive with no allow guard; callers outside the CLI must
  validate rule names via resolveShieldRule() first; daemon does not expose
  this endpoint
- Tests: add stderr-warning test for tampered verdicts; add cache-
  invalidation test verifying _resetConfigCache() causes allow overrides
  to be re-read from disk (mock) on the next evaluatePolicy() call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: close remaining review gaps — first-match, allow-no-guard, TOCTOU

- Issue 5: add test proving resolveShieldRule first-match-wins behavior
  when two rules share an operation suffix; uses a temporary SHIELDS
  mutation (restored in finally) to simulate the ambiguous catalog case
- Issue 6: add explicit test documenting that writeShieldOverride accepts
  allow verdict without any guard — storage primitive contract, CLI is
  the gatekeeper
- Issue 8: add TOCTOU characterization test showing that concurrent
  writeShieldOverride calls with a stale read lose the first write; makes
  the known file-lock limitation explicit and regression-testable
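
The lost-write scenario from Issue 8 can be reproduced in miniature. The sketch below uses an in-memory object as a stand-in for the overrides file; `disk`, `readOverrides`, and `writeOverride` are hypothetical names, not the real writeShieldOverride signature.

```typescript
// In-memory stand-in for the overrides file (an assumption for illustration).
let disk: Record<string, string> = {};

const readOverrides = (): Record<string, string> => ({ ...disk });

// Read-modify-write without a lock: the caller's snapshot may already be stale.
function writeOverride(snapshot: Record<string, string>, rule: string, verdict: string): void {
  disk = { ...snapshot, [rule]: verdict };
}

// Two writers read the same state before either one writes.
const staleA = readOverrides();
const staleB = readOverrides();
writeOverride(staleA, "rule-a", "allow");
writeOverride(staleB, "rule-b", "block");

// Writer B never saw rule-a, so its write silently discards it.
const lostFirstWrite = !("rule-a" in disk) && disk["rule-b"] === "block";
```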

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: spawn daemon via process.execPath to fix ENOENT on Windows (#41)

spawn('node9', ...) fails on Windows because npm installs a .cmd shim,
not a bare executable. Node.js child_process.spawn without { shell: true }
cannot resolve .cmd/.ps1 wrappers.

Replace all three bare spawn('node9', ['daemon'], ...) call sites in
cli.ts with spawn(process.execPath, [process.argv[1], 'daemon'], ...),
consistent with the pattern already used in src/tui/tail.ts:
  - autoStartDaemonAndWait()
  - daemon --openui handler
  - daemon --background handler
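
A minimal sketch of the replacement pattern, written as a pure helper so it can be checked without starting a process. `daemonSpawnArgs` is a hypothetical name, and the spawn options in the usage comment are assumptions (the commit elides them).

```typescript
// Builds the [command, args] pair for launching the daemon without relying on
// the `node9` shim, which is a .cmd wrapper on Windows.
function daemonSpawnArgs(execPath: string, cliScript: string): [string, string[]] {
  return [execPath, [cliScript, "daemon"]];
}

// In the CLI this would be called roughly as:
//   const [cmd, args] = daemonSpawnArgs(process.execPath, process.argv[1]);
//   spawn(cmd, args, { detached: true, stdio: "ignore" }).unref();
```

Because the command is the Node binary itself, no shell or shim lookup is involved on any platform.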

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(ci): regression guard + Windows CI for spawn fix (#41)

- Add spawn-windows.test.ts: two static source-guard tests that read
  cli.ts and assert (a) no bare spawn('node9'...) pattern exists and
  (b) exactly 3 spawn(process.execPath, ...) daemon call sites exist.
  Prevents the ENOENT regression from silently reappearing.

- Add .github/workflows/ci.yml: runs typecheck, lint, and npm test on
  both ubuntu-latest and windows-latest on every push/PR to main and dev.
  The Windows runner will catch any spawn('node9'...) regression
  immediately since it would throw ENOENT in integration tests.
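
A static source guard of this kind could look roughly like the sketch below. The regexes and the `sample` string are illustrative assumptions; the real test reads cli.ts from disk and may match differently.

```typescript
// Counts non-overlapping matches of `pattern` in `source`.
function countMatches(source: string, pattern: RegExp): number {
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  return (source.match(new RegExp(pattern.source, flags)) ?? []).length;
}

const BARE_SPAWN = /spawn\(\s*['"]node9['"]/;
const EXEC_PATH_SPAWN = /spawn\(\s*process\.execPath\s*,/;

// Hypothetical excerpt standing in for the contents of cli.ts.
const sample = `
  spawn(process.execPath, [process.argv[1], 'daemon'], opts);
  spawn(process.execPath, [process.argv[1], 'daemon'], opts);
  spawn(process.execPath, [process.argv[1], 'daemon'], opts);
`;

const bareCount = countMatches(sample, BARE_SPAWN);
const execPathCount = countMatches(sample, EXEC_PATH_SPAWN);
```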

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step before tests — integration tests require dist/cli.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): remove NODE_ENV=test prefix from npm scripts — Windows compat

'NODE_ENV=test cmd' syntax is Unix-only and fails on Windows with
'not recognized as an internal or external command'.

Vitest sets NODE_ENV=test automatically when it is not already set (and
also sets process.env.VITEST), making the prefix redundant. Remove it from
test, test:watch, and test:ui scripts so they work on all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): use cross-platform path assertions in undo.test.ts

Replace hardcoded Unix path separators with path.join() and regex
/[/\\]\.git[/\\]/ so assertions pass on Windows CI.
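
For illustration, the separator-agnostic regex behaves as follows on sample paths (the paths themselves are made up):

```typescript
// Matches a ".git" path segment with either separator style.
const GIT_DIR = /[/\\]\.git[/\\]/;

const unixPath = "/tmp/project/.git/HEAD";
const windowsPath = "C:\\tmp\\project\\.git\\HEAD";
const notGitDir = "/tmp/project/.github/workflows/ci.yml";
```

The trailing separator class is what keeps `.github` from matching.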

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): cross-platform path and HOME fixes for Windows CI

setup.test.ts: replace hardcoded /mock/home/... constants with
path.join(os.homedir(), ...) so path comparisons match on Windows.
doctor.test.ts: set USERPROFILE=homeDir alongside HOME so
os.homedir() resolves the isolated test directory on Windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): Windows HOME/USERPROFILE and EBUSY fixes

mcp.integration.test.ts: add makeEnv() helper that sets both HOME
and USERPROFILE so spawned node9 processes resolve os.homedir() to
the isolated test directory on Windows. Add EBUSY guard in cleanupDir
for Windows temp file locking after spawnSync.

protect.test.ts: use path.join(os.homedir(), ...) for mock paths in
setPersistentDecision so existsSpy matches on Windows backslash paths.
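
A makeEnv() helper along these lines would cover both platforms. This is a sketch under assumptions: the third `base` parameter is added here so the function stays pure; in the tests the base would presumably be process.env.

```typescript
type Env = Record<string, string | undefined>;

// Returns a child-process environment whose home directory points at `home`
// on both Unix (HOME) and Windows (USERPROFILE). HOME and USERPROFILE are
// applied last so neither `base` nor `extra` can override the isolation.
function makeEnv(home: string, extra: Env = {}, base: Env = {}): Env {
  return { ...base, ...extra, HOME: home, USERPROFILE: home };
}
```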

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): propagate HOME as USERPROFILE in check integration tests

runCheck/runCheckAsync now set USERPROFILE=HOME so spawned node9
processes resolve os.homedir() to the isolated test directory on
Windows. Apply the same fix to standalone spawnSync calls using
minimalEnv. Add EBUSY guard in cleanupHome for Windows temp locking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests,dlp): four Windows CI fixes

mcp.integration.test.ts: use list_directory instead of write_file for
the no-cwd backward-compat test, because write_file triggers git add -A on
os.tmpdir(), which can index thousands of files on Windows and fail with
ETIMEDOUT.

gemini_integration.test.ts: add path import; replace hardcoded
/mock/home/... paths with path.join(os.homedir(), ...) so existsSpy
matches on Windows backslash paths.

daemon.integration.test.ts: add USERPROFILE=tmpHome to daemon spawn
env so os.homedir() resolves to the isolated shields.json. Add EBUSY
guard in cleanupDir.

dlp.ts: broaden /etc/passwd|shadow|sudoers patterns to
^(?:[a-zA-Z]:)?\/etc\/... so they match Windows-normalized paths like
C:/etc/passwd in addition to Unix /etc/passwd.
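
The effect of the optional drive-letter prefix can be checked with a small sketch. The commit elides the tail of the pattern, so the single alternation and the `$` anchor below are assumptions, not the actual dlp.ts patterns.

```typescript
// Optional "C:"-style drive prefix, then /etc/<sensitive file>.
// The alternation and end anchor are illustrative assumptions.
const SENSITIVE_ETC = /^(?:[a-zA-Z]:)?\/etc\/(?:passwd|shadow|sudoers)$/;

const isSensitiveEtcPath = (p: string): boolean => SENSITIVE_ETC.test(p);
```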

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): address code review findings

ci.yml: add format:check step and Node 22 to matrix (package.json
declares >=18 — both LTS versions should be covered).

check/mcp/daemon integration tests: add makeEnv() helpers for
consistent HOME+USERPROFILE isolation; add console.warn on EBUSY
so temp dir leaks are visible rather than silent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): enforce LF line endings so Prettier passes on Windows

Add endOfLine: lf to .prettierrc so Prettier always checks/writes LF
regardless of OS. Add .gitattributes with eol=lf so Git does not
convert line endings on Windows checkout. Without these, format:check
fails on every file on Windows CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): align makeEnv signatures and add dist verification

check.integration.test.ts: makeEnv now spreads process.env (same as
mcp and daemon helpers) so PATH, NODE_ENV=test (set by Vitest), and
other inherited vars reach spawned child processes. Standalone
spawnSync calls simplified to makeEnv(tmpHome, {NODE9_TESTING:'1'}).
Remove unused minimalEnv from shield describe block.

ci.yml: add Verify dist artifacts step after build to fail fast with
a clear message if dist/cli.js or dist/index.js are missing. Add
comment explaining NODE_ENV=test / NODE9_TESTING guard coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: interactive terminal approval via /dev/tty (SSE + [A]/[D])

Replaces the broken @inquirer/prompts stdin racer with a /dev/tty-based
approval prompt that works as a Claude Code PreToolUse subprocess:

- New src/ui/terminal-approval.ts: opens /dev/tty for raw keypress I/O,
  acquires CSRF token from daemon SSE, renders ANSI approval card, reads
  [A]/[D], posts decision via POST /decision/{id}. Handles abort (another
  racer won) with cursor/card cleanup and SIGTERM/exit guard.

- Daemon entry shared between browser (GET /wait) and terminal (POST /decision)
  racers: extract registerDaemonEntry() + waitForDaemonDecision() from the
  old askDaemon() so both racers operate on the same pending entry ID.

- POST /decision idempotency: first write wins; second call returns 409
  with the existing decision. Prevents race between browser and terminal
  racers from corrupting state.

- CSRF token emitted on every SSE connection (re-emit existing token, never
  regenerate). Terminal racer acquires it by opening /events and reading
  the first csrf event.

- approvalTimeoutSeconds user-facing config alias (converts to ms);
  raises default timeout from 30s to 120s. Daemon auto-deny timer and
  browser countdown now use the config value instead of a hardcoded constant.

- isTTYAvailable() probe: tries /dev/tty open(); disabled on Windows
  (native popup racer covers that path). NODE9_FORCE_TERMINAL_APPROVAL=1
  bypasses the probe for tmux/screen users.

- Integration tests: CSRF re-emit across two connections, POST /decision
  idempotency (both allow-first and deny-first cases).
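
The first-write-wins rule for POST /decision can be sketched as a pure function. The `PendingEntry` shape and the 200 status for the winning call are assumptions; only the 409 with the existing decision comes from the description above.

```typescript
type Decision = "allow" | "deny";

interface PendingEntry {
  id: string;
  decision?: Decision;
}

interface DecisionResponse {
  status: 200 | 409;
  decision: Decision;
}

// First write wins: a later call leaves the entry untouched and reports
// the decision that was already stored.
function recordDecision(entry: PendingEntry, decision: Decision): DecisionResponse {
  if (entry.decision !== undefined) {
    return { status: 409, decision: entry.decision };
  }
  entry.decision = decision;
  return { status: 200, decision };
}
```

Returning the stored decision with the 409 lets the losing racer learn the outcome without a second request.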

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Smart Router — node9 tail as interactive approval terminal

Implements a multi-phase Smart Router architecture so `node9 tail` can
serve as a full approval channel alongside the browser dashboard and
native popup.

Phase 1 — Daemon capability tracking (daemon/index.ts):
- SseClient interface tracks { res, capabilities[] } per SSE connection
- /events parses ?capabilities=input from URL; stored on each client
- broadcast() updated to use client.res.write()
- hasInteractiveClient() exported — true when any tail session is live
- broadcast('add') now fires when terminal approver is enabled and an
  interactive client is connected, not only when browser is enabled
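
Phase 1 can be sketched as below. The `res` field of SseClient is omitted for brevity, and treating the query value as comma-separated is an assumption; the commit only shows `?capabilities=input`.

```typescript
interface SseClient {
  capabilities: string[];
}

// Extracts the ?capabilities= value from a request URL. A base is supplied
// because an HTTP server's req.url is path-only.
function parseCapabilities(requestUrl: string): string[] {
  const url = new URL(requestUrl, "http://localhost");
  const raw = url.searchParams.get("capabilities") ?? "";
  return raw
    .split(",")
    .map((c) => c.trim())
    .filter((c) => c.length > 0);
}

// True when at least one connected client can accept keyboard input.
function hasInteractiveClient(clients: SseClient[]): boolean {
  return clients.some((client) => client.capabilities.includes("input"));
}
```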

Phase 2 — Interactive approvals in tail (tui/tail.ts):
- Connects with ?capabilities=input so daemon identifies it as interactive
- Captures CSRF token from the 'csrf' SSE event
- Handles init.requests (approvals pending before tail connected)
- Handles add/remove SSE events; maintains an approval queue
- Shows one ANSI card at a time ([A] Allow / [D] Deny) using
  tty.ReadStream raw-mode keypress on fd 0
- POSTs decisions via /decision/{id} with source:'terminal'; 409 is non-error
- Cards clear themselves; next queued request shown automatically

Phase 3 — Racer 3 widened (core.ts):
- Racer 3 guard changed from approvers.browser to
  (approvers.browser || approvers.terminal) so tail participates in the
  race via the same waitForDaemonDecision mechanism as the browser
- Guidance printed to stderr when browser is off:
  "Run `node9 tail` in another terminal to approve."

Phase 4 — node9 watch command (cli.ts):
- New `watch <command> [args...]` starts the daemon with NODE9_WATCH_MODE=1
  (no idle timeout), prints a tip about node9 tail, then runs the wrapped
  command via spawnSync

Decision source tracking (all layers):
- POST /decision now accepts optional source field ('browser'|'terminal')
- Daemon stores decisionSource on PendingEntry; GET /wait returns it
- waitForDaemonDecision returns { decision, source }
- Racer 3 label uses actual source instead of guessing from config:
  "User Decision (Terminal (node9 tail))" vs "User Decision (Browser Dashboard)"
- Browser UI sends source:'browser'; tail sends source:'terminal'

Tests:
- daemon.integration.test.ts: 3 new tests for source tracking round-trip
  (terminal, browser, and omitted source)
- spawn-windows.test.ts: updated count from 3 to 4 spawn call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: enable /dev/tty approval card in Claude terminal (hook path)

The check command was passing allowTerminalFallback=false to
authorizeHeadless, which disabled Racer 4 (/dev/tty) in the hook path.
This meant the approval card only appeared in the node9 tail terminal,
requiring the user to switch focus to respond.

Change both call sites (initial + daemon-retry) to true so Racer 4 runs
alongside Racer 3. The [A]/[D] card now appears in the Claude terminal
as well — the user can respond from either terminal, whichever has focus.
The 409 idempotency already handles the race correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent background authorizeHeadless from overwriting POST /decision

When POST /decision arrives before GET /wait connects, it sets
earlyDecision on the PendingEntry. The background authorizeHeadless
call (which runs concurrently) could then overwrite that decision in
its .then() handler — visible as the idempotency test getting
'allow' back instead of the posted 'deny'.

Guard: after noApprovalMechanism check, return early if earlyDecision
is already set. First write wins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route sendBlock terminal output to /dev/tty instead of stderr

Claude Code treats any stderr output from a PreToolUse hook as a hook
error and fails open — the tool proceeds even when the hook writes a
valid permissionDecision:deny JSON to stdout. This meant git push and
other blocked commands were silently allowed through.

Fix: replace all console.error calls in the block/deny path with
writes to /dev/tty, an out-of-band channel that bypasses Claude Code's
stderr pipe monitoring. /dev/tty failures are caught silently so CI
and non-interactive environments are unaffected.

Add a writeTty() helper in core.ts used for all status messages in
the hook execution path (cloud error, waiting-for-approval banners,
cloud result). Update two integration tests that previously asserted
block messages appeared on stderr — they now assert stderr is empty,
which is the regression guard for this bug.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: don't auto-resolve daemon entries in audit mode

In audit mode, background authorizeHeadless resolves immediately with
checkedBy:'audit'. The .then() handler was setting earlyDecision='allow'
before POST /decision could arrive from browser/tail, causing subsequent
POST /decision calls to get 409 and GET /wait to return 'allow' regardless
of what the user posted.

Audit mode means the hook auto-approves — it doesn't mean the daemon
dashboard should also auto-resolve. Leave the entry alive so browser/tail
can still interact with it (or the auto-deny timer fires).

Fixes source-tracking integration test failures on CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: close /dev/tty fd in finally block to prevent leak on write error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove Racer 4 (/dev/tty card in Claude terminal)

Racer 4 interrupted the AI's own terminal with an approval prompt,
which is wrong on multiple levels:
- The AI terminal belongs to the AI agent, not the human approver
- Different AI clients (Gemini CLI, Cursor, etc.) handle terminals
  differently — /dev/tty tricks are fragile across environments
- It created duplicate prompts when node9 tail was also running

Approval channels should all be out-of-band from the AI terminal:
  1. Cloud/SaaS (Slack, mission control)
  2. Native OS popup
  3. Browser dashboard
  4. node9 tail (dedicated approval terminal)

Remove: Racer 4 block in core.ts, allowTerminalFallback parameter
from authorizeHeadless/_authorizeHeadlessCore and all callers,
isTTYAvailable/askTerminalApproval imports, terminal-approval.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: make hook completely silent — remove all writeTty calls from core.ts

The hook must produce zero terminal output in the Claude terminal.
All writeTty status messages (shadow mode, cloud handshake failure,
waiting for approval, approved/denied via cloud) have been removed.
Also removed the now-unused chalk import and writeTty helper function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source allowlist, CSRF 403 tests, watch error handling

- daemon/index.ts: validate POST /decision source field against allowlist
  ('terminal' | 'browser' | 'native') — silently drop invalid values to
  prevent audit log injection
- daemon.integration.test.ts: add CSRF 403 test (missing token), CSRF 403
  test (wrong token), and invalid source value test — the three most
  important negative tests flagged by code review
- cli.ts: check result.error in node9 watch so ENOENT exits non-zero
  instead of silently exiting 0
- test helper: use fixed string 'echo register-label' instead of
  interpolated echo ${label} (shell injection hygiene in test code)
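
The source allowlist check can be sketched as follows. `sanitizeSource` is a hypothetical helper name; the three allowed values are the ones listed above.

```typescript
const VALID_SOURCES = ["terminal", "browser", "native"] as const;
type DecisionSource = (typeof VALID_SOURCES)[number];

// Returns the source only if it is an allowlisted string; anything else
// (wrong type, unknown value, injected text) is dropped as undefined.
function sanitizeSource(value: unknown): DecisionSource | undefined {
  return typeof value === "string" && (VALID_SOURCES as readonly string[]).includes(value)
    ? (value as DecisionSource)
    : undefined;
}
```

Dropping rather than rejecting keeps the decision itself valid while ensuring nothing attacker-controlled reaches the audit log.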

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove stderr write from askNativePopup; drop sendDesktopNotification

- native.ts: process.stderr.write in askNativePopup caused Claude Code to
  treat the hook as an error and fail open — removed entirely
- core.ts: sendDesktopNotification called notify-send which routes through
  Firefox on Linux (D-Bus handler), causing spurious browser popups —
  removed the audit-mode notification call and unused import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass cwd through hook→daemon so project config controls browser open

Root cause: daemon called getConfig() without cwd, reading the global
config. If ~/.node9/node9.config.json doesn't exist, approvers default
to true — so browser:false in a project config was silently ignored,
causing the daemon to open Firefox on every pending approval.

Fix:
- cli.ts: pass cwd from hook payload into authorizeHeadless options
- core.ts: propagate cwd through _authorizeHeadlessCore → registerDaemonEntry
  → POST /check body; use getConfig(options.cwd) so project config is read
- daemon/index.ts: extract cwd from POST /check, call getConfig(cwd)
  for browserEnabled/terminalEnabled checks
- native.ts: remove process.stderr.write from askNativePopup (fail-open bug)
- core.ts: remove sendDesktopNotification (notify-send routes through Firefox
  on Linux via D-Bus, causing spurious browser notifications)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always broadcast 'add' when terminalEnabled — restore tail visibility

After the cwd fix, browserEnabled correctly became false when browser:false
is set in project config. But the broadcast condition gated on
hasInteractiveClient(), which returns false if tail isn't connected at the
exact moment the check arrives — silently dropping entries from tail.

Fix: broadcast whenever browserEnabled OR terminalEnabled, regardless of
client connection state. Tail sees pending entries via the SSE stream's
initial state when it connects, so timing of connection doesn't matter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: hybrid security model — local UI always wins the race

- Remove isRemoteLocked: terminal/browser/native racers always participate
  even when approvers.cloud is enabled; cloud is now audit-only unless it
  responds first (headless VM fallback)
- Add decisionSource field to AuthResult so resolveNode9SaaS can report
  which channel decided (native/terminal/browser) as decidedBy in the PATCH
- Fix resolveNode9SaaS: log errors to hook-debug.log instead of silent catch
- Fix tail [A]/[D] keypresses: switch from raw 'data' buffer to readline
  emitKeypressEvents + 'keypress' events — fixes unresponsive cards
- Fix tail card clear: SAVE/RESTORE cursor instead of fragile MOVE_UP(n)
- Add cancelActiveCard so 'remove' SSE event properly dismisses active card
- Fix daemon duplicate browser tab: browserOpened flag + NODE9_BROWSER_OPENED
  env so auto-started daemon and node9 tail don't both open a tab
- Fix slackDelegated: skip background authorizeHeadless to prevent duplicate
  cloud request that never resolves in Mission Control
- Add interactive field to SSE 'add' event so browser-only configs don't
  render a terminal card
- Add dev:tail script that parses JSON PID file correctly
- Add req.on('close') cleanup for abandoned long-poll entries
- Add regression tests for all three bugs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear CI env var in test to unblock native racer on GitHub Actions

Also make the poll fetch mock respond to AbortSignal so the cloud poll
racer exits cleanly when native wins, preventing test timeout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source injection tests, dev:tail safety, CI env guard

- Add explicit source boundary tests: null/number/object are all rejected
  by the VALID_SOURCES allowlist (implementation was already correct)
- Replace kill \$(...) shell expansion in dev:tail with process.kill() inside
  Node.js — removes \$() substitution vulnerability if pid file were crafted
- Add afterEach safety net in core.test.ts to restore VITEST/CI/NODE_ENV
  in case the test crashes before the try/finally block restores them
- Increase slackDelegated timing wait from 200ms to 500ms for slower CI
- Fix section numbering gap: 10 → 11 was left after removing a test (now 10)
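
The dev:tail hardening amounts to validating the PID in Node before signalling. A minimal sketch, assuming the PID file is a JSON object with a numeric `pid` field (the field name is an assumption):

```typescript
// Parses a JSON PID file and returns a usable PID, or null if the contents
// are malformed. The `pid` field name is an assumption for illustration.
function parsePidFile(contents: string): number | null {
  try {
    const parsed: unknown = JSON.parse(contents);
    if (typeof parsed !== "object" || parsed === null) return null;
    const pid = (parsed as { pid?: unknown }).pid;
    return typeof pid === "number" && Number.isInteger(pid) && pid > 0 ? pid : null;
  } catch {
    return null;
  }
}

// Usage in a Node script (no shell involved, so nothing in the file is
// ever interpreted as a command):
//   const pid = parsePidFile(fs.readFileSync(pidFile, "utf8"));
//   if (pid !== null) process.kill(pid);
```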

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use early return in it.each — Vitest does not pass context to it.each callbacks

Context injection via { skip } works in plain it() but not in it.each(),
where the third argument is undefined. Switch to early return, which is
equivalent since the entire describe block skips when portWasFree is false.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): correct YAML indentation in ci.yml — job properties were siblings not children

name/runs-on/strategy/steps were indented 2 spaces (sibling to `test:`)
instead of 4 spaces (properties of the `test:` job). GitHub Actions was
ignoring the custom name template, so checks were reported without the
Node version suffix and the required branch-protection check
"CI / Test (ubuntu-latest, Node 20)" was stuck as "Expected" forever.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add 15s timeout to daemon beforeAll hooks — prevents CI timeout

waitForDaemon(6s) + readSseStream(3s) = 9s minimum; the default Vitest
hookTimeout of 10s is too tight on slow CI runners (Ubuntu, Windows).
All three daemon describe-block beforeAll hooks now declare an explicit
15_000ms timeout to give CI sufficient headroom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add daemonProc.kill() fallback in afterAll cleanup

If `node9 daemon stop` fails or times out, the spawned daemon process
would leak. Added daemonProc?.kill() as a defensive fallback after
spawnSync in all three daemon describe-block afterAll hooks.

The CSRF 403 tests (missing/wrong token) already exist at lines 574-598
and were flagged as absent only because the bot's diff was truncated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): address code review — vi.stubEnv, runtime shape checks, abandon-timer comment

- core.test.ts: replace manual env save/delete/restore with vi.stubEnv +
  vi.unstubAllEnvs() in afterEach. Eliminates the fragile try/finally and
  the risk of coercing undefined to the string "undefined". Adds a KEEP IN
  SYNC comment so future isTestEnv additions are caught immediately.

- daemon.integration.test.ts: replace unchecked `as { ... }` casts in
  idempotency tests with `unknown` + toMatchObject — gives a clear failure
  message if the response shape is wrong instead of silently passing.

- daemon.integration.test.ts: add comment explaining why idempotency tests
  do not need a /wait consumer — the abandon timer only fires when an SSE
  connection closes with pending items; no SSE client connects during
  these tests so entries are safe from eviction.
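
The difference between an unchecked cast and a runtime shape check can be illustrated with a hand-rolled type guard. The tests themselves use toMatchObject; `DecisionBody` and `isDecisionBody` below are hypothetical stand-ins.

```typescript
interface DecisionBody {
  decision: "allow" | "deny";
}

// Narrowing from `unknown` forces a runtime check; an `as` cast would
// accept any payload at compile time and verify nothing.
function isDecisionBody(value: unknown): value is DecisionBody {
  if (typeof value !== "object" || value === null) return false;
  const decision = (value as Record<string, unknown>).decision;
  return decision === "allow" || decision === "deny";
}

const good: unknown = JSON.parse('{"decision":"deny"}');
const bad: unknown = JSON.parse('{"error":"not found"}');
```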

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): guard daemonProc.kill() with exitCode check — avoid spurious SIGTERM

Calling daemonProc.kill() unconditionally after a successful `daemon stop`
sends SIGTERM to an already-dead process, which can produce a spurious error
log on some platforms. Only kill if exitCode === null (process still running).
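
A sketch of the guard, written against a structural subset of ChildProcess so it can be exercised with a fake. `killIfRunning` and `KillableProcess` are hypothetical names.

```typescript
// Structural subset of Node's ChildProcess that this helper relies on.
interface KillableProcess {
  exitCode: number | null;
  kill(): boolean;
}

// Signals the process only while it is still running (exitCode === null).
function killIfRunning(proc: KillableProcess | undefined): boolean {
  if (proc === undefined || proc.exitCode !== null) return false;
  return proc.kill();
}
```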

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add @vitest/coverage-v8 — baseline coverage report (PR #0)

Installs @vitest/coverage-v8 and configures coverage in vitest.config.mts.
Adds `npm run test:coverage` script.

Baseline (instrumentable files only — cli.ts and daemon/index.ts are
subprocess-only and cannot be instrumented by v8):

  File     Stmts    Branches
  Overall  67.68%   58.74%
  core.ts  62.02%   54.13%  ← primary refactor target
  undo.ts  87.01%   80.00%
  shields  97.46%   94.64%
  dlp.ts   94.82%   92.85%
  setup    93.67%   80.92%

This baseline will be used to verify coverage improves (or holds) after
each incremental refactor PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr1): extract src/audit/ and src/config/ from core.ts

Move audit helpers (redactSecrets, appendToLog, appendHookDebug,
appendLocalAudit, appendConfigAudit) to src/audit/index.ts and
move all config types, constants, and loading logic (Config,
SmartRule, DANGEROUS_WORDS, DEFAULT_CONFIG, getConfig, getCredentials,
getGlobalSettings, hasSlack, listCredentialProfiles) to
src/config/index.ts.

core.ts kept as barrel re-exporting from the new modules so all
existing importers (cli.ts, daemon/index.ts, tests) are unchanged.
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: remove trailing blank lines in core.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr2): extract src/policy/ and src/utils/regex from core.ts

Move the entire policy engine to src/policy/index.ts:
  evaluatePolicy, explainPolicy, shouldSnapshot, evaluateSmartConditions,
  checkDangerousSql, isIgnoredTool, matchesPattern and all private
  helpers (tokenize, getNestedValue, extractShellCommand, analyzeShellCommand).

Move ReDoS-safe regex utilities to src/utils/regex.ts:
  validateRegex, getCompiledRegex — no deps on config or policy,
  consumed by both policy/ and cli.ts via core.ts barrel.

core.ts is now ~300 lines (auth + daemon I/O only).
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: restore dev branch to push trigger

CI should run on direct pushes to dev (merge commits, dependency
bumps, etc.), not just on PRs. Flagged by two independent code
review passes on the coverage PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add timeouts to execSync and spawnSync to prevent CI hangs

- doctor command: add timeout:3000 to execSync('which node9') and
  execSync('git --version') — on slow CI machines these can block
  indefinitely and cause the 5000ms vitest test timeout to fire
- runDoctor test helper: add timeout:15000 to spawnSync so the subprocess
  has enough headroom on slow CI without hitting the vitest timeout
- removefrom test loop: increase spawnSync timeout 5000→15000 and add
  result.error assertion for better failure diagnostics on CI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract src/auth/ from core.ts

Split the authorization race engine out of core.ts into 4 focused modules:
- auth/state.ts  — pause, trust sessions, persistent decisions
- auth/daemon.ts — daemon PID check, entry registration, long-polling
- auth/cloud.ts  — SaaS handshake, poller, resolver, local-allow audit
- auth/orchestrator.ts — multi-channel race engine (authorizeHeadless)

core.ts is now a 40-line backwards-compat barrel. 509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — coverage thresholds + undici vuln

- vitest.config.mts: add coverage thresholds at current baseline (68%
  stmts, 58% branches, 66% funcs, 70% lines) so CI blocks regressions.
  Add json-summary reporter for CI integration. Exclude core.ts (barrel,
  no executable code) and ui/native.ts (OS UI, untestable in CI).
- package.json: pin undici to ^7.24.0 via overrides to resolve 6 high
  severity vulnerabilities in dev deps (@semantic-release, @actions).
  Remaining 7 vulns are in npm-bundled packages (not fixable without
  upgrading npm itself) and dev-only tooling (eslint, handlebars).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: enforce coverage thresholds in CI pipeline

Add coverage step to CI workflow that runs vitest --coverage on
ubuntu/Node 22 only (avoids matrix cost duplication). Thresholds
configured in vitest.config.mts will fail the build if coverage drops
below baseline, closing the gap flagged in code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: fix double test run — merge coverage into single test step

Replace the two-step (npm test + npm run test:coverage) pattern with a
single conditional: ubuntu/Node 22 runs test:coverage (enforces
thresholds), all other matrix cells run npm test. No behaviour change,
half the execution time on the primary matrix cell.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract proxy, negotiation, and duration from cli.ts

- src/proxy/index.ts — runProxy() MCP/JSON-RPC stdio interception
- src/policy/negotiation.ts — buildNegotiationMessage() AI block messages
- src/utils/duration.ts — parseDuration() human duration string parser
- cli.ts: 2088 → 1870 lines, now imports from focused modules

509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract shield, check, log commands into focused modules

Moves registerShieldCommand, registerConfigShowCommand, registerCheckCommand,
and registerLogCommand into src/cli/commands/. Extracts autoStartDaemonAndWait
and openBrowserLocal into src/cli/daemon-starter.ts.

cli.ts drops from ~1870 to ~1120 lines. Unused imports removed. Spawn
Windows regression test updated to cover the moved autoStartDaemonAndWait
call site in daemon-starter.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): use string comparison for matrix.node and document coverage intent

GHA expression matrix values are strings; matrix.node == 22 (integer) silently
fails, so coverage never ran on any cell. Fixed to matrix.node == '22'.

Added comments to ci.yml explaining the intentional single-cell threshold
enforcement (branch protection must require the ubuntu/Node 22 job), and
to vitest.config.mts explaining the baseline date and target trajectory.

Also confirmed: npm ls undici shows 7.24.6 everywhere — no conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): replace fragile matrix ternary with dedicated coverage job

Removes the matrix.node == '22' ternary from the test matrix. Coverage now
runs in a standalone 'coverage' job (ubuntu/Node 22 only) that can be
required by name in branch protection — no risk of the job name drifting
or the selector silently failing.

Also adds a comment to tsup.config.ts documenting why devDependency coverage
tooling (@vitest/coverage-v8, @rolldown/*) cannot leak into the production
bundle (tree-shaking — nothing in src/ imports them).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step and NODE9_TESTING=1 to coverage job; bump to v1.2.0

Coverage job was missing npm run build, causing integration tests to fail
with "dist/cli.js not found". Also adds NODE9_TESTING=1 env var to prevent
native popup dialogs and daemon auto-start during coverage runs in CI.

Version bumped to 1.2.0 to reflect the completed modular refactor
(core.ts + cli.ts split into focused single-responsibility modules).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add npm audit --omit=dev --audit-level=high to test job

Audits production deps on every CI run. Scoped to --omit=dev because
known CVEs in flatted (eslint chain) and handlebars (semantic-release chain)
are devDep-only and never ship in the production bundle. Production tree
currently shows 0 vulnerabilities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract doctor, audit, status, daemon, watch, undo commands

Moves 6 remaining large commands into src/cli/commands/:
  doctor.ts    — health check (165 lines, owns pass/fail/warn helpers)
  audit.ts     — audit log viewer with formatRelativeTime
  status.ts    — current mode/policy/pause display
  daemon-cmd.ts — daemon start/stop/openui/background/watch
  watch.ts     — watch mode subprocess runner
  undo.ts      — snapshot diff + revert UI

cli.ts: 1,141 → 582 lines. Unused imports (execSync, spawnSync, undo funcs,
getCredentials, DAEMON_PORT/HOST) removed. spawn-windows regression test
updated to cover the new module locations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(daemon): split 1026-line index.ts into state, server, barrel

daemon/state.ts  (360 lines) — all shared mutable state, types, utility
  functions, SSE broadcast, Flight Recorder Unix socket, and the
  abandonPending / hadBrowserClient / abandonTimer accessors needed to
  avoid direct ES module let-export mutation across file boundaries.

daemon/server.ts (668 lines) — startDaemon() HTTP server and all route
  handlers (/check, /wait, /decision, /events, /settings, /shields, etc.).
  Imports everything it needs from state.ts; no circular dependencies.

daemon/index.ts  (58 lines) — thin barrel: re-exports public API
  (startDaemon, stopDaemon, daemonStatus, DAEMON_PORT, DAEMON_HOST,
  DAEMON_PID_FILE, DECISIONS_FILE, AUDIT_LOG_FILE, hasInteractiveClient).

Also fixes two startup console.log → console.error (stdout must stay
clean for MCP/JSON-RPC per CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): handle ENOTEMPTY in cleanupDir on Windows CI

Windows creates system junctions (AppData\Local\Microsoft\Windows)
inside any directory set as USERPROFILE, making rmSync fail with
ENOTEMPTY even after recursive deletion. These junctions are harmless
to leak from a temp dir; treat them the same as EBUSY.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add NODE9_TESTING=1 to test job for consistency with coverage

Without it, spawned child processes in the test matrix could trigger
native popups or daemon auto-start. Matches the coverage job, which
already sets this env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): raise npm audit threshold to moderate

Node9 sits on the critical path of every agent tool call — a
moderate-severity prod vuln (e.g. regex DoS in a request parser)
is still exploitable in this context. 0 vulns at moderate level
confirmed before raising the bar.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add missing coverage for auth/state, timeout racer, and daemon unknown-ID

- auth-state.test.ts (new): 18 tests covering checkPause (all branches
  including expired file auto-delete and indefinite expiry), pauseNode9,
  resumeNode9, getActiveTrustSession (wildcard, prune, malformed JSON),
  writeTrustSession (create, replace, prune expired entries)
- core.test.ts: timeout racer test — approvalTimeoutMs:50 fires before any
  other channel, returns approved:false with blockedBy:'timeout'
- daemon.integration.test.ts: POST /decision with unknown UUID → 404
- vitest.config.mts: raise thresholds to match new baseline
  (statements 68→70, branches 58→60, functions 66→70, lines 70→71)

auth/state.ts coverage: 30% → 96% statements, 28% → 89% branches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin @vitest/coverage-v8 to exact version 4.1.2

RC transitive deps (@rolldown/binding-* at 1.0.0-rc.12) are pulled in
via coverage-v8. Pinning prevents silent drift to a newer RC that could
change instrumentation behaviour or introduce new RC-stage transitive deps.

Also verified: obug@2.1.1 is a legitimate MIT-licensed debug utility
from the @vitest/sxzz ecosystem — not a typosquat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add needs: [test] to coverage job

Prevents coverage from producing a misleading green check when the test
matrix fails. Coverage now only runs after all test jobs pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): raise Vitest timeouts for slow CI tests

- cloud denies: approvalTimeoutMs:3000 means the check process runs ~3s
  before the mock cloud responds; default 5s Vitest limit was too tight.
  Raised to 15s.
- doctor 'All checks passed': spawns a subprocess that runs `ss` for
  port detection — slow on CI runners. Raised to 20s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin vitest to 4.1.2 to match @vitest/coverage-v8

Both packages must stay in sync — a peer version mismatch causes silent
instrumentation failures. Pinning both to the same exact version prevents
drift when ^ would otherwise allow vitest to bump independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MCP gateway — transparent stdio proxy for any MCP server

Adds `node9 mcp-gateway --upstream <cmd>` which wraps any MCP server
as a transparent stdio proxy. Every tools/call is intercepted and run
through the full authorization engine (DLP, smart rules, shields,
human approval) before being forwarded to the upstream server.

Key implementation details:
- Deferred exit: authPending flag prevents process.exit() while auth
  is in flight, so blocked-tool responses are always flushed first
- Deferred stdin end: mirrors the same pattern for child.stdin.end()
  so approved messages are written before stdin is closed
- Approved writes happen inside the try block, before finally runs
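
A minimal sketch of the deferred-exit idea described above, not the gateway's actual code. `createDeferredExit` and its method names are hypothetical, and a counter stands in for the single authPending flag so that overlapping checks are also covered.

```typescript
// Hypothetical helper: holds back an exit request while authorization is in flight.
function createDeferredExit(exit: (code: number) => void) {
  let pending = 0; // number of authorization checks still running
  let requested: number | null = null; // exit code asked for, if any
  let exited = false;

  const maybeExit = (): void => {
    if (exited || pending > 0 || requested === null) return;
    exited = true;
    exit(requested);
  };

  return {
    // Call before starting an authorization check.
    begin(): void {
      pending += 1;
    },
    // Call from a finally block, after the response has been written.
    end(): void {
      pending -= 1;
      maybeExit();
    },
    // Call when stdin closes or the upstream exits.
    requestExit(code: number): void {
      if (requested === null) requested = code;
      maybeExit();
    },
  };
}
```

With this shape, a `requestExit(0)` that arrives between `begin()` and `end()` only takes effect once `end()` runs, so a blocked-tool response written inside the try block is flushed first.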

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address code review feedback

- Explicit ignoredTools in allowed-tool test (no implicit default dep)
- Assert result.status === 0 in all success-case tests (null = timeout)
- Throw result.error in runGateway helper so timeout-killed process fails
- Catch ENOTEMPTY in cleanupDir alongside EBUSY (Windows junctions)
- Document parseCommandString is shell-split only, not shell execution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): id validation, resilience tests, review fixes

- Validate JSON-RPC id is string|number|null; return -32600 for object/array ids
- Add resilience tests: invalid upstream JSON forwarded as-is, upstream crash
- Fix runGateway() to accept optional upstreamScript param
- Add status assertions to all blocked-tool tests
- Document parseCommandString safety in mcp-gateway source
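
The id check above could look roughly like this. It is a sketch under assumptions: `isValidJsonRpcId` and `checkRequestId` are illustrative names, and only the -32600 code and the string|number|null rule come from the commit. The null id in the error response follows the JSON-RPC 2.0 rule for requests whose id cannot be used.

```typescript
type JsonRpcId = string | number | null;

interface JsonRpcErrorResponse {
  jsonrpc: "2.0";
  id: null;
  error: { code: number; message: string };
}

// True only for the id types JSON-RPC 2.0 allows.
function isValidJsonRpcId(id: unknown): id is JsonRpcId {
  return id === null || typeof id === "string" || typeof id === "number";
}

// Returns an error response for an unusable id, or null when the message may proceed.
function checkRequestId(message: { id?: unknown }): JsonRpcErrorResponse | null {
  if (!("id" in message)) return null; // notification: no id, no response expected
  if (isValidJsonRpcId(message.id)) return null;
  return {
    jsonrpc: "2.0",
    id: null,
    error: { code: -32600, message: "Invalid Request" },
  };
}
```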

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): fix shell tokenizer to handle quoted paths with spaces

Replace execa's parseCommandString (which did not handle shell quoting)
with a custom tokenizer that strips double-quotes and respects backslash
escapes. Adds review-driven test improvements: mock upstream silently
drops notifications, runGateway guards killed-by-signal status, shadowed
variable renamed, DLP test builds credential at runtime, upstream path
with spaces test.
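
For illustration, a tokenizer with the two behaviours named above (double-quotes stripped, backslash escapes respected) might be sketched as follows. `tokenizeCommand` is a hypothetical name and this is not the project's implementation; single quotes and variable expansion are deliberately out of scope.

```typescript
// Splits a command string on unquoted whitespace.
// Double quotes group words and are stripped; a backslash keeps the next character literally.
function tokenizeCommand(input: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inQuotes = false;
  let hasToken = false; // distinguishes an empty quoted token ("") from no token

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === "\\" && i + 1 < input.length) {
      current += input.charAt(i + 1);
      i += 1;
      hasToken = true;
    } else if (ch === '"') {
      inQuotes = !inQuotes;
      hasToken = true;
    } else if (!inQuotes && /\s/.test(ch)) {
      if (hasToken) {
        tokens.push(current);
        current = "";
        hasToken = false;
      }
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (hasToken) tokens.push(current);
  return tokens;
}
```

The result is only an argv array; nothing is handed to a shell, so metacharacters such as `;` or `|` stay ordinary characters in a token.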

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #4 — hermetic env, null-status guard, new tests

- Filter all NODE9_* env vars in runGateway so local developer config
  (NODE9_MODE, NODE9_API_KEY, NODE9_PAUSED) cannot pollute test isolation
- Fix status===null guard: no longer requires result.signal to also be set;
  a killed/timed-out process always throws, preventing silent false passes
- Extract parseResponses() helper to eliminate repeated JSON.parse cast pattern
- Reduce default runGateway timeout from 8s → 5s for fast-path tests
- Add test: malformed (non-JSON) input is forwarded to upstream unchanged
- Add test: tools/call notification (no id) is forwarded without gateway response
- Add comments: mock upstream uses unbuffered stdout.write; /tmp/test.txt path
  is safe because mock never reads disk; NODE9_TESTING only disables UI approvers
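
A rough sketch of the two runGateway guards above, written as standalone helpers. `hermeticEnv`, `assertCompleted`, and the `SpawnLikeResult` shape are assumptions for illustration; the shape only mirrors the fields of a synchronous spawn result that the commit mentions.

```typescript
interface SpawnLikeResult {
  status: number | null; // null when the process was killed or timed out
  stdout: string;
  stderr: string;
  error?: Error;
}

// Drops every NODE9_* variable, then applies explicit overrides such as NODE9_TESTING.
function hermeticEnv(
  env: Record<string, string | undefined>,
  overrides: Record<string, string> = {},
): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    if (value !== undefined && !key.startsWith("NODE9_")) clean[key] = value;
  }
  return { ...clean, ...overrides };
}

// Throws whenever the process did not exit on its own, regardless of signal.
function assertCompleted(result: SpawnLikeResult): SpawnLikeResult {
  if (result.error) throw result.error;
  if (result.status === null) {
    throw new Error(
      `process was killed or timed out\nstdout: ${result.stdout}\nstderr: ${result.stderr}`,
    );
  }
  return result;
}
```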

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #5 — isolation, typing, timeouts, README note

- afterAll: log cleanup failures to stderr instead of silently swallowing them
- runGateway: document PATH is safe (all spawns use absolute paths); expand
  NODE9_TESTING comment to reference exact source location of what it suppresses
- Replace /tmp/test.txt with /nonexistent/node9-test-only so intent is unambiguous
- Tighten blocked-tool test timeout: 5000ms → 2000ms (approvalTimeoutMs=100ms,
  so a hung auth engine now surfaces as a clear failure rather than a late pass)
- GatewayResponse.result: add explicit tools/ok fields so Array.isArray assertion
  has accurate static type information
- README: add note clarifying --upstream takes a single command string (tokenizer
  splits it); explain double-quoted paths for paths with spaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #6 — diagnostics, type safety, error handling

- Timeout error now includes partial stdout/stderr so hung gateway failures
  are diagnosable instead of silently discarding the output buffer
- Mock upstream catch block writes to stderr instead of empty catch {} so
  JSON-RPC parse errors surface in test output rather than causing a hang
- parseResponses wraps JSON.parse in try/catch and rethrows with the
  offending line, replacing cryptic map-thrown errors with useful context
- GatewayResponse.result: replace redundant Record<string,unknown> & {..…
node9ai added a commit that referenced this pull request Mar 28, 2026
* fix: address code review — Slack regex bound, remove redundant parser, notMatchesGlob consistency, applyUndo empty-set guard

- dlp: cap Slack token regex at {1,100} to prevent unbounded scan on crafted input
- core: remove 40-line manual paren/bracket parser from validateRegex — redundant
  with the final new RegExp() compile check, which catches the same errors more cleanly
- core: fix notMatchesGlob — absent field returns true (vacuously not matching),
  consistent with notContains; missing cond.value still fails closed
- undo: guard applyUndo against ls-tree failure returning empty set, which would
  cause every file in the working tree to be deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — compile-before-saferegex, Slack lower bound, ls-tree guard logging, missing tests

- core: move new RegExp() compile check BEFORE safe-regex2 so structurally invalid
  patterns (unbalanced parens/brackets) are rejected before reaching NFA analysis
- dlp: tighten Slack token lower bound from {1,100} to {20,100} to reduce false
  positives on short or truncated token-like strings
- undo: add NODE9_DEBUG log before early return in applyUndo ls-tree guard for
  observability into silent failures
- test(core): add 'structurally malformed patterns still rejected' regression test
  confirming compile-check order after manual parser removal
- test(core): add notMatchesGlob absent-field test with security comment documenting
  the vacuous-true behaviour and how to guard against it
- test(undo): add applyUndo ls-tree non-zero exit test confirming no files deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): swap spawnResult args order — stdout first, status second

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in undo.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: revert notMatchesGlob to fail-closed, warn on ls-tree failure, document empty-stdout gap

- core: revert notMatchesGlob absent-field to fail-closed (false) — an attacker
  omitting a field must not satisfy a notMatchesGlob allow rule; rule authors
  needing pass-when-absent must pair with an explicit 'notExists' condition
- undo: log ls-tree failure unconditionally to stderr (not just NODE9_DEBUG) since
  this is an unexpected git error, not normal flow — silent false is undebuggable
- dlp: add comment on Slack token bound rationale (real tokens ~50–80 chars)
- test(core): fix notMatchesGlob fail-closed test — use delete_file (dangerous word)
  so the allow rule actually matters; write was allowed by default regardless
- test(undo): add test documenting the known gap where ls-tree exits 0 with empty
  stdout still produces an empty snapshotFiles set
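
The fail-closed behaviour of notMatchesGlob can be sketched like this. The glob translation is a deliberately simplified stand-in (only `*` and `?` are supported, and `*` also crosses `/`), and the function signature is assumed rather than taken from core.

```typescript
// Simplified glob-to-RegExp conversion for illustration only.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*")
    .replace(/\?/g, ".");
  return new RegExp(`^${escaped}$`);
}

// Returns true only when a present string field fails to match the glob.
function notMatchesGlob(fieldValue: unknown, pattern: string | undefined): boolean {
  if (typeof pattern !== "string" || pattern === "") return false; // missing cond.value: fail closed
  if (typeof fieldValue !== "string") return false; // absent field: fail closed
  return !globToRegExp(pattern).test(fieldValue);
}
```

Under this sketch an attacker who omits the field gets `false`, so an allow rule built on notMatchesGlob does not fire.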

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(undo): guard against ls-tree status-0 empty-stdout mass-delete footgun

Add snapshotFiles.size === 0 check after the non-zero exit guard. When ls-tree
exits 0 but produces no output, snapshotFiles would be empty and every tracked
file in the working tree would be deleted. Abort and warn unconditionally instead.

Also convert the 'known gap' documentation test into a real regression test that
asserts false return and no unlinkSync calls.
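
The two guards can be illustrated with a pure helper that decides which files an undo would delete. `planUndoDeletes` and its parameters are hypothetical; the real applyUndo returns false on abort, where this sketch returns null.

```typescript
interface LsTreeResult {
  status: number | null;
  stdout: string; // one path per line
}

// Returns the files created after the snapshot, or null when it is unsafe to delete anything.
function planUndoDeletes(
  lsTree: LsTreeResult,
  workingTreeFiles: string[],
  warn: (message: string) => void,
): string[] | null {
  if (lsTree.status !== 0) {
    warn("ls-tree failed; aborting undo");
    return null;
  }
  const snapshotFiles = new Set(lsTree.stdout.split("\n").filter((line) => line.length > 0));
  if (snapshotFiles.size === 0) {
    // An empty set would classify every working-tree file as "created after the snapshot".
    warn("ls-tree returned no files; aborting undo");
    return null;
  }
  return workingTreeFiles.filter((file) => !snapshotFiles.has(file));
}
```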

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(undo): assert stderr warning in ls-tree failure tests

Add vi.spyOn(process.stderr, 'write') assertions to both new applyUndo tests
to verify the observability messages are actually emitted on failure paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: banner to stderr for MCP stdio compat; log command cwd handling and error visibility

Two bugs from issue #33:

1. runProxy banner went to stdout via console.log, corrupting the JSON-RPC stream
   for stdio-based MCP servers. Fixed: console.error so stdout stays clean.

2. 'node9 log' PostToolUse hook was silently swallowing all errors (catch {})
   and not changing to payload.cwd before getConfig() — unlike the 'check'
   command which does both. If getConfig() loaded the wrong project config,
   shouldSnapshot() could throw on a missing snapshot policy key, silently
   killing the audit.log write with no diagnostic output.
   Fixed: add cwd + _resetConfigCache() mirroring 'check'; surface errors to
   hook-debug.log when enableHookLogDebug is set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate process.chdir race condition in hook commands

Pass payload.cwd directly to getConfig(cwd?) instead of calling
process.chdir() which mutates process-global state and would race
with concurrent hook invocations.

- getConfig() gains optional cwd param: bypasses cache read/write
  when an explicit project dir is provided, so per-project config
  lookups don't pollute the ambient interactive-CLI cache
- check and log commands: remove process.chdir + _resetConfigCache
  blocks; pass payload.cwd directly to getConfig()
- log command catch block: remove getConfig() re-call (could re-throw
  if getConfig() was the original error source); use NODE9_DEBUG only
- Remove now-unused _resetConfigCache import from cli.ts
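
A small sketch of the cache-bypass rule, assuming a loader that takes a directory. `createGetConfig`, `load`, and `ambientDir` are illustrative names, not the signatures in core.ts.

```typescript
// Builds a getConfig(cwd?) whose explicit-cwd lookups never read or write the ambient cache.
function createGetConfig<T>(
  load: (dir: string) => T,
  ambientDir: () => string,
): (cwd?: string) => T {
  let cached: { value: T } | null = null;

  return (cwd?: string): T => {
    // Explicit project dir (non-empty): load fresh and leave the cache untouched.
    if (cwd) return load(cwd);
    const entry = cached ?? { value: load(ambientDir()) };
    cached = entry;
    return entry.value;
  };
}
```

Because nothing here touches process-global state, two hook invocations with different cwd values cannot affect each other's lookup.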

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always write LOG_ERROR to hook-debug.log; clarify ReDoS test intent

- log catch block: remove NODE9_DEBUG guard — this catch guards the
  audit trail so errors must always be written to hook-debug.log,
  not only when NODE9_DEBUG=1
- validateRegex test: rename and expand the safe-regex2 NFA test to
  explicitly assert that (a+)+ compiles successfully (passes the
  compile-first step) yet is still rejected by safe-regex2, confirming
  the reorder did not break ReDoS protection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(mcp): integration tests for #33 regression coverage

Add mcp.integration.test.ts with 4 tests covering both bugs from #33:

1. Proxy stdout cleanliness (2 tests):
   - banner goes to stderr; stdout contains only child process output
   - stdout stays valid JSON when child writes JSON-RPC — banner does not corrupt stream

2. Log command cross-cwd audit write (2 tests):
   - writes to audit.log when payload.cwd differs from process.cwd() (the actual #33 bug)
   - writes to audit.log when no cwd in payload (backward compat)

These tests would have caught both regressions at PR time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — cwd guard, test assertions, exit-0 comment

- getConfig(payload.cwd || undefined): use || instead of ?? to also
  guard against empty string "" which path.join would silently treat
  as relative-to-cwd (same behaviour as the fallback, but explicit)
- log catch block: add comment documenting the intentional exit(0)-on-
  audit-failure tradeoff — non-zero would incorrectly signal tool failure
  to Claude/Gemini since the tool already executed
- mcp.integration.test.ts: assert result.error and result.status on
  every spawnSync call so spawn failures surface loudly instead of
  silently matching stdout === '' checks
- mcp.integration.test.ts: add expect(result.stdout.trim()).toBeTruthy()
  before JSON.parse for clearer diagnostic on stdout-empty failures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add CLAUDE.md rules and pre-commit enforcement hook

CLAUDE.md: documents PR checklist, test rules, and code rules that
Claude Code reads automatically at the start of every session:
- PR checklist (tests, typecheck, format, no console.log in hooks)
- Integration test requirements for subprocess/stdio/filesystem code
- Architecture notes (getConfig(cwd?), audit trail, DLP, fail-closed)

.git/hooks/pre-commit: enforces the checklist on every commit:
- Blocks console.log in src/cli, src/core, src/daemon
- Runs npm run typecheck
- Runs npm run format:check
- Runs npm test when src/ implementation files are changed
- Emergency bypass: git commit --no-verify

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — test isolation, stderr on audit gap, nonexistent cwd

- mcp.integration.test.ts: replace module-scoped tempDirs with per-describe
  beforeEach/afterEach and try/finally — eliminates shared-array interleave
  risk if tests ever run with parallelism
- mcp.integration.test.ts: add test for nonexistent payload.cwd — verifies
  getConfig falls back to global config gracefully instead of throwing
- cli.ts log catch: emit [Node9] audit log error to stderr so audit gaps
  surface in the tool output stream without requiring hook-debug.log checks
- core.ts getConfig: add comment documenting intentional nonexistent-cwd
  fallback behavior (tryLoadConfig returns null → global config used)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: audit write before config load; validate cwd; test corrupt-config gap

Two blocking issues from review:

1. getConfig() was called BEFORE appendFileSync — a config load failure
   (corrupt JSON, permissions error) would throw and skip the audit write,
   reintroducing the original silent audit gap. Fixed by moving the audit
   write unconditionally before the config load.

2. payload.cwd was passed to getConfig() unsanitized — a crafted hook
   payload with a relative or traversal path could influence which
   node9.config.json gets loaded. Fixed with path.isAbsolute() guard;
   non-absolute cwd falls back to ambient process.cwd().
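
Both fixes can be sketched together with injected dependencies. `handleLogHook`, `trustedCwd`, and the `LogDeps` interface are hypothetical names; only the ordering (audit write first) and the path.isAbsolute() guard are taken from the commit.

```typescript
import { isAbsolute } from "node:path";

interface HookPayload {
  tool: string;
  cwd?: string;
}

interface LogDeps {
  appendAudit(line: string): void;
  getConfig(cwd?: string): unknown;
  reportError(err: unknown): void;
}

// Only an absolute path may select a project config; anything else means "use ambient cwd".
function trustedCwd(cwd: string | undefined): string | undefined {
  return cwd !== undefined && isAbsolute(cwd) ? cwd : undefined;
}

function handleLogHook(payload: HookPayload, deps: LogDeps): void {
  // 1. Audit write happens first so a config failure cannot create an audit gap.
  deps.appendAudit(JSON.stringify({ tool: payload.tool }));
  // 2. Config load comes second and is allowed to fail.
  try {
    deps.getConfig(trustedCwd(payload.cwd));
  } catch (err) {
    deps.reportError(err);
  }
}
```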

Also:
- Add integration test proving audit.log is written even when global
  config.json is corrupt JSON (regression test for the ordering fix)
- Add comment on echo tests noting Linux/macOS assumption

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): add audit-write ordering and path validation rules

* test: skip echo proxy tests on Windows; clarify exit-0 contract

- itUnix = it.skipIf(process.platform === 'win32') applied to both proxy
  echo tests — Windows echo is a shell builtin and cannot be spawned
  directly, so these tests would fail with a spawn error instead of
  skipping cleanly
- corrupt-config test: add comment documenting that exit(0) is the
  correct exit code even on config error — the log command always exits 0
  so Claude/Gemini do not treat an already-completed tool call as failed;
  the audit write precedes getConfig() so it succeeds regardless

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: CODEOWNERS for CLAUDE.md; parseAuditLog helper; getConfig unit tests

- .github/CODEOWNERS: require @node9-ai/maintainers review on CLAUDE.md
  and security-critical source files — prevents untrusted PRs from
  silently weakening AI instruction rules or security invariants
- mcp.integration.test.ts: replace inline JSON.parse().map() with
  parseAuditLog() helper that throws a descriptive error when a log line
  is not valid JSON (e.g. a debug line or partial write), instead of an
  opaque SyntaxError with no context
- mcp.integration.test.ts: itUnix declaration moved after imports for
  correct ordering
- core.test.ts: add getConfig unit tests verifying that a nonexistent
  explicit cwd does not throw (tryLoadConfig fallback), and that
  getConfig(cwd) does not pollute the ambient no-arg cache

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): add npm run lint to PR checklist and pre-commit hook

Adds ESLint step to CLAUDE.md checklist and .git/hooks/pre-commit so
require()-style imports and other lint errors are caught before push.
Also fixes the require('path')/require('os') inline calls in core.test.ts
that triggered @typescript-eslint/no-require-imports in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: emit shields-status on SSE connect — dashboard no longer stuck on Loading

The shields-status event was only broadcast on toggle (POST /shields/toggle).
A freshly connected dashboard never received the current shields state and
displayed "Loading…" indefinitely.

Fix: send shields-status in the GET /events initial payload alongside init
and decisions, using the same payload shape as the toggle handler.
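
As a sketch of the fix, the initial /events payload can be thought of as three SSE frames. The event names come from the commit; the payload field names (`requests`, `shields`, `active`) and the helper names are assumptions.

```typescript
interface ShieldState {
  name: string;
  active: boolean;
}

// Formats one Server-Sent Events frame: an event name, a JSON data line, and a blank line.
function sseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Everything a freshly connected dashboard needs, including the current shields state.
function initialEventStream(
  pendingRequests: unknown[],
  decisions: unknown[],
  shields: ShieldState[],
): string {
  return (
    sseFrame("init", { requests: pendingRequests }) +
    sseFrame("decisions", decisions) +
    sseFrame("shields-status", { shields })
  );
}
```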

Regression test: daemon.integration.test.ts starts a real daemon with an
isolated HOME, connects to /events, and asserts shields-status is present
with the correct active state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — shared SSE snapshot, ctx.skip() for visible skips

- Capture SSE stream once in beforeAll and share across all three tests
  instead of opening 3 separate 1.5s connections (~4.5s → ~1.5s wall time)
- Replace early return with ctx.skip() so port-conflict skips are visible
  in the Vitest report rather than silently passing
- Add comment explaining why it.skipIf cannot be used here (condition
  depends on async beforeAll, evaluated after test collection)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — bump SSE timeout, guard payload undefined, structural shield check

- Bump readSseStream timeout 1500ms → 3000ms for slow CI headroom
- Assert payload defined before accessing .shields — gives a clear failure
  message if shields-status is absent rather than a TypeError on .shields
- Replace hardcoded postgres check with structural loop over all shields
  so the test survives adding or renaming shields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: log last waitForDaemon error to stderr for CI diagnostics

Silent catch{} meant a crashed daemon (e.g. EACCES on port) produced only
"did not start within 6s" with no hint of the root cause. Now the last
caught error is written to stderr so CI logs show the actual failure reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass flags through to wrapped command — prevent Commander from consuming -y, --config etc.

Commander parsed flags like -y and --config as node9 options and errored
with "unknown option" before the proxy action handler ran. This broke all
MCP server configurations that pass flags to the wrapped binary (npx -y,
binaries with --nexus-url, etc.).

Fix: before program.parse(), detect proxy mode (first arg is not a known
node9 subcommand and doesn't start with '-') and inject '--' into process.argv.
This causes Commander to stop option-parsing and pass everything — including
flags — through to the variadic [command...] action handler intact.
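
The proxy-mode detection might be sketched as a pure function over argv. `injectProxySeparator` is a hypothetical name and the subcommand set is passed in rather than read from Commander.

```typescript
// Inserts "--" before the wrapped command when argv[2] is neither a flag nor a known subcommand.
function injectProxySeparator(argv: string[], subcommands: ReadonlySet<string>): string[] {
  const first = argv[2];
  if (first === undefined || first.startsWith("-") || subcommands.has(first)) {
    return argv; // normal CLI invocation, or the user already passed "--"
  }
  return [...argv.slice(0, 2), "--", ...argv.slice(2)];
}
```

For example, `node9 npx -y some-server` becomes `node9 -- npx -y some-server`, so `-y` reaches npx instead of being rejected as an unknown node9 option.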

The user-visible '--' workaround still works and is now redundant but harmless.

Regression tests: two new itUnix cases in mcp.integration.test.ts verify
that -n is not consumed as a node9 flag, and that --version reaches the
wrapped command rather than printing node9's own version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: derive proxy subcommand set from program.commands; harden test assertions

- Replace hand-maintained KNOWN_SUBCOMMANDS allowlist with a set derived
  from program.commands.map(c => c.name()) — stays in sync automatically
  when new subcommands are added, eliminating the latent sync bug
- Remove fragile echo stdout assertion in flag pass-through test — echo -n
  and echo --version behaviour varies across platforms (GNU vs macOS);
  the regression being tested is node9's parser, not echo's output
- Add try/finally in daemon.integration.test.ts beforeAll so tmpHome is
  always cleaned up even if daemon startup throws

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard against double '--' injection; strengthen --version test assertion

- Skip '--' injection if process.argv[2] is already '--' to avoid
  producing ['--', '--', ...] when user explicitly passes the separator
- Add toBeTruthy() assertion on stdout in --version test so the check
  fails if echo exits non-zero with empty output rather than silently passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — alias gap comment, res error-after-destroy guard, echo comment

- cli.ts: document alias gap (no aliases currently, but note how to extend)
- daemon.integration.test.ts: settled flag prevents res 'error' firing reject
  after Promise already resolved via req.destroy() timeout path
- mcp.integration.test.ts: fix comment — /bin/echo handles --version, not GNU echo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent daemon crash on unhandled rejection — node9 tail disconnect with 2 agents

Two concurrent Claude instances fire overlapping hook calls. Any unhandled
rejection in the async request handler crashes the daemon (Node 15+ default),
which closes all SSE connections and exits node9 tail with "Daemon disconnected".

- Add process.on('unhandledRejection') so a single bad request never kills the daemon
- Wrap GET /settings and GET /slack-status getGlobalSettings() calls in try/catch
  (were the only routes missing error guards in the async handler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — return in catch blocks, log errors, guard unhandledRejection registration

- GET /settings and /slack-status catch blocks now return after writeHead(500)
  to prevent fall-through to subsequent route handlers (write-after-end risk)
- Log the actual error to stderr in both catch blocks — silent swallow is
  dangerous in a security daemon
- Guard unhandledRejection registration with listenerCount === 0 to prevent
  double-registration if startDaemon() is called more than once (tests/restarts)
- Move handler registration before server.listen() for clearer startup ordering

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): require manual diff review before every commit

Automated checks (lint, typecheck, tests) don't catch logical correctness
issues like missing return after res.end(), silent catch blocks, or
double event-listener registration. Explicitly require git diff review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — 500 responses, module-level rejection flag, override cli.ts exit handler

- Separate res.writeHead(500) and res.end() calls (non-idiomatic chaining)
- Add Content-Type: application/json and JSON body to 500 responses
- Replace listenerCount guard with module-level boolean flag (race-safe)
- Call process.removeAllListeners('unhandledRejection') before registering
  daemon handler — cli.ts registers a handler that calls process.exit(1),
  which was the actual crash source; this overrides it for the daemon process
- Document that critical approval path (POST /check) has its own try/catch
  and is not relying on this safety net

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove removeAllListeners — use isDaemon guard in cli.ts handler instead

removeAllListeners('unhandledRejection') was a blunt instrument that could
strip handlers registered by third-party deps. The correct fix:
- cli.ts handler now returns early (no-op) when process.argv[2] === 'daemon',
  leaving the rejection to the daemon's own keep-alive handler
- daemon/index.ts no longer needs removeAllListeners
- daemon handler now logs stack trace so systematic failures are visible

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify unhandledRejection listener interaction — both handlers fire independently

The previous comment implied listener-chain semantics (one handler deferring
to the next). Node.js fires all registered listeners independently. The
isDaemon no-op return in cli.ts is what prevents process.exit(1), not any
chain mechanism. Clarify this so future maintainers don't break it by
restructuring the handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate unhandledRejection ordering dependency — skip cli.ts handler for daemon mode

Instead of relying on listener registration order (fragile), skip registering
the cli.ts exit-on-rejection handler entirely when process.argv[2] === 'daemon'.
The daemon's own keep-alive handler in startDaemon() is then the only handler
in the process — no ordering dependency, no removeAllListeners needed.
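
A sketch of the resulting arrangement, with the process object replaced by a minimal injected interface so the logic is testable. All names here are hypothetical; the real handlers are registered on `process` in cli.ts and startDaemon().

```typescript
interface RejectionEmitter {
  on(event: "unhandledRejection", listener: (reason: unknown) => void): unknown;
}

// cli.ts side: the exit-on-rejection handler is never registered in daemon mode.
function registerCliExitHandler(
  emitter: RejectionEmitter,
  argv: string[],
  exit: (code: number) => void,
): boolean {
  if (argv[2] === "daemon") return false;
  emitter.on("unhandledRejection", () => exit(1));
  return true;
}

// Daemon side: a module-level flag prevents double registration across repeated starts.
let daemonHandlerRegistered = false;

function registerDaemonKeepAlive(emitter: RejectionEmitter, log: (line: string) => void): void {
  if (daemonHandlerRegistered) return;
  daemonHandlerRegistered = true;
  emitter.on("unhandledRejection", (reason) => {
    // Log with stack trace and keep running; one bad request must not kill the daemon.
    log(reason instanceof Error ? (reason.stack ?? reason.message) : String(reason));
  });
}
```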

Also update stale comment in daemon/index.ts that still described the old
"we must replace the cli.ts handler" approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: address review comments — argv load-time note, hung-connection limit, stack trace caveat

- cli.ts: note that process.argv[2] check fires at module load time intentionally
- daemon/index.ts: document hung-connection limitation of last-resort rejection handler
- daemon/index.ts: note stack trace may include user input fragments (acceptable
  for localhost-only stderr logging)
- daemon/index.ts: clarify jest.resetModules() behavior with the module-level flag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Safe by Default — advisory SQL rules block destructive ops without config

Adds review-drop-table-sql, review-truncate-sql, and review-drop-column-sql
to ADVISORY_SMART_RULES so DROP TABLE, TRUNCATE TABLE, and DROP COLUMN in
the `sql` field are gated by human approval out-of-the-box, with no shield
or config required. The postgres shield correctly upgrades these from review
→ block since shield rules are inserted before advisory rules in getConfig().

Includes 7 new tests: 4 verifying advisory review fires with no config, 3
verifying the postgres shield overrides to block.
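
The precedence described above (shield rules evaluated before advisory rules, first match wins) can be sketched as follows. The rule names are from the commit; the regular expressions, the `SmartRule` shape, and `evaluateSql` are assumptions for illustration.

```typescript
type Verdict = "allow" | "review" | "block";

interface SmartRule {
  name: string;
  pattern: RegExp;
  verdict: Verdict;
}

// Assumed patterns; case-insensitive so "drop TABLE" and "DROP table" both match.
const ADVISORY_SQL_RULES: SmartRule[] = [
  { name: "review-drop-table-sql", pattern: /\bDROP\s+TABLE\b/i, verdict: "review" },
  { name: "review-truncate-sql", pattern: /\bTRUNCATE\s+TABLE\b/i, verdict: "review" },
  { name: "review-drop-column-sql", pattern: /\bDROP\s+COLUMN\b/i, verdict: "review" },
];

// Shield rules come first in the list, so they win over advisory rules for the same SQL.
function evaluateSql(sql: string, shieldRules: SmartRule[]): { rule: string; verdict: Verdict } | null {
  for (const rule of [...shieldRules, ...ADVISORY_SQL_RULES]) {
    if (rule.pattern.test(sql)) return { rule: rule.name, verdict: rule.verdict };
  }
  return null;
}
```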

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: shield set/unset — per-rule verdict overrides + config show

- `node9 shield set <shield> <rule> <verdict>` — override any shield rule's
  verdict without touching config.json. Stored in shields.json under an
  `overrides` key, applied at runtime in getConfig(). Accepts full rule
  name, short name, or operation name (e.g. "drop-table" resolves to
  "shield:postgres:block-drop-table").

- `node9 shield unset <shield> <rule>` — remove an override, restoring
  the shield default.

- `node9 shield status` — now shows each rule's verdict individually,
  with override annotations ("← overridden (was: block)").

- `node9 config show` — new command: full effective runtime config
  including active shields with per-rule verdicts, built-in rules,
  advisory rules, and dangerous words.
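
A rough sketch of how the three accepted spellings of a rule could resolve to one full name. The `shield:<shield>:<short-name>` layout is inferred from the example above; the function signature is assumed.

```typescript
// Resolves a full name, short name, or operation suffix to the full rule name, or null.
function resolveShieldRule(shield: string, ruleNames: string[], input: string): string | null {
  const prefix = `shield:${shield}:`;
  const candidates = ruleNames.filter((name) => name.startsWith(prefix));

  // Exact matches on the full or short name take priority.
  const exact = candidates.find((name) => name === input || name.slice(prefix.length) === input);
  if (exact !== undefined) return exact;

  // Otherwise the first rule whose short name ends with "-<input>" wins.
  const bySuffix = candidates.find((name) => name.slice(prefix.length).endsWith(`-${input}`));
  return bySuffix ?? null;
}
```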

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — allow verdict guard, null assertion, test reliability

- shield set allow now requires --force to prevent silent rule silencing;
  exits 1 with a clear warning and the exact re-run command otherwise
- Remove getShield(name)! non-null assertion in error branch
- Fix mockReturnValue → mockReturnValueOnce to prevent test state leak
- Add missing tests: shield set allow guard (integration), unset no-op,
  mixed-case SQL matching (DROP table, drop TABLE, TRUNCATE table)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — shield override security hardening

- Add isShieldVerdict() type guard; replace manual triple-comparison in
  CLI set command and remove unsafe `verdict as ShieldVerdict` cast
- Add validateOverrides() to sanitize shields.json on read — tampered
  disk content with non-ShieldVerdict values is silently dropped before
  reaching the policy engine
- Fix clearShieldOverride() to be a true no-op (skip disk write) when
  the rule has no existing override
- Add comment to resolveShieldRule() documenting first-match behavior
  for operation-suffix lookup to warn against future naming conflicts
- Tests: fix no-op assertion (assert not written), add isShieldVerdict
  suite, add schema validation tests for tampered overrides, add
  authorizeHeadless test for shield-overridden allow verdict
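
The type guard and the read-time sanitizer might look roughly like this. The three verdict names are inferred from the commits above, and the overrides layout (shield name, then rule name, then verdict) is an assumption about shields.json.

```typescript
const SHIELD_VERDICTS = ["allow", "review", "block"] as const;
type ShieldVerdict = (typeof SHIELD_VERDICTS)[number];
type Overrides = Record<string, Record<string, ShieldVerdict>>;

function isShieldVerdict(value: unknown): value is ShieldVerdict {
  return typeof value === "string" && (SHIELD_VERDICTS as readonly string[]).includes(value);
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Keeps only well-formed overrides; anything else is dropped before the policy engine sees it.
function validateOverrides(
  raw: unknown,
  onDrop: (shield: string, rule: string) => void = () => {},
): Overrides {
  // Null-prototype objects so a "__proto__" key in tampered JSON is just data.
  const clean: Overrides = Object.create(null) as Overrides;
  if (!isPlainObject(raw)) return clean;
  for (const [shield, rules] of Object.entries(raw)) {
    if (!isPlainObject(rules)) continue;
    const kept: Record<string, ShieldVerdict> = Object.create(null) as Record<string, ShieldVerdict>;
    for (const [rule, verdict] of Object.entries(rules)) {
      if (isShieldVerdict(verdict)) kept[rule] = verdict;
      else onDrop(shield, rule);
    }
    if (Object.keys(kept).length > 0) clean[shield] = kept;
  }
  return clean;
}
```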

Note: issue #5 (shield status stdout vs stderr) cannot be fixed here —
the pre-commit hook enforces no new console.log in cli.ts to keep stdout
clean for the JSON-RPC/MCP hook code paths in the same file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — audit trail, tamper warning, trust boundary

- Export appendConfigAudit() from core.ts; call it from CLI when an allow
  override is written with --force so silenced rules appear in audit.log
- validateOverrides() now emits a stderr warning (with shield/rule detail)
  when an invalid verdict is dropped, making tampering visible to the user
- Add JSDoc to writeShieldOverride() documenting the trust boundary: it is
  a raw storage primitive with no allow guard; callers outside the CLI must
  validate rule names via resolveShieldRule() first; daemon does not expose
  this endpoint
- Tests: add stderr-warning test for tampered verdicts; add cache-
  invalidation test verifying _resetConfigCache() causes allow overrides
  to be re-read from disk (mock) on the next evaluatePolicy() call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: close remaining review gaps — first-match, allow-no-guard, TOCTOU

- Issue 5: add test proving resolveShieldRule first-match-wins behavior
  when two rules share an operation suffix; uses a temporary SHIELDS
  mutation (restored in finally) to simulate the ambiguous catalog case
- Issue 6: add explicit test documenting that writeShieldOverride accepts
  allow verdict without any guard — storage primitive contract, CLI is
  the gatekeeper
- Issue 8: add TOCTOU characterization test showing that concurrent
  writeShieldOverride calls with a stale read lose the first write; makes
  the known file-lock limitation explicit and regression-testable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: spawn daemon via process.execPath to fix ENOENT on Windows (#41)

spawn('node9', ...) fails on Windows because npm installs a .cmd shim,
not a bare executable. Node.js child_process.spawn without { shell: true }
cannot resolve .cmd/.ps1 wrappers.

Replace all three bare spawn('node9', ['daemon'], ...) call sites in
cli.ts with spawn(process.execPath, [process.argv[1], 'daemon'], ...),
consistent with the pattern already used in src/tui/tail.ts:
  - autoStartDaemonAndWait()
  - daemon --openui handler
  - daemon --background handler
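
As an illustration of the replacement pattern, here is a minimal sketch. The helper names `buildDaemonCommand` and `spawnDaemonDetached` are hypothetical, and the `detached`/`stdio` options are assumptions about how a background daemon would typically be started, not a copy of cli.ts.

```typescript
import { spawn } from "node:child_process";

// Build the command from the running Node binary and the current CLI script,
// so nothing depends on resolving a `node9` shim through PATH.
function buildDaemonCommand(
  execPath: string = process.execPath,
  script: string = process.argv[1],
): { command: string; args: string[] } {
  return { command: execPath, args: [script, "daemon"] };
}

// Hypothetical helper: start the daemon detached from the parent CLI process.
function spawnDaemonDetached(): void {
  const { command, args } = buildDaemonCommand();
  const child = spawn(command, args, { detached: true, stdio: "ignore" });
  child.unref(); // let the parent exit without waiting for the daemon
}
```

Because `process.execPath` is an absolute path to a real executable, the same call works on Windows without `{ shell: true }`.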

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(ci): regression guard + Windows CI for spawn fix (#41)

- Add spawn-windows.test.ts: two static source-guard tests that read
  cli.ts and assert (a) no bare spawn('node9'...) pattern exists and
  (b) exactly 3 spawn(process.execPath, ...) daemon call sites exist.
  Prevents the ENOENT regression from silently reappearing.
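
A static source guard of this kind can be sketched as a pure function over the file contents. The function name and both regular expressions below are assumptions; the real test may match the call sites differently.

```typescript
// Counts daemon spawn call sites in a source string.
function countSpawnPatterns(source: string): { bare: number; viaExecPath: number } {
  // spawn('node9' ...) or spawn("node9" ...): the pattern that breaks on Windows.
  const bare = source.match(/spawn\(\s*['"]node9['"]/g) ?? [];
  // spawn(process.execPath, ...): the portable replacement.
  const viaExecPath = source.match(/spawn\(\s*process\.execPath\s*,/g) ?? [];
  return { bare: bare.length, viaExecPath: viaExecPath.length };
}
```

In a test, the source would be read with `fs.readFileSync` and the result compared against `{ bare: 0, viaExecPath: 3 }`.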

- Add .github/workflows/ci.yml: runs typecheck, lint, and npm test on
  both ubuntu-latest and windows-latest on every push/PR to main and dev.
  The Windows runner will catch any spawn('node9'...) regression
  immediately since it would throw ENOENT in integration tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step before tests — integration tests require dist/cli.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): remove NODE_ENV=test prefix from npm scripts — Windows compat

'NODE_ENV=test cmd' syntax is Unix-only and fails on Windows with
'not recognized as an internal or external command'.

Vitest sets NODE_ENV=test automatically when the variable is not
already set (and also sets process.env.VITEST), making the prefix
redundant. Remove it from the test, test:watch, and test:ui scripts so
they work on all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): use cross-platform path assertions in undo.test.ts

Replace hardcoded Unix path separators with path.join() and regex
/[/\\]\.git[/\\]/ so assertions pass on Windows CI.
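
A short sketch of why the separator-agnostic regex helps. The regex is the one quoted above; the helper name `isInsideGitDir` and the sample paths are illustrative assumptions.

```typescript
import * as path from "node:path";

// Matches a ".git" path segment with either "/" or "\" as the separator.
const GIT_SEGMENT = /[/\\]\.git[/\\]/;

function isInsideGitDir(filePath: string): boolean {
  return GIT_SEGMENT.test(filePath);
}

// path.posix / path.win32 force a separator style regardless of the host OS.
const posixPath = path.posix.join("/repo", ".git", "index");
const win32Path = path.win32.join("C:\\repo", ".git", "index");
```

Both `posixPath` and `win32Path` satisfy the same assertion, while a path such as `/repo/.github/ci.yml` does not, because `.git` must be followed by a separator.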

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): cross-platform path and HOME fixes for Windows CI

setup.test.ts: replace hardcoded /mock/home/... constants with
path.join(os.homedir(), ...) so path comparisons match on Windows.
doctor.test.ts: set USERPROFILE=homeDir alongside HOME so
os.homedir() resolves the isolated test directory on Windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): Windows HOME/USERPROFILE and EBUSY fixes

mcp.integration.test.ts: add makeEnv() helper that sets both HOME
and USERPROFILE so spawned node9 processes resolve os.homedir() to
the isolated test directory on Windows. Add EBUSY guard in cleanupDir
for Windows temp file locking after spawnSync.

protect.test.ts: use path.join(os.homedir(), ...) for mock paths in
setPersistentDecision so existsSpy matches on Windows backslash paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): propagate HOME as USERPROFILE in check integration tests

runCheck/runCheckAsync now set USERPROFILE=HOME so spawned node9
processes resolve os.homedir() to the isolated test directory on
Windows. Apply the same fix to standalone spawnSync calls using
minimalEnv. Add EBUSY guard in cleanupHome for Windows temp locking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests,dlp): four Windows CI fixes

mcp.integration.test.ts: use list_directory instead of write_file for
the no-cwd backward-compat test — write_file triggers git add -A on
os.tmpdir(), which can index thousands of files on Windows and fail with ETIMEDOUT.

gemini_integration.test.ts: add path import; replace hardcoded
/mock/home/... paths with path.join(os.homedir(), ...) so existsSpy
matches on Windows backslash paths.

daemon.integration.test.ts: add USERPROFILE=tmpHome to daemon spawn
env so os.homedir() resolves to the isolated shields.json. Add EBUSY
guard in cleanupDir.

dlp.ts: broaden /etc/passwd|shadow|sudoers patterns to
^(?:[a-zA-Z]:)?\/etc\/... so they match Windows-normalized paths like
C:/etc/passwd in addition to Unix /etc/passwd.
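
To make the broadened pattern concrete, here is a sketch under stated assumptions: the optional drive-letter prefix comes from this commit, while the alternation tail, the `$` anchor, and the helper names are stand-ins for whatever dlp.ts actually uses.

```typescript
// Optional "C:"-style prefix, then /etc/<sensitive file>. The tail is assumed.
const SENSITIVE_ETC_FILE = /^(?:[a-zA-Z]:)?\/etc\/(?:passwd|shadow|sudoers)$/;

// Hypothetical normalizer: turns "C:\etc\passwd" into "C:/etc/passwd".
function normalizeSlashes(p: string): string {
  return p.replace(/\\/g, "/");
}

function isSensitiveEtcPath(p: string): boolean {
  return SENSITIVE_ETC_FILE.test(normalizeSlashes(p));
}
```

The leading `^` keeps the match anchored, so a path like `/home/user/etc/passwd` is still not flagged.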

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): address code review findings

ci.yml: add format:check step and Node 22 to matrix (package.json
declares >=18 — both LTS versions should be covered).

check/mcp/daemon integration tests: add makeEnv() helpers for
consistent HOME+USERPROFILE isolation; add console.warn on EBUSY
so temp dir leaks are visible rather than silent.
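
The two test helpers described here could be sketched as follows. The signatures and the warning text are assumptions; the commits only state that both HOME and USERPROFILE are set and that EBUSY is tolerated with a console.warn.

```typescript
import * as fs from "node:fs";

// Point both HOME (Unix) and USERPROFILE (Windows) at the isolated directory
// so os.homedir() in a spawned child resolves there on every platform.
function makeEnv(home: string, extra: Record<string, string> = {}): NodeJS.ProcessEnv {
  return { ...process.env, HOME: home, USERPROFILE: home, ...extra };
}

// Remove a temp dir; on Windows a just-exited child can still hold a lock.
function cleanupDir(dir: string): void {
  try {
    fs.rmSync(dir, { recursive: true, force: true });
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EBUSY") {
      console.warn(`cleanupDir: leaving ${dir} behind (EBUSY)`);
      return;
    }
    throw err; // anything else is a real failure
  }
}
```

A call such as `makeEnv(tmpHome, { NODE9_TESTING: "1" })` then produces the environment passed to `spawnSync`.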

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): enforce LF line endings so Prettier passes on Windows

Add endOfLine: lf to .prettierrc so Prettier always checks/writes LF
regardless of OS. Add .gitattributes with eol=lf so Git does not
convert line endings on Windows checkout. Without these, format:check
fails on every file on Windows CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): align makeEnv signatures and add dist verification

check.integration.test.ts: makeEnv now spreads process.env (same as
mcp and daemon helpers) so PATH, NODE_ENV=test (set by Vitest), and
other inherited vars reach spawned child processes. Standalone
spawnSync calls simplified to makeEnv(tmpHome, {NODE9_TESTING:'1'}).
Remove unused minimalEnv from shield describe block.

ci.yml: add Verify dist artifacts step after build to fail fast with
a clear message if dist/cli.js or dist/index.js are missing. Add
comment explaining NODE_ENV=test / NODE9_TESTING guard coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: interactive terminal approval via /dev/tty (SSE + [A]/[D])

Replaces the broken @inquirer/prompts stdin racer with a /dev/tty-based
approval prompt that works as a Claude Code PreToolUse subprocess:

- New src/ui/terminal-approval.ts: opens /dev/tty for raw keypress I/O,
  acquires CSRF token from daemon SSE, renders ANSI approval card, reads
  [A]/[D], posts decision via POST /decision/{id}. Handles abort (another
  racer won) with cursor/card cleanup and SIGTERM/exit guard.

- Daemon entry shared between browser (GET /wait) and terminal (POST /decision)
  racers: extract registerDaemonEntry() + waitForDaemonDecision() from the
  old askDaemon() so both racers operate on the same pending entry ID.

- POST /decision idempotency: first write wins; second call returns 409
  with the existing decision. Prevents race between browser and terminal
  racers from corrupting state.

- CSRF token emitted on every SSE connection (re-emit existing token, never
  regenerate). Terminal racer acquires it by opening /events and reading
  the first csrf event.

- approvalTimeoutSeconds user-facing config alias (converts to ms);
  raises default timeout from 30s to 120s. Daemon auto-deny timer and
  browser countdown now use the config value instead of a hardcoded constant.

- isTTYAvailable() probe: tries /dev/tty open(); disabled on Windows
  (native popup racer covers that path). NODE9_FORCE_TERMINAL_APPROVAL=1
  bypasses the probe for tmux/screen users.

- Integration tests: CSRF re-emit across two connections, POST /decision
  idempotency (both allow-first and deny-first cases).
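
The first-write-wins rule for POST /decision can be sketched in memory as below. The types and the `recordDecision` name are hypothetical, and the 200 status for the winning write is an assumption; only the 409-with-existing-decision behavior is stated above.

```typescript
type Decision = "allow" | "deny";

interface PendingEntry {
  id: string;
  decision?: Decision;
}

interface DecisionResponse {
  status: 200 | 409;
  decision: Decision;
}

// First write wins: a later write gets 409 plus the decision already recorded.
function recordDecision(entry: PendingEntry, decision: Decision): DecisionResponse {
  if (entry.decision !== undefined) {
    return { status: 409, decision: entry.decision };
  }
  entry.decision = decision;
  return { status: 200, decision };
}
```

With this shape, the browser and terminal racers can both post without coordinating; whichever arrives second simply learns the existing outcome.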

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Smart Router — node9 tail as interactive approval terminal

Implements a multi-phase Smart Router architecture so `node9 tail` can
serve as a full approval channel alongside the browser dashboard and
native popup.

Phase 1 — Daemon capability tracking (daemon/index.ts):
- SseClient interface tracks { res, capabilities[] } per SSE connection
- /events parses ?capabilities=input from URL; stored on each client
- broadcast() updated to use client.res.write()
- hasInteractiveClient() exported — true when any tail session is live
- broadcast('add') now fires when terminal approver is enabled and an
  interactive client is connected, not only when browser is enabled
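
Capability tracking along these lines might look like the sketch below. It assumes a comma-separated `capabilities` query parameter and passes the client list explicitly; the real `hasInteractiveClient()` presumably reads daemon state instead.

```typescript
interface SseClient<Res = unknown> {
  res: Res;
  capabilities: string[];
}

// Parses e.g. "/events?capabilities=input" into ["input"].
function parseCapabilities(requestUrl: string): string[] {
  // A base URL is required because request URLs are relative paths.
  const { searchParams } = new URL(requestUrl, "http://localhost");
  const raw = searchParams.get("capabilities") ?? "";
  return raw
    .split(",")
    .map((c) => c.trim())
    .filter((c) => c.length > 0);
}

// True when at least one connected client declared the "input" capability.
function hasInteractiveClient(clients: Iterable<SseClient>): boolean {
  for (const client of clients) {
    if (client.capabilities.includes("input")) return true;
  }
  return false;
}
```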

Phase 2 — Interactive approvals in tail (tui/tail.ts):
- Connects with ?capabilities=input so daemon identifies it as interactive
- Captures CSRF token from the 'csrf' SSE event
- Handles init.requests (approvals pending before tail connected)
- Handles add/remove SSE events; maintains an approval queue
- Shows one ANSI card at a time ([A] Allow / [D] Deny) using
  tty.ReadStream raw-mode keypress on fd 0
- POSTs decisions via /decision/{id} with source:'terminal'; 409 is non-error
- Cards clear themselves; next queued request shown automatically

Phase 3 — Racer 3 widened (core.ts):
- Racer 3 guard changed from approvers.browser to
  (approvers.browser || approvers.terminal) so tail participates in the
  race via the same waitForDaemonDecision mechanism as the browser
- Guidance printed to stderr when browser is off:
  "Run `node9 tail` in another terminal to approve."

Phase 4 — node9 watch command (cli.ts):
- New `watch <command> [args...]` starts daemon in NODE9_WATCH_MODE=1
  (no idle timeout), prints a tip about node9 tail, then runs the
  wrapped command via spawnSync

Decision source tracking (all layers):
- POST /decision now accepts optional source field ('browser'|'terminal')
- Daemon stores decisionSource on PendingEntry; GET /wait returns it
- waitForDaemonDecision returns { decision, source }
- Racer 3 label uses actual source instead of guessing from config:
  "User Decision (Terminal (node9 tail))" vs "User Decision (Browser Dashboard)"
- Browser UI sends source:'browser'; tail sends source:'terminal'

Tests:
- daemon.integration.test.ts: 3 new tests for source tracking round-trip
  (terminal, browser, and omitted source)
- spawn-windows.test.ts: updated count from 3 to 4 spawn call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: enable /dev/tty approval card in Claude terminal (hook path)

The check command was passing allowTerminalFallback=false to
authorizeHeadless, which disabled Racer 4 (/dev/tty) in the hook path.
This meant the approval card only appeared in the node9 tail terminal,
requiring the user to switch focus to respond.

Change both call sites (initial + daemon-retry) to true so Racer 4 runs
alongside Racer 3. The [A]/[D] card now appears in the Claude terminal
as well — the user can respond from either terminal, whichever has focus.
The 409 idempotency already handles the race correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent background authorizeHeadless from overwriting POST /decision

When POST /decision arrives before GET /wait connects, it sets
earlyDecision on the PendingEntry. The background authorizeHeadless
call (which runs concurrently) could then overwrite that decision in
its .then() handler — visible as the idempotency test getting
'allow' back instead of the posted 'deny'.

Guard: after noApprovalMechanism check, return early if earlyDecision
is already set. First write wins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route sendBlock terminal output to /dev/tty instead of stderr

Claude Code treats any stderr output from a PreToolUse hook as a hook
error and fails open — the tool proceeds even when the hook writes a
valid permissionDecision:deny JSON to stdout. This meant git push and
other blocked commands were silently allowed through.

Fix: replace all console.error calls in the block/deny path with
writes to /dev/tty, an out-of-band channel that bypasses Claude Code's
stderr pipe monitoring. /dev/tty failures are caught silently so CI
and non-interactive environments are unaffected.

Add a writeTty() helper in core.ts used for all status messages in
the hook execution path (cloud error, waiting-for-approval banners,
cloud result). Update two integration tests that previously asserted
block messages appeared on stderr — they now assert stderr is empty,
which is the regression guard for this bug.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: don't auto-resolve daemon entries in audit mode

In audit mode, background authorizeHeadless resolves immediately with
checkedBy:'audit'. The .then() handler was setting earlyDecision='allow'
before POST /decision could arrive from browser/tail, causing subsequent
POST /decision calls to get 409 and GET /wait to return 'allow' regardless
of what the user posted.

Audit mode means the hook auto-approves — it doesn't mean the daemon
dashboard should also auto-resolve. Leave the entry alive so browser/tail
can still interact with it (or the auto-deny timer fires).

Fixes source-tracking integration test failures on CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: close /dev/tty fd in finally block to prevent leak on write error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove Racer 4 (/dev/tty card in Claude terminal)

Racer 4 interrupted the AI's own terminal with an approval prompt,
which is wrong on multiple levels:
- The AI terminal belongs to the AI agent, not the human approver
- Different AI clients (Gemini CLI, Cursor, etc.) handle terminals
  differently — /dev/tty tricks are fragile across environments
- It created duplicate prompts when node9 tail was also running

Approval channels should all be out-of-band from the AI terminal:
  1. Cloud/SaaS (Slack, mission control)
  2. Native OS popup
  3. Browser dashboard
  4. node9 tail (dedicated approval terminal)

Remove: Racer 4 block in core.ts, allowTerminalFallback parameter
from authorizeHeadless/_authorizeHeadlessCore and all callers,
isTTYAvailable/askTerminalApproval imports, terminal-approval.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: make hook completely silent — remove all writeTty calls from core.ts

The hook must produce zero terminal output in the Claude terminal.
All writeTty status messages (shadow mode, cloud handshake failure,
waiting for approval, approved/denied via cloud) have been removed.
Also removed the now-unused chalk import and writeTty helper function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source allowlist, CSRF 403 tests, watch error handling

- daemon/index.ts: validate POST /decision source field against allowlist
  ('terminal' | 'browser' | 'native') — silently drop invalid values to
  prevent audit log injection
- daemon.integration.test.ts: add CSRF 403 test (missing token), CSRF 403
  test (wrong token), and invalid source value test — the three most
  important negative tests flagged by code review
- cli.ts: check result.error in node9 watch so ENOENT exits non-zero
  instead of silently exiting 0
- test helper: use fixed string 'echo register-label' instead of
  interpolated echo ${label} (shell injection hygiene in test code)
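
The source allowlist in the first bullet can be illustrated with a small sketch. The three values are the ones listed above; the `sanitizeSource` name and the choice to return `undefined` for dropped values are assumptions.

```typescript
const VALID_SOURCES = ["terminal", "browser", "native"] as const;
type DecisionSource = (typeof VALID_SOURCES)[number];

// Returns the source only if it is an allowlisted string; otherwise drops it.
function sanitizeSource(value: unknown): DecisionSource | undefined {
  return typeof value === "string" &&
    (VALID_SOURCES as readonly string[]).includes(value)
    ? (value as DecisionSource)
    : undefined;
}
```

Since only three fixed strings can ever reach the audit log, a crafted value containing newlines or fake log fields has no way in.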

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove stderr write from askNativePopup; drop sendDesktopNotification

- native.ts: process.stderr.write in askNativePopup caused Claude Code to
  treat the hook as an error and fail open — removed entirely
- core.ts: sendDesktopNotification called notify-send which routes through
  Firefox on Linux (D-Bus handler), causing spurious browser popups —
  removed the audit-mode notification call and unused import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass cwd through hook→daemon so project config controls browser open

Root cause: daemon called getConfig() without cwd, reading the global
config. If ~/.node9/node9.config.json doesn't exist, approvers default
to true — so browser:false in a project config was silently ignored,
causing the daemon to open Firefox on every pending approval.

Fix:
- cli.ts: pass cwd from hook payload into authorizeHeadless options
- core.ts: propagate cwd through _authorizeHeadlessCore → registerDaemonEntry
  → POST /check body; use getConfig(options.cwd) so project config is read
- daemon/index.ts: extract cwd from POST /check, call getConfig(cwd)
  for browserEnabled/terminalEnabled checks
- native.ts: remove process.stderr.write from askNativePopup (fail-open bug)
- core.ts: remove sendDesktopNotification (notify-send routes through Firefox
  on Linux via D-Bus, causing spurious browser notifications)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always broadcast 'add' when terminalEnabled — restore tail visibility

After the cwd fix, browserEnabled correctly became false when browser:false
is set in project config. But the broadcast condition gated on
hasInteractiveClient(), which returns false if tail isn't connected at the
exact moment the check arrives — silently dropping entries from tail.

Fix: broadcast whenever browserEnabled OR terminalEnabled, regardless of
client connection state. Tail sees pending entries via the SSE stream's
initial state when it connects, so timing of connection doesn't matter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: hybrid security model — local UI always wins the race

- Remove isRemoteLocked: terminal/browser/native racers always participate
  even when approvers.cloud is enabled; cloud is now audit-only unless it
  responds first (headless VM fallback)
- Add decisionSource field to AuthResult so resolveNode9SaaS can report
  which channel decided (native/terminal/browser) as decidedBy in the PATCH
- Fix resolveNode9SaaS: log errors to hook-debug.log instead of silent catch
- Fix tail [A]/[D] keypresses: switch from raw 'data' buffer to readline
  emitKeypressEvents + 'keypress' events — fixes unresponsive cards
- Fix tail card clear: SAVE/RESTORE cursor instead of fragile MOVE_UP(n)
- Add cancelActiveCard so 'remove' SSE event properly dismisses active card
- Fix daemon duplicate browser tab: browserOpened flag + NODE9_BROWSER_OPENED
  env so auto-started daemon and node9 tail don't both open a tab
- Fix slackDelegated: skip background authorizeHeadless to prevent duplicate
  cloud request that never resolves in Mission Control
- Add interactive field to SSE 'add' event so browser-only configs don't
  render a terminal card
- Add dev:tail script that parses JSON PID file correctly
- Add req.on('close') cleanup for abandoned long-poll entries
- Add regression tests for all three bugs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear CI env var in test to unblock native racer on GitHub Actions

Also make the poll fetch mock respond to AbortSignal so the cloud poll
racer exits cleanly when native wins, preventing test timeout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source injection tests, dev:tail safety, CI env guard

- Add explicit source boundary tests: null/number/object are all rejected
  by the VALID_SOURCES allowlist (implementation was already correct)
- Replace kill $(...) shell expansion in dev:tail with process.kill() inside
  Node.js, removing the $() command-substitution risk if the pid file were crafted
- Add afterEach safety net in core.test.ts to restore VITEST/CI/NODE_ENV
  in case the test crashes before the try/finally block restores them
- Increase slackDelegated timing wait from 200ms to 500ms for slower CI
- Fix section numbering gap: 10 → 11 was left after removing a test (now 10)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use early return in it.each — Vitest does not pass context to it.each callbacks

Context injection via { skip } works in plain it() but not in it.each(),
where the third argument is undefined. Switch to early return, which is
equivalent since the entire describe block skips when portWasFree is false.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): correct YAML indentation in ci.yml — job properties were siblings not children

name/runs-on/strategy/steps were indented 2 spaces (sibling to `test:`)
instead of 4 spaces (properties of the `test:` job). GitHub Actions was
ignoring the custom name template, so checks were reported without the
Node version suffix and the required branch-protection check
"CI / Test (ubuntu-latest, Node 20)" was stuck as "Expected" forever.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add 15s timeout to daemon beforeAll hooks — prevents CI timeout

waitForDaemon(6s) + readSseStream(3s) = 9s minimum; the default Vitest
hookTimeout of 10s is too tight on slow CI runners (Ubuntu, Windows).
All three daemon describe-block beforeAll hooks now declare an explicit
15_000ms timeout to give CI sufficient headroom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add daemonProc.kill() fallback in afterAll cleanup

If `node9 daemon stop` fails or times out, the spawned daemon process
would leak. Added daemonProc?.kill() as a defensive fallback after
spawnSync in all three daemon describe-block afterAll hooks.

The CSRF 403 tests (missing/wrong token) already exist at lines 574-598
and were flagged as absent only because the bot's diff was truncated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): address code review — vi.stubEnv, runtime shape checks, abandon-timer comment

- core.test.ts: replace manual env save/delete/restore with vi.stubEnv +
  vi.unstubAllEnvs() in afterEach. Eliminates the fragile try/finally and
  the risk of coercing undefined to the string "undefined". Adds a KEEP IN
  SYNC comment so future isTestEnv additions are caught immediately.

- daemon.integration.test.ts: replace unchecked `as { ... }` casts in
  idempotency tests with `unknown` + toMatchObject — gives a clear failure
  message if the response shape is wrong instead of silently passing.

- daemon.integration.test.ts: add comment explaining why idempotency tests
  do not need a /wait consumer — the abandon timer only fires when an SSE
  connection closes with pending items; no SSE client connects during
  these tests so entries are safe from eviction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): guard daemonProc.kill() with exitCode check — avoid spurious SIGTERM

Calling daemonProc.kill() unconditionally after a successful `daemon stop`
sends SIGTERM to an already-dead process, which can produce a spurious error
log on some platforms. Only kill if exitCode === null (process still running).
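
A minimal sketch of the guard, using a structural type so it can be exercised without spawning anything. The `Killable` interface and `killIfRunning` name are hypothetical; a real `ChildProcess` satisfies the same shape.

```typescript
// Minimal structural type; Node's ChildProcess has both members.
interface Killable {
  exitCode: number | null;
  kill(): boolean;
}

// Only signal the process if it has not exited yet (exitCode is still null).
function killIfRunning(proc: Killable | undefined): boolean {
  if (proc && proc.exitCode === null) {
    return proc.kill();
  }
  return false;
}
```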

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add @vitest/coverage-v8 — baseline coverage report (PR #0)

Installs @vitest/coverage-v8 and configures coverage in vitest.config.mts.
Adds `npm run test:coverage` script.

Baseline (instrumentable files only — cli.ts and daemon/index.ts are
subprocess-only and cannot be instrumented by v8):

  File     Stmts   Branches
  Overall  67.68%  58.74%
  core.ts  62.02%  54.13%  ← primary refactor target
  undo.ts  87.01%  80.00%
  shields  97.46%  94.64%
  dlp.ts   94.82%  92.85%
  setup    93.67%  80.92%

This baseline will be used to verify coverage improves (or holds) after
each incremental refactor PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr1): extract src/audit/ and src/config/ from core.ts

Move audit helpers (redactSecrets, appendToLog, appendHookDebug,
appendLocalAudit, appendConfigAudit) to src/audit/index.ts and
move all config types, constants, and loading logic (Config,
SmartRule, DANGEROUS_WORDS, DEFAULT_CONFIG, getConfig, getCredentials,
getGlobalSettings, hasSlack, listCredentialProfiles) to
src/config/index.ts.

core.ts kept as barrel re-exporting from the new modules so all
existing importers (cli.ts, daemon/index.ts, tests) are unchanged.
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: remove trailing blank lines in core.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr2): extract src/policy/ and src/utils/regex from core.ts

Move the entire policy engine to src/policy/index.ts:
  evaluatePolicy, explainPolicy, shouldSnapshot, evaluateSmartConditions,
  checkDangerousSql, isIgnoredTool, matchesPattern and all private
  helpers (tokenize, getNestedValue, extractShellCommand, analyzeShellCommand).

Move ReDoS-safe regex utilities to src/utils/regex.ts:
  validateRegex, getCompiledRegex — no deps on config or policy,
  consumed by both policy/ and cli.ts via core.ts barrel.

core.ts is now ~300 lines (auth + daemon I/O only).
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: restore dev branch to push trigger

CI should run on direct pushes to dev (merge commits, dependency
bumps, etc.), not just on PRs. Flagged by two independent code
review passes on the coverage PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add timeouts to execSync and spawnSync to prevent CI hangs

- doctor command: add timeout:3000 to execSync('which node9') and
  execSync('git --version') — on slow CI machines these can block
  indefinitely and cause the 5000ms vitest test timeout to fire
- runDoctor test helper: add timeout:15000 to spawnSync so the subprocess
  has enough headroom on slow CI without hitting the vitest timeout
- removefrom test loop: increase spawnSync timeout 5000→15000 and add
  result.error assertion for better failure diagnostics on CI
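
The timeout pattern from the first bullet, as a sketch. The `commandSucceeds` helper is hypothetical; `timeout` and `stdio` are standard `execSync` options, and the 3000 ms default mirrors the value above.

```typescript
import { execSync } from "node:child_process";

// Runs a probe command with a hard time limit instead of blocking forever.
function commandSucceeds(command: string, timeoutMs = 3000): boolean {
  try {
    execSync(command, { timeout: timeoutMs, stdio: "ignore" });
    return true;
  } catch {
    // Non-zero exit, missing binary, or the timeout killing the child.
    return false;
  }
}
```

A doctor-style check can then treat a hung `git --version` the same way as a missing binary rather than stalling the whole run.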

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract src/auth/ from core.ts

Split the authorization race engine out of core.ts into 4 focused modules:
- auth/state.ts  — pause, trust sessions, persistent decisions
- auth/daemon.ts — daemon PID check, entry registration, long-polling
- auth/cloud.ts  — SaaS handshake, poller, resolver, local-allow audit
- auth/orchestrator.ts — multi-channel race engine (authorizeHeadless)

core.ts is now a 40-line backwards-compat barrel. 509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — coverage thresholds + undici vuln

- vitest.config.mts: add coverage thresholds at current baseline (68%
  stmts, 58% branches, 66% funcs, 70% lines) so CI blocks regressions.
  Add json-summary reporter for CI integration. Exclude core.ts (barrel,
  no executable code) and ui/native.ts (OS UI, untestable in CI).
- package.json: pin undici to ^7.24.0 via overrides to resolve 6 high
  severity vulnerabilities in dev deps (@semantic-release, @actions).
  Remaining 7 vulns are in npm-bundled packages (not fixable without
  upgrading npm itself) and dev-only tooling (eslint, handlebars).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: enforce coverage thresholds in CI pipeline

Add coverage step to CI workflow that runs vitest --coverage on
ubuntu/Node 22 only (avoids matrix cost duplication). Thresholds
configured in vitest.config.mts will fail the build if coverage drops
below baseline, closing the gap flagged in code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: fix double test run — merge coverage into single test step

Replace the two-step (npm test + npm run test:coverage) pattern with a
single conditional: ubuntu/Node 22 runs test:coverage (enforces
thresholds), all other matrix cells run npm test. No behaviour change,
half the execution time on the primary matrix cell.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract proxy, negotiation, and duration from cli.ts

- src/proxy/index.ts — runProxy() MCP/JSON-RPC stdio interception
- src/policy/negotiation.ts — buildNegotiationMessage() AI block messages
- src/utils/duration.ts — parseDuration() human duration string parser
- cli.ts: 2088 → 1870 lines, now imports from focused modules

509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract shield, check, log commands into focused modules

Moves registerShieldCommand, registerConfigShowCommand, registerCheckCommand,
and registerLogCommand into src/cli/commands/. Extracts autoStartDaemonAndWait
and openBrowserLocal into src/cli/daemon-starter.ts.

cli.ts drops from ~1870 to ~1120 lines. Unused imports removed. Spawn
Windows regression test updated to cover the moved autoStartDaemonAndWait
call site in daemon-starter.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): use string comparison for matrix.node and document coverage intent

GHA expression matrix values are strings; matrix.node == 22 (integer) silently
fails, so coverage never ran on any cell. Fixed to matrix.node == '22'.

Added comments to ci.yml explaining the intentional single-cell threshold
enforcement (branch protection must require the ubuntu/Node 22 job), and
to vitest.config.mts explaining the baseline date and target trajectory.

Also confirmed: npm ls undici shows 7.24.6 everywhere — no conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): replace fragile matrix ternary with dedicated coverage job

Removes the matrix.node == '22' ternary from the test matrix. Coverage now
runs in a standalone 'coverage' job (ubuntu/Node 22 only) that can be
required by name in branch protection — no risk of the job name drifting
or the selector silently failing.

Also adds a comment to tsup.config.ts documenting why devDependency coverage
tooling (@vitest/coverage-v8, @rolldown/*) cannot leak into the production
bundle (tree-shaking — nothing in src/ imports them).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step and NODE9_TESTING=1 to coverage job; bump to v1.2.0

Coverage job was missing npm run build, causing integration tests to fail
with "dist/cli.js not found". Also adds NODE9_TESTING=1 env var to prevent
native popup dialogs and daemon auto-start during coverage runs in CI.

Version bumped to 1.2.0 to reflect the completed modular refactor
(core.ts + cli.ts split into focused single-responsibility modules).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add npm audit --omit=dev --audit-level=high to test job

Audits production deps on every CI run. Scoped to --omit=dev because
known CVEs in flatted (eslint chain) and handlebars (semantic-release chain)
are devDep-only and never ship in the production bundle. Production tree
currently shows 0 vulnerabilities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract doctor, audit, status, daemon, watch, undo commands

Moves 6 remaining large commands into src/cli/commands/:
  doctor.ts    — health check (165 lines, owns pass/fail/warn helpers)
  audit.ts     — audit log viewer with formatRelativeTime
  status.ts    — current mode/policy/pause display
  daemon-cmd.ts — daemon start/stop/openui/background/watch
  watch.ts     — watch mode subprocess runner
  undo.ts      — snapshot diff + revert UI

cli.ts: 1,141 → 582 lines. Unused imports (execSync, spawnSync, undo funcs,
getCredentials, DAEMON_PORT/HOST) removed. spawn-windows regression test
updated to cover the new module locations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(daemon): split 1026-line index.ts into state, server, barrel

daemon/state.ts  (360 lines) — all shared mutable state, types, utility
  functions, SSE broadcast, Flight Recorder Unix socket, and the
  abandonPending / hadBrowserClient / abandonTimer accessors needed to
  avoid direct ES module let-export mutation across file boundaries.

daemon/server.ts (668 lines) — startDaemon() HTTP server and all route
  handlers (/check, /wait, /decision, /events, /settings, /shields, etc.).
  Imports everything it needs from state.ts; no circular dependencies.

daemon/index.ts  (58 lines) — thin barrel: re-exports public API
  (startDaemon, stopDaemon, daemonStatus, DAEMON_PORT, DAEMON_HOST,
  DAEMON_PID_FILE, DECISIONS_FILE, AUDIT_LOG_FILE, hasInteractiveClient).

Also changes two startup console.log calls to console.error (stdout must
stay clean for MCP/JSON-RPC per CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): handle ENOTEMPTY in cleanupDir on Windows CI

Windows creates system junctions (AppData\Local\Microsoft\Windows)
inside any directory set as USERPROFILE, making rmSync fail with
ENOTEMPTY even after recursive deletion. These junctions are harmless
to leak from a temp dir; treat them the same as EBUSY.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add NODE9_TESTING=1 to test job for consistency with coverage

Without it, spawned child processes in the test matrix could trigger
native popups or daemon auto-start. Matches the coverage job which
already set this env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): raise npm audit threshold to moderate

Node9 sits on the critical path of every agent tool call — a
moderate-severity prod vuln (e.g. regex DoS in a request parser)
is still exploitable in this context. 0 vulns at moderate level
confirmed before raising the bar.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add missing coverage for auth/state, timeout racer, and daemon unknown-ID

- auth-state.test.ts (new): 18 tests covering checkPause (all branches
  including expired file auto-delete and indefinite expiry), pauseNode9,
  resumeNode9, getActiveTrustSession (wildcard, prune, malformed JSON),
  writeTrustSession (create, replace, prune expired entries)
- core.test.ts: timeout racer test — approvalTimeoutMs:50 fires before any
  other channel, returns approved:false with blockedBy:'timeout'
- daemon.integration.test.ts: POST /decision with unknown UUID → 404
- vitest.config.mts: raise thresholds to match new baseline
  (statements 68→70, branches 58→60, functions 66→70, lines 70→71)

auth/state.ts coverage: 30% → 96% statements, 28% → 89% branches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin @vitest/coverage-v8 to exact version 4.1.2

RC transitive deps (@rolldown/binding-* at 1.0.0-rc.12) are pulled in
via coverage-v8. Pinning prevents silent drift to a newer RC that could
change instrumentation behaviour or introduce new RC-stage transitive deps.

Also verified: obug@2.1.1 is a legitimate MIT-licensed debug utility
from the @vitest/sxzz ecosystem — not a typosquat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add needs: [test] to coverage job

Prevents coverage from producing a misleading green check when the test
matrix fails. Coverage now only runs after all test jobs pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): raise Vitest timeouts for slow CI tests

- cloud denies: approvalTimeoutMs:3000 means the check process runs ~3s
  before the mock cloud responds; default 5s Vitest limit was too tight.
  Raised to 15s.
- doctor 'All checks passed': spawns a subprocess that runs `ss` for
  port detection — slow on CI runners. Raised to 20s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin vitest to 4.1.2 to match @vitest/coverage-v8

Both packages must stay in sync — a peer version mismatch causes silent
instrumentation failures. Pinning both to the same exact version prevents
drift when ^ would otherwise allow vitest to bump independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MCP gateway — transparent stdio proxy for any MCP server

Adds `node9 mcp-gateway --upstream <cmd>` which wraps any MCP server
as a transparent stdio proxy. Every tools/call is intercepted and run
through the full authorization engine (DLP, smart rules, shields,
human approval) before being forwarded to the upstream server.

Key implementation details:
- Deferred exit: authPending flag prevents process.exit() while auth
  is in flight, so blocked-tool responses are always flushed first
- Deferred stdin end: mirrors the same pattern for child.stdin.end()
  so approved messages are written before stdin is closed
- Approved writes happen inside the try block, before finally runs
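
The deferred-exit idea above could be sketched roughly as follows. This is an illustration only, not the gateway's code: the `DeferredExit` class and its method names are hypothetical, and a counter stands in for the single `authPending` flag so that overlapping checks are also covered.

```typescript
// Hypothetical sketch of a deferred exit. None of these names come from the
// node9 source; a counter stands in for the authPending flag.
class DeferredExit {
  private pending = 0;
  private exitRequested = false;
  private readonly exit: () => void;

  // `exit` is injected so the sketch can run without ending the process.
  constructor(exit: () => void) {
    this.exit = exit;
  }

  // Call when an authorization check starts.
  begin(): void {
    this.pending += 1;
  }

  // Call after the response for that check has been written.
  end(): void {
    this.pending -= 1;
    if (this.pending === 0 && this.exitRequested) this.exit();
  }

  // Call when the upstream closes: exits now, or once pending work drains.
  requestExit(): void {
    if (this.pending > 0) {
      this.exitRequested = true;
      return;
    }
    this.exit();
  }
}
```

In a real gateway the injected callback would presumably wrap `process.exit()`, and the same shape could defer `child.stdin.end()`.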

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address code review feedback

- Explicit ignoredTools in allowed-tool test (no implicit default dep)
- Assert result.status === 0 in all success-case tests (null = timeout)
- Throw result.error in runGateway helper so timeout-killed process fails
- Catch ENOTEMPTY in cleanupDir alongside EBUSY (Windows junctions)
- Document parseCommandString is shell-split only, not shell execution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): id validation, resilience tests, review fixes

- Validate JSON-RPC id is string|number|null; return -32600 for object/array ids
- Add resilience tests: invalid upstream JSON forwarded as-is, upstream crash
- Fix runGateway() to accept optional upstreamScript param
- Add status assertions to all blocked-tool tests
- Document parseCommandString safety in mcp-gateway source
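
A minimal sketch of the id check, assuming a helper shape that is not taken from the source (`isValidJsonRpcId` and `checkRequestId` are hypothetical names). The only facts relied on are from JSON-RPC 2.0: ids are string, number, or null, and -32600 means "Invalid Request".

```typescript
type JsonRpcId = string | number | null;

interface JsonRpcErrorResponse {
  jsonrpc: "2.0";
  id: JsonRpcId;
  error: { code: number; message: string };
}

// Type guard: JSON-RPC 2.0 allows only string, number, or null ids.
function isValidJsonRpcId(id: unknown): id is JsonRpcId {
  return id === null || typeof id === "string" || typeof id === "number";
}

// Returns an error response for a bad id, or null when the id is acceptable.
// A missing id is a notification and is left alone here.
function checkRequestId(message: { id?: unknown }): JsonRpcErrorResponse | null {
  if (message.id === undefined || isValidJsonRpcId(message.id)) return null;
  return {
    jsonrpc: "2.0",
    id: null, // the offending id cannot be echoed back safely
    error: { code: -32600, message: "Invalid Request" },
  };
}
```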

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): fix shell tokenizer to handle quoted paths with spaces

Replace execa's parseCommandString (which did not handle shell quoting)
with a custom tokenizer that strips double-quotes and respects backslash
escapes. Adds review-driven test improvements: the mock upstream silently
drops notifications, runGateway guards killed-by-signal status, a shadowed
variable is renamed, the DLP test builds its credential at runtime, and a new
test covers an upstream path with spaces.
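
The commit's tokenizer is not shown here, so the following is only a sketch of the behaviour described (double quotes group a token and are stripped, a backslash escapes the next character). The function name `tokenize` and every detail beyond those two rules are assumptions.

```typescript
// Hypothetical tokenizer: splits on whitespace, treats a double-quoted span as
// part of one token (quotes removed), and lets a backslash escape the next
// character. It only splits a string; it never runs a shell.
function tokenize(input: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inQuotes = false;
  let hasToken = false; // lets "" produce an empty token instead of nothing

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === "\\" && i + 1 < input.length) {
      i += 1;
      current += input.charAt(i); // take the escaped character literally
      hasToken = true;
    } else if (ch === '"') {
      inQuotes = !inQuotes;
      hasToken = true;
    } else if (!inQuotes && /\s/.test(ch)) {
      if (hasToken) tokens.push(current);
      current = "";
      hasToken = false;
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (hasToken) tokens.push(current);
  return tokens;
}
```

One consequence of this simple rule set is that unquoted Windows-style backslash paths would lose their backslashes, so a real implementation may need extra care there.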

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #4 — hermetic env, null-status guard, new tests

- Filter all NODE9_* env vars in runGateway so local developer config
  (NODE9_MODE, NODE9_API_KEY, NODE9_PAUSED) cannot pollute test isolation
- Fix status===null guard: no longer requires result.signal to also be set;
  a killed/timed-out process always throws, preventing silent false passes
- Extract parseResponses() helper to eliminate repeated JSON.parse cast pattern
- Reduce default runGateway timeout from 8s → 5s for fast-path tests
- Add test: malformed (non-JSON) input is forwarded to upstream unchanged
- Add test: tools/call notification (no id) is forwarded without gateway response
- Add comments: mock upstream uses unbuffered stdout.write; /tmp/test.txt path
  is safe because mock never reads disk; NODE9_TESTING only disables UI approvers
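
The env filtering in the first bullet might look something like this sketch. `hermeticEnv` is a hypothetical helper name; the real runGateway helper may be structured differently.

```typescript
type Env = Record<string, string | undefined>;

// Drops every NODE9_* variable from the inherited environment, then applies
// the explicit overrides the test actually wants (hypothetical helper).
function hermeticEnv(source: Env, overrides: Env = {}): Env {
  const clean: Env = {};
  for (const [key, value] of Object.entries(source)) {
    if (!key.startsWith("NODE9_")) clean[key] = value;
  }
  return { ...clean, ...overrides };
}
```

A test harness would presumably call it as `hermeticEnv(process.env, { NODE9_TESTING: "1" })` and pass the result as the spawn `env`.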

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #5 — isolation, typing, timeouts, README note

- afterAll: log cleanup failures to stderr instead of silently swallowing them
- runGateway: document PATH is safe (all spawns use absolute paths); expand
  NODE9_TESTING comment to reference exact source location of what it suppresses
- Replace /tmp/test.txt with /nonexistent/node9-test-only so intent is unambiguous
- Tighten blocked-tool test timeout: 5000ms → 2000ms (approvalTimeoutMs=100ms,
  so a hung auth engine now surfaces as a clear failure rather than a late pass)
- GatewayResponse.result: add explicit tools/ok fields so Array.isArray assertion
  has accurate static type information
- README: add note clarifying --upstream takes a single command string (tokenizer
  splits it); explain double-quoted paths for paths with spaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #6 — diagnostics, type safety, error handling

- Timeout error now includes partial stdout/stderr so hung gateway failures
  are diagnosable instead of silently discarding the output buffer
- Mock upstream catch block writes to stderr instead of empty catch {} so
  JSON-RPC parse errors surface in test output rather than causing a hang
- parseResponses wraps JSON.parse in try/catch and rethrows with the
  offending line, replacing cryptic map-thrown errors with useful context
- GatewayResponse.result: replace redundant Record<string,unknown> & {..…
node9ai added a commit that referenced this pull request Mar 28, 2026
* fix: address code review — Slack regex bound, remove redundant parser, notMatchesGlob consistency, applyUndo empty-set guard

- dlp: cap Slack token regex at {1,100} to prevent unbounded scan on crafted input
- core: remove 40-line manual paren/bracket parser from validateRegex; it was redundant
  with the final new RegExp() compile check, which catches the same errors more cleanly
- core: fix notMatchesGlob — absent field returns true (vacuously not matching),
  consistent with notContains; missing cond.value still fails closed
- undo: guard applyUndo against ls-tree failure returning empty set, which would
  cause every file in the working tree to be deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — compile-before-saferegex, Slack lower bound, ls-tree guard logging, missing tests

- core: move new RegExp() compile check BEFORE safe-regex2 so structurally invalid
  patterns (unbalanced parens/brackets) are rejected before reaching NFA analysis
- dlp: tighten Slack token lower bound from {1,100} to {20,100} to reduce false
  positives on short or truncated strings
- undo: add NODE9_DEBUG log before early return in applyUndo ls-tree guard for
  observability into silent failures
- test(core): add 'structurally malformed patterns still rejected' regression test
  confirming compile-check order after manual parser removal
- test(core): add notMatchesGlob absent-field test with security comment documenting
  the vacuous-true behaviour and how to guard against it
- test(undo): add applyUndo ls-tree non-zero exit test confirming no files deleted
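
The compile-first ordering could be sketched as below. The signature is an assumption, and `isSafe` is an injected stand-in for a ReDoS analyser such as safe-regex2 so that the example has no dependencies; `naiveIsSafe` is a toy heuristic for the demo and is not how safe-regex2 works.

```typescript
// Returns an error string, or null when the pattern is accepted.
// Step 1 compiles the pattern; step 2 runs the (injected) safety analysis.
function validateRegex(pattern: string, isSafe: (p: string) => boolean): string | null {
  try {
    new RegExp(pattern); // rejects unbalanced parens/brackets and other syntax errors
  } catch (err) {
    return `invalid regex: ${err instanceof Error ? err.message : String(err)}`;
  }
  if (!isSafe(pattern)) return "unsafe regex: possible catastrophic backtracking";
  return null;
}

// Toy stand-in: flags a group containing + or * that is itself followed by + or *.
const naiveIsSafe = (p: string): boolean => !/\([^)]*[+*][^)]*\)[+*]/.test(p);
```

With this order, a pattern like `(a+)+` passes the compile step and is then rejected by the safety step, which is the case the later test rename in this PR calls out.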

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): swap spawnResult args order — stdout first, status second

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in undo.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: revert notMatchesGlob to fail-closed, warn on ls-tree failure, document empty-stdout gap

- core: revert notMatchesGlob absent-field to fail-closed (false) — an attacker
  omitting a field must not satisfy a notMatchesGlob allow rule; rule authors
  needing pass-when-absent must pair with an explicit 'notExists' condition
- undo: log ls-tree failure unconditionally to stderr (not just NODE9_DEBUG) since
  this is an unexpected git error, not normal flow — silent false is undebuggable
- dlp: add comment on Slack token bound rationale (real tokens ~50–80 chars)
- test(core): fix notMatchesGlob fail-closed test — use delete_file (dangerous word)
  so the allow rule actually matters; write was allowed by default regardless
- test(undo): add test documenting the known gap where ls-tree exits 0 with empty
  stdout still produces an empty snapshotFiles set
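
A rough sketch of the fail-closed behaviour from the first bullet, under stated assumptions: the function signature is hypothetical, and the glob support is reduced to `*` only, which is much simpler than whatever matcher the project uses.

```typescript
// Minimal glob support for the sketch: only `*` is a wildcard.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Fail closed: an absent (or non-string) field, or a missing pattern, never
// satisfies the condition, so omitting a field cannot unlock an allow rule.
function notMatchesGlob(fieldValue: unknown, pattern: string | undefined): boolean {
  if (typeof fieldValue !== "string" || pattern === undefined) return false;
  return !globToRegExp(pattern).test(fieldValue);
}
```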

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(undo): guard against ls-tree status-0 empty-stdout mass-delete footgun

Add snapshotFiles.size === 0 check after the non-zero exit guard. When ls-tree
exits 0 but produces no output, snapshotFiles would be empty and every tracked
file in the working tree would be deleted. Abort and warn unconditionally instead.

Also convert the 'known gap' documentation test into a real regression test that
asserts false return and no unlinkSync calls.
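
The two guards can be illustrated with a pure function. This is a sketch with hypothetical names and inputs (`filesToDelete`, a status code plus raw stdout), not the actual applyUndo implementation, which shells out to git.

```typescript
// Decides which working-tree files were created after the snapshot.
// Returns null to signal "abort, delete nothing".
function filesToDelete(
  lsTreeStatus: number | null,
  lsTreeStdout: string,
  workingTree: string[]
): string[] | null {
  // Guard 1: ls-tree failed (non-zero exit, or null when killed).
  if (lsTreeStatus !== 0) return null;

  const snapshotFiles = new Set(
    lsTreeStdout
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
  );

  // Guard 2: exit 0 but no output. An empty set would mark every file as "new".
  if (snapshotFiles.size === 0) return null;

  return workingTree.filter((file) => !snapshotFiles.has(file));
}
```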

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(undo): assert stderr warning in ls-tree failure tests

Add vi.spyOn(process.stderr, 'write') assertions to both new applyUndo tests
to verify the observability messages are actually emitted on failure paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: banner to stderr for MCP stdio compat; log command cwd handling and error visibility

Two bugs from issue #33:

1. runProxy banner went to stdout via console.log, corrupting the JSON-RPC stream
   for stdio-based MCP servers. Fixed: console.error so stdout stays clean.

2. 'node9 log' PostToolUse hook was silently swallowing all errors (catch {})
   and not changing to payload.cwd before getConfig() — unlike the 'check'
   command which does both. If getConfig() loaded the wrong project config,
   shouldSnapshot() could throw on a missing snapshot policy key, silently
   killing the audit.log write with no diagnostic output.
   Fixed: add cwd + _resetConfigCache() mirroring 'check'; surface errors to
   hook-debug.log when enableHookLogDebug is set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate process.chdir race condition in hook commands

Pass payload.cwd directly to getConfig(cwd?) instead of calling
process.chdir() which mutates process-global state and would race
with concurrent hook invocations.

- getConfig() gains optional cwd param: bypasses cache read/write
  when an explicit project dir is provided, so per-project config
  lookups don't pollute the ambient interactive-CLI cache
- check and log commands: remove process.chdir + _resetConfigCache
  blocks; pass payload.cwd directly to getConfig()
- log command catch block: remove getConfig() re-call (could re-throw
  if getConfig() was the original error source); use NODE9_DEBUG only
- Remove now-unused _resetConfigCache import from cli.ts
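
The cache-bypass behaviour described for getConfig(cwd?) might be sketched like this. `makeGetConfig`, the `Config` shape, and the injected `load` function are all hypothetical; the point is only that an explicit directory neither reads nor writes the ambient cache.

```typescript
interface Config {
  source: string;
}

// Builds a getConfig with an ambient cache. `load` stands in for whatever
// reads and merges config files for a directory (hypothetical).
function makeGetConfig(load: (dir: string) => Config, ambientDir: string) {
  let cached: Config | null = null;

  return function getConfig(cwd?: string): Config {
    // Explicit project dir: load fresh, leave the ambient cache untouched.
    if (cwd) return load(cwd);
    if (cached === null) cached = load(ambientDir);
    return cached;
  };
}
```

Because nothing global is mutated, two hook invocations with different `payload.cwd` values cannot interfere with each other the way a shared `process.chdir()` could.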

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always write LOG_ERROR to hook-debug.log; clarify ReDoS test intent

- log catch block: remove NODE9_DEBUG guard — this catch guards the
  audit trail so errors must always be written to hook-debug.log,
  not only when NODE9_DEBUG=1
- validateRegex test: rename and expand the safe-regex2 NFA test to
  explicitly assert that (a+)+ compiles successfully (passes the
  compile-first step) yet is still rejected by safe-regex2, confirming
  the reorder did not break ReDoS protection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(mcp): integration tests for #33 regression coverage

Add mcp.integration.test.ts with 4 tests covering both bugs from #33:

1. Proxy stdout cleanliness (2 tests):
   - banner goes to stderr; stdout contains only child process output
   - stdout stays valid JSON when child writes JSON-RPC — banner does not corrupt stream

2. Log command cross-cwd audit write (2 tests):
   - writes to audit.log when payload.cwd differs from process.cwd() (the actual #33 bug)
   - writes to audit.log when no cwd in payload (backward compat)

These tests would have caught both regressions at PR time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — cwd guard, test assertions, exit-0 comment

- getConfig(payload.cwd || undefined): use || instead of ?? to also
  guard against empty string "" which path.join would silently treat
  as relative-to-cwd (same behaviour as the fallback, but explicit)
- log catch block: add comment documenting the intentional exit(0)-on-
  audit-failure tradeoff — non-zero would incorrectly signal tool failure
  to Claude/Gemini since the tool already executed
- mcp.integration.test.ts: assert result.error and result.status on
  every spawnSync call so spawn failures surface loudly instead of
  silently matching stdout === '' checks
- mcp.integration.test.ts: add expect(result.stdout.trim()).toBeTruthy()
  before JSON.parse for clearer diagnostic on stdout-empty failures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add CLAUDE.md rules and pre-commit enforcement hook

CLAUDE.md: documents PR checklist, test rules, and code rules that
Claude Code reads automatically at the start of every session:
- PR checklist (tests, typecheck, format, no console.log in hooks)
- Integration test requirements for subprocess/stdio/filesystem code
- Architecture notes (getConfig(cwd?), audit trail, DLP, fail-closed)

.git/hooks/pre-commit: enforces the checklist on every commit:
- Blocks console.log in src/cli, src/core, src/daemon
- Runs npm run typecheck
- Runs npm run format:check
- Runs npm test when src/ implementation files are changed
- Emergency bypass: git commit --no-verify

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — test isolation, stderr on audit gap, nonexistent cwd

- mcp.integration.test.ts: replace module-scoped tempDirs with per-describe
  beforeEach/afterEach and try/finally — eliminates shared-array interleave
  risk if tests ever run with parallelism
- mcp.integration.test.ts: add test for nonexistent payload.cwd — verifies
  getConfig falls back to global config gracefully instead of throwing
- cli.ts log catch: emit [Node9] audit log error to stderr so audit gaps
  surface in the tool output stream without requiring hook-debug.log checks
- core.ts getConfig: add comment documenting intentional nonexistent-cwd
  fallback behavior (tryLoadConfig returns null → global config used)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: audit write before config load; validate cwd; test corrupt-config gap

Two blocking issues from review:

1. getConfig() was called BEFORE appendFileSync — a config load failure
   (corrupt JSON, permissions error) would throw and skip the audit write,
   reintroducing the original silent audit gap. Fixed by moving the audit
   write unconditionally before the config load.

2. payload.cwd was passed to getConfig() unsanitized — a crafted hook
   payload with a relative or traversal path could influence which
   node9.config.json gets loaded. Fixed with path.isAbsolute() guard;
   non-absolute cwd falls back to ambient process.cwd().
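
Both fixes can be seen in a small sketch with injected dependencies. Everything named here (`handleLog`, `LogDeps`, the audit line format) is an assumption made for illustration; in Node the `isAbsolute` dependency would be `path.isAbsolute`.

```typescript
interface LogDeps {
  appendAudit: (line: string) => void;
  loadConfig: (cwd?: string) => unknown;
  isAbsolute: (p: string) => boolean; // path.isAbsolute in a Node program
}

function handleLog(payload: { tool: string; cwd?: unknown }, deps: LogDeps): void {
  // 1. Audit write first, so a config failure cannot skip it.
  deps.appendAudit(JSON.stringify({ tool: payload.tool }));

  // 2. Only an absolute string cwd is trusted; anything else falls back to
  //    the ambient directory by passing undefined.
  const cwd =
    typeof payload.cwd === "string" && deps.isAbsolute(payload.cwd) ? payload.cwd : undefined;

  // 3. Config load may throw; the audit entry is already on disk by now.
  deps.loadConfig(cwd);
}
```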

Also:
- Add integration test proving audit.log is written even when global
  config.json is corrupt JSON (regression test for the ordering fix)
- Add comment on echo tests noting Linux/macOS assumption

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): add audit-write ordering and path validation rules

* test: skip echo proxy tests on Windows; clarify exit-0 contract

- itUnix = it.skipIf(process.platform === 'win32') applied to both proxy
  echo tests — Windows echo is a shell builtin and cannot be spawned
  directly, so these tests would fail with a spawn error instead of
  skipping cleanly
- corrupt-config test: add comment documenting that exit(0) is the
  correct exit code even on config error — the log command always exits 0
  so Claude/Gemini do not treat an already-completed tool call as failed;
  the audit write precedes getConfig() so it succeeds regardless

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: CODEOWNERS for CLAUDE.md; parseAuditLog helper; getConfig unit tests

- .github/CODEOWNERS: require @node9-ai/maintainers review on CLAUDE.md
  and security-critical source files — prevents untrusted PRs from
  silently weakening AI instruction rules or security invariants
- mcp.integration.test.ts: replace inline JSON.parse().map() with
  parseAuditLog() helper that throws a descriptive error when a log line
  is not valid JSON (e.g. a debug line or partial write), instead of an
  opaque SyntaxError with no context
- mcp.integration.test.ts: itUnix declaration moved after imports for
  correct ordering
- core.test.ts: add getConfig unit tests verifying that a nonexistent
  explicit cwd does not throw (tryLoadConfig fallback), and that
  getConfig(cwd) does not pollute the ambient no-arg cache

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): add npm run lint to PR checklist and pre-commit hook

Adds ESLint step to CLAUDE.md checklist and .git/hooks/pre-commit so
require()-style imports and other lint errors are caught before push.
Also fixes the require('path')/require('os') inline calls in core.test.ts
that triggered @typescript-eslint/no-require-imports in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: emit shields-status on SSE connect — dashboard no longer stuck on Loading

The shields-status event was only broadcast on toggle (POST /shields/toggle).
A freshly connected dashboard never received the current shields state and
displayed "Loading…" indefinitely.

Fix: send shields-status in the GET /events initial payload alongside init
and decisions, using the same payload shape as the toggle handler.
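
As a sketch of what the initial /events payload could look like on the wire: the `event:`/`data:` framing is standard Server-Sent Events, while the function names and the field names inside each payload (`requests`, `shields`, `name`, `active`) are assumptions for illustration.

```typescript
interface ShieldState {
  name: string;
  active: boolean;
}

// Formats one SSE frame. JSON.stringify output has no raw newlines, so a
// single data: line is enough.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Everything a freshly connected dashboard needs, sent in one burst.
function initialSsePayload(state: {
  pending: unknown[];
  decisions: unknown;
  shields: ShieldState[];
}): string {
  return (
    formatSseEvent("init", { requests: state.pending }) +
    formatSseEvent("decisions", state.decisions) +
    formatSseEvent("shields-status", { shields: state.shields })
  );
}
```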

Regression test: daemon.integration.test.ts starts a real daemon with an
isolated HOME, connects to /events, and asserts shields-status is present
with the correct active state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — shared SSE snapshot, ctx.skip() for visible skips

- Capture SSE stream once in beforeAll and share across all three tests
  instead of opening 3 separate 1.5s connections (~4.5s → ~1.5s wall time)
- Replace early return with ctx.skip() so port-conflict skips are visible
  in the Vitest report rather than silently passing
- Add comment explaining why it.skipIf cannot be used here (condition
  depends on async beforeAll, evaluated after test collection)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — bump SSE timeout, guard payload undefined, structural shield check

- Bump readSseStream timeout 1500ms → 3000ms for slow CI headroom
- Assert payload defined before accessing .shields — gives a clear failure
  message if shields-status is absent rather than a TypeError on .shields
- Replace hardcoded postgres check with structural loop over all shields
  so the test survives adding or renaming shields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: log last waitForDaemon error to stderr for CI diagnostics

Silent catch{} meant a crashed daemon (e.g. EACCES on port) produced only
"did not start within 6s" with no hint of the root cause. Now the last
caught error is written to stderr so CI logs show the actual failure reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass flags through to wrapped command — prevent Commander from consuming -y, --config etc.

Commander parsed flags like -y and --config as node9 options and errored
with "unknown option" before the proxy action handler ran. This broke all
MCP server configurations that pass flags to the wrapped binary (npx -y,
binaries with --nexus-url, etc.).

Fix: before program.parse(), detect proxy mode (first arg is not a known
node9 subcommand and doesn't start with '-') and inject '--' into process.argv.
This causes Commander to stop option-parsing and pass everything — including
flags — through to the variadic [command...] action handler intact.
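
The argv rewrite can be sketched as a pure function, which keeps it independent of Commander. `injectProxySeparator` is a hypothetical name, and the set of subcommands is passed in rather than read from a real program object.

```typescript
// argv follows the Node convention: [runtime, script, ...userArgs].
// Returns argv unchanged unless the first user arg looks like a wrapped command.
function injectProxySeparator(argv: string[], knownSubcommands: ReadonlySet<string>): string[] {
  const first = argv[2];
  if (first === undefined) return argv; // no user args at all
  if (first.startsWith("-")) return argv; // an option, or an explicit "--"
  if (knownSubcommands.has(first)) return argv; // a real node9 subcommand
  return [...argv.slice(0, 2), "--", ...argv.slice(2)];
}
```

For example, `["node", "node9", "npx", "-y", "server"]` would become `["node", "node9", "--", "npx", "-y", "server"]`, so the option parser stops before it ever sees `-y`.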

The user-visible '--' workaround still works and is now redundant but harmless.

Regression tests: two new itUnix cases in mcp.integration.test.ts verify
that -n is not consumed as a node9 flag, and that --version reaches the
wrapped command rather than printing node9's own version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: derive proxy subcommand set from program.commands; harden test assertions

- Replace hand-maintained KNOWN_SUBCOMMANDS allowlist with a set derived
  from program.commands.map(c => c.name()) — stays in sync automatically
  when new subcommands are added, eliminating the latent sync bug
- Remove fragile echo stdout assertion in flag pass-through test — echo -n
  and echo --version behaviour varies across platforms (GNU vs macOS);
  the regression being tested is node9's parser, not echo's output
- Add try/finally in daemon.integration.test.ts beforeAll so tmpHome is
  always cleaned up even if daemon startup throws

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard against double '--' injection; strengthen --version test assertion

- Skip '--' injection if process.argv[2] is already '--' to avoid
  producing ['--', '--', ...] when user explicitly passes the separator
- Add toBeTruthy() assertion on stdout in --version test so the check
  fails if echo exits non-zero with empty output rather than silently passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — alias gap comment, res error-after-destroy guard, echo comment

- cli.ts: document alias gap (no aliases currently, but note how to extend)
- daemon.integration.test.ts: settled flag prevents res 'error' firing reject
  after Promise already resolved via req.destroy() timeout path
- mcp.integration.test.ts: fix comment — /bin/echo handles --version, not GNU echo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent daemon crash on unhandled rejection — node9 tail disconnect with 2 agents

Two concurrent Claude instances fire overlapping hook calls. Any unhandled
rejection in the async request handler crashes the daemon (Node 15+ default),
which closes all SSE connections and exits node9 tail with "Daemon disconnected".

- Add process.on('unhandledRejection') so a single bad request never kills the daemon
- Wrap GET /settings and GET /slack-status getGlobalSettings() calls in try/catch
  (were the only routes missing error guards in the async handler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — return in catch blocks, log errors, guard unhandledRejection registration

- GET /settings and /slack-status catch blocks now return after writeHead(500)
  to prevent fall-through to subsequent route handlers (write-after-end risk)
- Log the actual error to stderr in both catch blocks — silent swallow is
  dangerous in a security daemon
- Guard unhandledRejection registration with listenerCount === 0 to prevent
  double-registration if startDaemon() is called more than once (tests/restarts)
- Move handler registration before server.listen() for clearer startup ordering

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): require manual diff review before every commit

Automated checks (lint, typecheck, tests) don't catch logical correctness
issues like missing return after res.end(), silent catch blocks, or
double event-listener registration. Explicitly require git diff review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — 500 responses, module-level rejection flag, override cli.ts exit handler

- Split res.writeHead(500) and res.end() into separate calls (the chained form was non-idiomatic)
- Add Content-Type: application/json and JSON body to 500 responses
- Replace listenerCount guard with module-level boolean flag (race-safe)
- Call process.removeAllListeners('unhandledRejection') before registering
  daemon handler — cli.ts registers a handler that calls process.exit(1),
  which was the actual crash source; this overrides it for the daemon process
- Document that critical approval path (POST /check) has its own try/catch
  and is not relying on this safety net
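
The module-level flag from the third bullet might look like the sketch below. The `RejectionSource` interface is a stand-in so the example runs without touching the real process object (in the daemon, `process` would play that role), and all names are hypothetical.

```typescript
interface RejectionSource {
  on(event: "unhandledRejection", listener: (reason: unknown) => void): void;
}

// Module-level flag: set synchronously, so a second startDaemon() call in the
// same module instance cannot register the handler twice.
let rejectionHandlerRegistered = false;

function registerKeepAliveHandler(source: RejectionSource, log: (msg: string) => void): void {
  if (rejectionHandlerRegistered) return;
  rejectionHandlerRegistered = true;

  source.on("unhandledRejection", (reason) => {
    // Log and keep running: one bad request should not take the daemon down.
    const detail = reason instanceof Error ? reason.stack ?? reason.message : String(reason);
    log(`[daemon] unhandled rejection: ${detail}`);
  });
}
```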

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove removeAllListeners — use isDaemon guard in cli.ts handler instead

removeAllListeners('unhandledRejection') was a blunt instrument that could
strip handlers registered by third-party deps. The correct fix:
- cli.ts handler now returns early (no-op) when process.argv[2] === 'daemon',
  leaving the rejection to the daemon's own keep-alive handler
- daemon/index.ts no longer needs removeAllListeners
- daemon handler now logs stack trace so systematic failures are visible

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify unhandledRejection listener interaction — both handlers fire independently

The previous comment implied listener-chain semantics (one handler deferring
to the next). Node.js fires all registered listeners independently. The
isDaemon no-op return in cli.ts is what prevents process.exit(1), not any
chain mechanism. Clarify this so future maintainers don't break it by
restructuring the handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate unhandledRejection ordering dependency — skip cli.ts handler for daemon mode

Instead of relying on listener registration order (fragile), skip registering
the cli.ts exit-on-rejection handler entirely when process.argv[2] === 'daemon'.
The daemon's own keep-alive handler in startDaemon() is then the only handler
in the process — no ordering dependency, no removeAllListeners needed.

Also update stale comment in daemon/index.ts that still described the old
"we must replace the cli.ts handler" approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: address review comments — argv load-time note, hung-connection limit, stack trace caveat

- cli.ts: note that process.argv[2] check fires at module load time intentionally
- daemon/index.ts: document hung-connection limitation of last-resort rejection handler
- daemon/index.ts: note stack trace may include user input fragments (acceptable
  for localhost-only stderr logging)
- daemon/index.ts: clarify jest.resetModules() behavior with the module-level flag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Safe by Default — advisory SQL rules block destructive ops without config

Adds review-drop-table-sql, review-truncate-sql, and review-drop-column-sql
to ADVISORY_SMART_RULES so DROP TABLE, TRUNCATE TABLE, and DROP COLUMN in
the `sql` field are gated by human approval out-of-the-box, with no shield
or config required. The postgres shield correctly upgrades these from review
→ block since shield rules are inserted before advisory rules in getConfig().

Includes 7 new tests: 4 verifying advisory review fires with no config, 3
verifying the postgres shield overrides to block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: shield set/unset — per-rule verdict overrides + config show

- `node9 shield set <shield> <rule> <verdict>` — override any shield rule's
  verdict without touching config.json. Stored in shields.json under an
  `overrides` key, applied at runtime in getConfig(). Accepts full rule
  name, short name, or operation name (e.g. "drop-table" resolves to
  "shield:postgres:block-drop-table").

- `node9 shield unset <shield> <rule>` — remove an override, restoring
  the shield default.

- `node9 shield status` — now shows each rule's verdict individually,
  with override annotations ("← overridden (was: block)").

- `node9 config show` — new command: full effective runtime config
  including active shields with per-rule verdicts, built-in rules,
  advisory rules, and dangerous words.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — allow verdict guard, null assertion, test reliability

- shield set allow now requires --force to prevent silent rule silencing;
  exits 1 with a clear warning and the exact re-run command otherwise
- Remove getShield(name)! non-null assertion in error branch
- Fix mockReturnValue → mockReturnValueOnce to prevent test state leak
- Add missing tests: shield set allow guard (integration), unset no-op,
  mixed-case SQL matching (DROP table, drop TABLE, TRUNCATE table)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — shield override security hardening

- Add isShieldVerdict() type guard; replace manual triple-comparison in
  CLI set command and remove unsafe `verdict as ShieldVerdict` cast
- Add validateOverrides() to sanitize shields.json on read — tampered
  disk content with non-ShieldVerdict values is silently dropped before
  reaching the policy engine
- Fix clearShieldOverride() to be a true no-op (skip disk write) when
  the rule has no existing override
- Add comment to resolveShieldRule() documenting first-match behavior
  for operation-suffix lookup to warn against future naming conflicts
- Tests: fix no-op assertion (assert not written), add isShieldVerdict
  suite, add schema validation tests for tampered overrides, add
  authorizeHeadless test for shield-overridden allow verdict
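
A type guard plus a read-time sanitizer of this kind might look like the sketch below. The verdict set (`allow`, `review`, `block`) is inferred from the commit text, and the `Overrides` shape (shield → rule → verdict) is an assumption about how shields.json stores overrides, not the project's confirmed schema.

```typescript
// Assumed verdict set, inferred from the "triple-comparison" mentioned above.
const SHIELD_VERDICTS = ["allow", "review", "block"] as const;
type ShieldVerdict = (typeof SHIELD_VERDICTS)[number];

function isShieldVerdict(value: unknown): value is ShieldVerdict {
  return typeof value === "string" && (SHIELD_VERDICTS as readonly string[]).includes(value);
}

// Assumed on-disk shape: { [shield]: { [rule]: verdict } }.
type Overrides = Record<string, Record<string, ShieldVerdict>>;

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Keep only well-formed shield -> rule -> verdict entries; drop everything else.
function validateOverrides(raw: unknown): Overrides {
  // Null-prototype objects so keys such as "__proto__" are stored as plain data.
  const clean: Overrides = Object.create(null);
  if (!isPlainObject(raw)) return clean;
  for (const [shield, rules] of Object.entries(raw)) {
    if (!isPlainObject(rules)) continue;
    for (const [rule, verdict] of Object.entries(rules)) {
      if (!isShieldVerdict(verdict)) continue; // tampered or unknown value
      if (!(shield in clean)) clean[shield] = Object.create(null);
      clean[shield][rule] = verdict;
    }
  }
  return clean;
}
```

Because the guard narrows `unknown` to `ShieldVerdict`, no `as ShieldVerdict` cast is needed at the call site.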

Note: issue #5 (shield status stdout vs stderr) cannot be fixed here —
the pre-commit hook enforces no new console.log in cli.ts to keep stdout
clean for the JSON-RPC/MCP hook code paths in the same file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — audit trail, tamper warning, trust boundary

- Export appendConfigAudit() from core.ts; call it from CLI when an allow
  override is written with --force so silenced rules appear in audit.log
- validateOverrides() now emits a stderr warning (with shield/rule detail)
  when an invalid verdict is dropped, making tampering visible to the user
- Add JSDoc to writeShieldOverride() documenting the trust boundary: it is
  a raw storage primitive with no allow guard; callers outside the CLI must
  validate rule names via resolveShieldRule() first; daemon does not expose
  this endpoint
- Tests: add stderr-warning test for tampered verdicts; add cache-
  invalidation test verifying _resetConfigCache() causes allow overrides
  to be re-read from disk (mock) on the next evaluatePolicy() call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: close remaining review gaps — first-match, allow-no-guard, TOCTOU

- Issue 5: add test proving resolveShieldRule first-match-wins behavior
  when two rules share an operation suffix; uses a temporary SHIELDS
  mutation (restored in finally) to simulate the ambiguous catalog case
- Issue 6: add explicit test documenting that writeShieldOverride accepts
  allow verdict without any guard — storage primitive contract, CLI is
  the gatekeeper
- Issue 8: add TOCTOU characterization test showing that concurrent
  writeShieldOverride calls with a stale read lose the first write; makes
  the known file-lock limitation explicit and regression-testable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: spawn daemon via process.execPath to fix ENOENT on Windows (#41)

spawn('node9', ...) fails on Windows because npm installs a .cmd shim,
not a bare executable. Node.js child_process.spawn without { shell: true }
cannot resolve .cmd/.ps1 wrappers.

Replace all three bare spawn('node9', ['daemon'], ...) call sites in
cli.ts with spawn(process.execPath, [process.argv[1], 'daemon'], ...),
consistent with the pattern already used in src/tui/tail.ts:
  - autoStartDaemonAndWait()
  - daemon --openui handler
  - daemon --background handler
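
The replacement pattern can be sketched as a small helper. The `daemonSpawnArgs` name and the spawn options in the comment are assumptions for illustration; only the `spawn(process.execPath, [process.argv[1], 'daemon'], ...)` shape comes from the commit.

```typescript
// Build the command and arguments for re-invoking the current CLI script
// through the Node binary, so no .cmd shim has to be resolved on Windows.
function daemonSpawnArgs(execPath: string, scriptPath: string): [string, string[]] {
  return [execPath, [scriptPath, "daemon"]];
}

// Typical call site (sketch; the options shown are assumptions):
//   import { spawn } from "node:child_process";
//   const [cmd, args] = daemonSpawnArgs(process.execPath, process.argv[1]);
//   spawn(cmd, args, { detached: true, stdio: "ignore" }).unref();
```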

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(ci): regression guard + Windows CI for spawn fix (#41)

- Add spawn-windows.test.ts: two static source-guard tests that read
  cli.ts and assert (a) no bare spawn('node9'...) pattern exists and
  (b) exactly 3 spawn(process.execPath, ...) daemon call sites exist.
  Prevents the ENOENT regression from silently reappearing.

- Add .github/workflows/ci.yml: runs typecheck, lint, and npm test on
  both ubuntu-latest and windows-latest on every push/PR to main and dev.
  The Windows runner will catch any spawn('node9'...) regression
  immediately since it would throw ENOENT in integration tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step before tests — integration tests require dist/cli.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): remove NODE_ENV=test prefix from npm scripts — Windows compat

'NODE_ENV=test cmd' syntax is Unix-only and fails on Windows with
'not recognized as an internal or external command'.

Vitest sets NODE_ENV=test automatically when the variable is not already
set (it also sets process.env.VITEST), making the prefix redundant. Remove
it from the test, test:watch, and test:ui scripts so they work on all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): use cross-platform path assertions in undo.test.ts

Replace hardcoded Unix path separators with path.join() and regex
/[/\\]\.git[/\\]/ so assertions pass on Windows CI.
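
As a quick illustration of why the regex works on both platforms, the sketch below wraps it in a hypothetical `containsGitDir` helper (the helper name is not from the source):

```typescript
// Matches a ".git" path segment with either "/" or "\" as the separator.
const GIT_DIR_SEGMENT = /[/\\]\.git[/\\]/;

function containsGitDir(p: string): boolean {
  return GIT_DIR_SEGMENT.test(p);
}
```

The trailing separator in the pattern keeps look-alike segments such as `.github` from matching.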

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): cross-platform path and HOME fixes for Windows CI

setup.test.ts: replace hardcoded /mock/home/... constants with
path.join(os.homedir(), ...) so path comparisons match on Windows.
doctor.test.ts: set USERPROFILE=homeDir alongside HOME so
os.homedir() resolves the isolated test directory on Windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): Windows HOME/USERPROFILE and EBUSY fixes

mcp.integration.test.ts: add makeEnv() helper that sets both HOME
and USERPROFILE so spawned node9 processes resolve os.homedir() to
the isolated test directory on Windows. Add EBUSY guard in cleanupDir
for Windows temp file locking after spawnSync.
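
A helper along these lines might look like the following sketch. The signature is an assumption: it takes the base environment explicitly so the example stays self-contained, whereas the real helper may read `process.env` directly.

```typescript
type Env = Record<string, string | undefined>;

// Return a child-process environment whose home directory points at `home`
// on both Unix (HOME) and Windows (USERPROFILE). Entries in `extra` win last.
function makeEnv(base: Env, home: string, extra: Env = {}): Env {
  return { ...base, HOME: home, USERPROFILE: home, ...extra };
}

// Usage sketch (names are illustrative):
//   spawnSync(process.execPath, [cliPath, "check"], { env: makeEnv(process.env, tmpHome) });
```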

protect.test.ts: use path.join(os.homedir(), ...) for mock paths in
setPersistentDecision so existsSpy matches on Windows backslash paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): propagate HOME as USERPROFILE in check integration tests

runCheck/runCheckAsync now set USERPROFILE=HOME so spawned node9
processes resolve os.homedir() to the isolated test directory on
Windows. Apply the same fix to standalone spawnSync calls using
minimalEnv. Add EBUSY guard in cleanupHome for Windows temp locking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests,dlp): four Windows CI fixes

mcp.integration.test.ts: use list_directory instead of write_file for
the no-cwd backward-compat test — write_file triggers git add -A on
os.tmpdir() which can index thousands of files on Windows and ETIMEDOUT.

gemini_integration.test.ts: add path import; replace hardcoded
/mock/home/... paths with path.join(os.homedir(), ...) so existsSpy
matches on Windows backslash paths.

daemon.integration.test.ts: add USERPROFILE=tmpHome to daemon spawn
env so os.homedir() resolves to the isolated shields.json. Add EBUSY
guard in cleanupDir.

dlp.ts: broaden /etc/passwd|shadow|sudoers patterns to
^(?:[a-zA-Z]:)?\/etc\/... so they match Windows-normalized paths like
C:/etc/passwd in addition to Unix /etc/passwd.
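
The effect of the optional drive-letter prefix can be sketched as below. The commit elides the tail of the pattern, so the alternation and the end anchor here are assumptions, as is the `isSensitiveEtcPath` helper name.

```typescript
// Optional "C:"-style drive prefix, then /etc/<file>.
// ASSUMPTION: the alternation and the "$" anchor are illustrative; the source
// only shows the pattern up to "\/etc\/".
const SENSITIVE_ETC = /^(?:[a-zA-Z]:)?\/etc\/(?:passwd|shadow|sudoers)$/;

function isSensitiveEtcPath(normalizedPath: string): boolean {
  return SENSITIVE_ETC.test(normalizedPath);
}
```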

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): address code review findings

ci.yml: add format:check step and Node 22 to matrix (package.json
declares >=18 — both LTS versions should be covered).

check/mcp/daemon integration tests: add makeEnv() helpers for
consistent HOME+USERPROFILE isolation; add console.warn on EBUSY
so temp dir leaks are visible rather than silent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): enforce LF line endings so Prettier passes on Windows

Add endOfLine: lf to .prettierrc so Prettier always checks/writes LF
regardless of OS. Add .gitattributes with eol=lf so Git does not
convert line endings on Windows checkout. Without these, format:check
fails on every file on Windows CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): align makeEnv signatures and add dist verification

check.integration.test.ts: makeEnv now spreads process.env (same as
mcp and daemon helpers) so PATH, NODE_ENV=test (set by Vitest), and
other inherited vars reach spawned child processes. Standalone
spawnSync calls simplified to makeEnv(tmpHome, {NODE9_TESTING:'1'}).
Remove unused minimalEnv from shield describe block.

ci.yml: add Verify dist artifacts step after build to fail fast with
a clear message if dist/cli.js or dist/index.js are missing. Add
comment explaining NODE_ENV=test / NODE9_TESTING guard coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: interactive terminal approval via /dev/tty (SSE + [A]/[D])

Replaces the broken @inquirer/prompts stdin racer with a /dev/tty-based
approval prompt that works as a Claude Code PreToolUse subprocess:

- New src/ui/terminal-approval.ts: opens /dev/tty for raw keypress I/O,
  acquires CSRF token from daemon SSE, renders ANSI approval card, reads
  [A]/[D], posts decision via POST /decision/{id}. Handles abort (another
  racer won) with cursor/card cleanup and SIGTERM/exit guard.

- Daemon entry shared between browser (GET /wait) and terminal (POST /decision)
  racers: extract registerDaemonEntry() + waitForDaemonDecision() from the
  old askDaemon() so both racers operate on the same pending entry ID.

- POST /decision idempotency: first write wins; second call returns 409
  with the existing decision. Prevents race between browser and terminal
  racers from corrupting state.

- CSRF token emitted on every SSE connection (re-emit existing token, never
  regenerate). Terminal racer acquires it by opening /events and reading
  the first csrf event.

- approvalTimeoutSeconds user-facing config alias (converts to ms);
  raises default timeout from 30s to 120s. Daemon auto-deny timer and
  browser countdown now use the config value instead of a hardcoded constant.

- isTTYAvailable() probe: tries /dev/tty open(); disabled on Windows
  (native popup racer covers that path). NODE9_FORCE_TERMINAL_APPROVAL=1
  bypasses the probe for tmux/screen users.

- Integration tests: CSRF re-emit across two connections, POST /decision
  idempotency (both allow-first and deny-first cases).
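
The first-write-wins rule for POST /decision can be modelled in memory as in this sketch. The `pending` map, the helper names, and the 404 for an unknown ID are assumptions; only the 409-with-existing-decision behavior is taken from the commit.

```typescript
type Decision = "allow" | "deny";

interface DecisionResult {
  status: 200 | 404 | 409;
  decision?: Decision;
}

// Hypothetical in-memory store standing in for the daemon's pending entries.
const pending = new Map<string, { decision?: Decision }>();

function registerEntry(id: string): void {
  pending.set(id, {});
}

// First write wins: a second call reports 409 together with the stored decision.
function postDecision(id: string, decision: Decision): DecisionResult {
  const entry = pending.get(id);
  if (!entry) return { status: 404 }; // assumption: unknown IDs are rejected
  if (entry.decision !== undefined) {
    return { status: 409, decision: entry.decision };
  }
  entry.decision = decision;
  return { status: 200, decision };
}
```

Whichever racer posts second learns the winning decision from the 409 body and can treat it as a normal outcome.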

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Smart Router — node9 tail as interactive approval terminal

Implements a multi-phase Smart Router architecture so `node9 tail` can
serve as a full approval channel alongside the browser dashboard and
native popup.

Phase 1 — Daemon capability tracking (daemon/index.ts):
- SseClient interface tracks { res, capabilities[] } per SSE connection
- /events parses ?capabilities=input from URL; stored on each client
- broadcast() updated to use client.res.write()
- hasInteractiveClient() exported — true when any tail session is live
- broadcast('add') now fires when terminal approver is enabled and an
  interactive client is connected, not only when browser is enabled
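
Capability tracking of this kind might be sketched as follows. The comma-separated list format and the `SseClientLike` type are assumptions; the source only shows `?capabilities=input`.

```typescript
// Parse "?capabilities=a,b" from an SSE request URL into capability names.
// ASSUMPTION: multiple capabilities are comma-separated.
function parseCapabilities(requestUrl: string): string[] {
  // A base is required because the request URL is usually path-only.
  const url = new URL(requestUrl, "http://localhost");
  const raw = url.searchParams.get("capabilities") ?? "";
  return raw
    .split(",")
    .map((c) => c.trim())
    .filter((c) => c.length > 0);
}

// Minimal stand-in for the per-connection record described above.
interface SseClientLike {
  capabilities: string[];
}

// True when at least one connected client declared the "input" capability.
function hasInteractiveClient(clients: readonly SseClientLike[]): boolean {
  return clients.some((c) => c.capabilities.includes("input"));
}
```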

Phase 2 — Interactive approvals in tail (tui/tail.ts):
- Connects with ?capabilities=input so daemon identifies it as interactive
- Captures CSRF token from the 'csrf' SSE event
- Handles init.requests (approvals pending before tail connected)
- Handles add/remove SSE events; maintains an approval queue
- Shows one ANSI card at a time ([A] Allow / [D] Deny) using
  tty.ReadStream raw-mode keypress on fd 0
- POSTs decisions via /decision/{id} with source:'terminal'; a 409 response is treated as a non-error
- Cards clear themselves; next queued request shown automatically

Phase 3 — Racer 3 widened (core.ts):
- Racer 3 guard changed from approvers.browser to
  (approvers.browser || approvers.terminal) so tail participates in the
  race via the same waitForDaemonDecision mechanism as the browser
- Guidance printed to stderr when browser is off:
  "Run `node9 tail` in another terminal to approve."

Phase 4 — node9 watch command (cli.ts):
- New `watch <command> [args...]` starts the daemon in NODE9_WATCH_MODE=1
  (no idle timeout), prints a tip about node9 tail, then runs the
  wrapped command via spawnSync

Decision source tracking (all layers):
- POST /decision now accepts optional source field ('browser'|'terminal')
- Daemon stores decisionSource on PendingEntry; GET /wait returns it
- waitForDaemonDecision returns { decision, source }
- Racer 3 label uses actual source instead of guessing from config:
  "User Decision (Terminal (node9 tail))" vs "User Decision (Browser Dashboard)"
- Browser UI sends source:'browser'; tail sends source:'terminal'

Tests:
- daemon.integration.test.ts: 3 new tests for source tracking round-trip
  (terminal, browser, and omitted source)
- spawn-windows.test.ts: updated count from 3 to 4 spawn call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: enable /dev/tty approval card in Claude terminal (hook path)

The check command was passing allowTerminalFallback=false to
authorizeHeadless, which disabled Racer 4 (/dev/tty) in the hook path.
This meant the approval card only appeared in the node9 tail terminal,
requiring the user to switch focus to respond.

Change both call sites (initial + daemon-retry) to true so Racer 4 runs
alongside Racer 3. The [A]/[D] card now appears in the Claude terminal
as well — the user can respond from either terminal, whichever has focus.
The 409 idempotency already handles the race correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent background authorizeHeadless from overwriting POST /decision

When POST /decision arrives before GET /wait connects, it sets
earlyDecision on the PendingEntry. The background authorizeHeadless
call (which runs concurrently) could then overwrite that decision in
its .then() handler — visible as the idempotency test getting
'allow' back instead of the posted 'deny'.

Guard: after noApprovalMechanism check, return early if earlyDecision
is already set. First write wins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route sendBlock terminal output to /dev/tty instead of stderr

Claude Code treats any stderr output from a PreToolUse hook as a hook
error and fails open — the tool proceeds even when the hook writes a
valid permissionDecision:deny JSON to stdout. This meant git push and
other blocked commands were silently allowed through.

Fix: replace all console.error calls in the block/deny path with
writes to /dev/tty, an out-of-band channel that bypasses Claude Code's
stderr pipe monitoring. /dev/tty failures are caught silently so CI
and non-interactive environments are unaffected.

Add a writeTty() helper in core.ts used for all status messages in
the hook execution path (cloud error, waiting-for-approval banners,
cloud result). Update two integration tests that previously asserted
block messages appeared on stderr — they now assert stderr is empty,
which is the regression guard for this bug.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: don't auto-resolve daemon entries in audit mode

In audit mode, background authorizeHeadless resolves immediately with
checkedBy:'audit'. The .then() handler was setting earlyDecision='allow'
before POST /decision could arrive from browser/tail, causing subsequent
POST /decision calls to get 409 and GET /wait to return 'allow' regardless
of what the user posted.

Audit mode means the hook auto-approves — it doesn't mean the daemon
dashboard should also auto-resolve. Leave the entry alive so browser/tail
can still interact with it (or the auto-deny timer fires).

Fixes source-tracking integration test failures on CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: close /dev/tty fd in finally block to prevent leak on write error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove Racer 4 (/dev/tty card in Claude terminal)

Racer 4 interrupted the AI's own terminal with an approval prompt,
which is wrong on multiple levels:
- The AI terminal belongs to the AI agent, not the human approver
- Different AI clients (Gemini CLI, Cursor, etc.) handle terminals
  differently — /dev/tty tricks are fragile across environments
- It created duplicate prompts when node9 tail was also running

Approval channels should all be out-of-band from the AI terminal:
  1. Cloud/SaaS (Slack, mission control)
  2. Native OS popup
  3. Browser dashboard
  4. node9 tail (dedicated approval terminal)

Remove: Racer 4 block in core.ts, allowTerminalFallback parameter
from authorizeHeadless/_authorizeHeadlessCore and all callers,
isTTYAvailable/askTerminalApproval imports, terminal-approval.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: make hook completely silent — remove all writeTty calls from core.ts

The hook must produce zero terminal output in the Claude terminal.
All writeTty status messages (shadow mode, cloud handshake failure,
waiting for approval, approved/denied via cloud) have been removed.
Also removed the now-unused chalk import and writeTty helper function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source allowlist, CSRF 403 tests, watch error handling

- daemon/index.ts: validate POST /decision source field against allowlist
  ('terminal' | 'browser' | 'native') — silently drop invalid values to
  prevent audit log injection
- daemon.integration.test.ts: add CSRF 403 test (missing token), CSRF 403
  test (wrong token), and invalid source value test — the three most
  important negative tests flagged by code review
- cli.ts: check result.error in node9 watch so ENOENT exits non-zero
  instead of silently exiting 0
- test helper: use fixed string 'echo register-label' instead of
  interpolated echo ${label} (shell injection hygiene in test code)
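
The allowlist check on the `source` field can be sketched as below. `VALID_SOURCES` is named in a later commit in this log; the `sanitizeSource` helper is a hypothetical name for illustration.

```typescript
const VALID_SOURCES = ["terminal", "browser", "native"] as const;
type DecisionSource = (typeof VALID_SOURCES)[number];

// Return the source only when it is an allowlisted string; otherwise undefined,
// so arbitrary client-supplied text never reaches the audit log.
function sanitizeSource(value: unknown): DecisionSource | undefined {
  return typeof value === "string" && (VALID_SOURCES as readonly string[]).includes(value)
    ? (value as DecisionSource)
    : undefined;
}
```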

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove stderr write from askNativePopup; drop sendDesktopNotification

- native.ts: process.stderr.write in askNativePopup caused Claude Code to
  treat the hook as an error and fail open — removed entirely
- core.ts: sendDesktopNotification called notify-send which routes through
  Firefox on Linux (D-Bus handler), causing spurious browser popups —
  removed the audit-mode notification call and unused import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass cwd through hook→daemon so project config controls browser open

Root cause: daemon called getConfig() without cwd, reading the global
config. If ~/.node9/node9.config.json doesn't exist, approvers default
to true — so browser:false in a project config was silently ignored,
causing the daemon to open Firefox on every pending approval.

Fix:
- cli.ts: pass cwd from hook payload into authorizeHeadless options
- core.ts: propagate cwd through _authorizeHeadlessCore → registerDaemonEntry
  → POST /check body; use getConfig(options.cwd) so project config is read
- daemon/index.ts: extract cwd from POST /check, call getConfig(cwd)
  for browserEnabled/terminalEnabled checks
- native.ts: remove process.stderr.write from askNativePopup (fail-open bug)
- core.ts: remove sendDesktopNotification (notify-send routes through Firefox
  on Linux via D-Bus, causing spurious browser notifications)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always broadcast 'add' when terminalEnabled — restore tail visibility

After the cwd fix, browserEnabled correctly became false when browser:false
is set in project config. But the broadcast condition gated on
hasInteractiveClient(), which returns false if tail isn't connected at the
exact moment the check arrives — silently dropping entries from tail.

Fix: broadcast whenever browserEnabled OR terminalEnabled, regardless of
client connection state. Tail sees pending entries via the SSE stream's
initial state when it connects, so timing of connection doesn't matter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: hybrid security model — local UI always wins the race

- Remove isRemoteLocked: terminal/browser/native racers always participate
  even when approvers.cloud is enabled; cloud is now audit-only unless it
  responds first (headless VM fallback)
- Add decisionSource field to AuthResult so resolveNode9SaaS can report
  which channel decided (native/terminal/browser) as decidedBy in the PATCH
- Fix resolveNode9SaaS: log errors to hook-debug.log instead of silent catch
- Fix tail [A]/[D] keypresses: switch from raw 'data' buffer to readline
  emitKeypressEvents + 'keypress' events — fixes unresponsive cards
- Fix tail card clear: SAVE/RESTORE cursor instead of fragile MOVE_UP(n)
- Add cancelActiveCard so 'remove' SSE event properly dismisses active card
- Fix daemon duplicate browser tab: browserOpened flag + NODE9_BROWSER_OPENED
  env so auto-started daemon and node9 tail don't both open a tab
- Fix slackDelegated: skip background authorizeHeadless to prevent duplicate
  cloud request that never resolves in Mission Control
- Add interactive field to SSE 'add' event so browser-only configs don't
  render a terminal card
- Add dev:tail script that parses JSON PID file correctly
- Add req.on('close') cleanup for abandoned long-poll entries
- Add regression tests for all three bugs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear CI env var in test to unblock native racer on GitHub Actions

Also make the poll fetch mock respond to AbortSignal so the cloud poll
racer exits cleanly when native wins, preventing test timeout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source injection tests, dev:tail safety, CI env guard

- Add explicit source boundary tests: null/number/object are all rejected
  by the VALID_SOURCES allowlist (implementation was already correct)
- Replace kill $(...) shell expansion in dev:tail with process.kill() inside
  Node.js; this removes the $() substitution vulnerability if the pid file were crafted
- Add afterEach safety net in core.test.ts to restore VITEST/CI/NODE_ENV
  in case the test crashes before the try/finally block restores them
- Increase slackDelegated timing wait from 200ms to 500ms for slower CI
- Fix section numbering gap left after removing a test: section 11 is now 10

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use early return in it.each — Vitest does not pass context to it.each callbacks

Context injection via { skip } works in plain it() but not in it.each(),
where the third argument is undefined. Switch to early return, which is
equivalent since the entire describe block skips when portWasFree is false.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): correct YAML indentation in ci.yml — job properties were siblings not children

name/runs-on/strategy/steps were indented 2 spaces (sibling to `test:`)
instead of 4 spaces (properties of the `test:` job). GitHub Actions was
ignoring the custom name template, so checks were reported without the
Node version suffix and the required branch-protection check
"CI / Test (ubuntu-latest, Node 20)" was stuck as "Expected" forever.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add 15s timeout to daemon beforeAll hooks — prevents CI timeout

waitForDaemon(6s) + readSseStream(3s) = 9s minimum; the default Vitest
hookTimeout of 10s is too tight on slow CI runners (Ubuntu, Windows).
All three daemon describe-block beforeAll hooks now declare an explicit
15_000ms timeout to give CI sufficient headroom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add daemonProc.kill() fallback in afterAll cleanup

If `node9 daemon stop` fails or times out, the spawned daemon process
would leak. Added daemonProc?.kill() as a defensive fallback after
spawnSync in all three daemon describe-block afterAll hooks.

The CSRF 403 tests (missing/wrong token) already exist at lines 574-598
and were flagged as absent only because the bot's diff was truncated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): address code review — vi.stubEnv, runtime shape checks, abandon-timer comment

- core.test.ts: replace manual env save/delete/restore with vi.stubEnv +
  vi.unstubAllEnvs() in afterEach. Eliminates the fragile try/finally and
  the risk of coercing undefined to the string "undefined". Adds a KEEP IN
  SYNC comment so future isTestEnv additions are caught immediately.

- daemon.integration.test.ts: replace unchecked `as { ... }` casts in
  idempotency tests with `unknown` + toMatchObject — gives a clear failure
  message if the response shape is wrong instead of silently passing.

- daemon.integration.test.ts: add comment explaining why idempotency tests
  do not need a /wait consumer — the abandon timer only fires when an SSE
  connection closes with pending items; no SSE client connects during
  these tests so entries are safe from eviction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): guard daemonProc.kill() with exitCode check — avoid spurious SIGTERM

Calling daemonProc.kill() unconditionally after a successful `daemon stop`
sends SIGTERM to an already-dead process, which can produce a spurious error
log on some platforms. Only kill if exitCode === null (process still running).
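
The guard amounts to a few lines. In this sketch, `KillableProcess` is a structural stand-in for Node's `ChildProcess` (which exposes `exitCode` and `kill()`), and `killIfRunning` is a hypothetical helper name.

```typescript
// Structural subset of Node's ChildProcess used by the guard.
interface KillableProcess {
  exitCode: number | null;
  kill(): boolean;
}

// Send the default signal only if the process has not exited yet.
// Returns true when a signal was actually sent.
function killIfRunning(proc: KillableProcess | undefined): boolean {
  if (!proc || proc.exitCode !== null) return false;
  return proc.kill();
}
```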

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add @vitest/coverage-v8 — baseline coverage report (PR #0)

Installs @vitest/coverage-v8 and configures coverage in vitest.config.mts.
Adds `npm run test:coverage` script.

Baseline (instrumentable files only — cli.ts and daemon/index.ts are
subprocess-only and cannot be instrumented by v8):

  File     Stmts    Branches
  Overall  67.68%   58.74%
  core.ts  62.02%   54.13%   ← primary refactor target
  undo.ts  87.01%   80.00%
  shields  97.46%   94.64%
  dlp.ts   94.82%   92.85%
  setup    93.67%   80.92%

This baseline will be used to verify coverage improves (or holds) after
each incremental refactor PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr1): extract src/audit/ and src/config/ from core.ts

Move audit helpers (redactSecrets, appendToLog, appendHookDebug,
appendLocalAudit, appendConfigAudit) to src/audit/index.ts and
move all config types, constants, and loading logic (Config,
SmartRule, DANGEROUS_WORDS, DEFAULT_CONFIG, getConfig, getCredentials,
getGlobalSettings, hasSlack, listCredentialProfiles) to
src/config/index.ts.

core.ts kept as barrel re-exporting from the new modules so all
existing importers (cli.ts, daemon/index.ts, tests) are unchanged.
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: remove trailing blank lines in core.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr2): extract src/policy/ and src/utils/regex from core.ts

Move the entire policy engine to src/policy/index.ts:
  evaluatePolicy, explainPolicy, shouldSnapshot, evaluateSmartConditions,
  checkDangerousSql, isIgnoredTool, matchesPattern and all private
  helpers (tokenize, getNestedValue, extractShellCommand, analyzeShellCommand).

Move ReDoS-safe regex utilities to src/utils/regex.ts:
  validateRegex, getCompiledRegex — no deps on config or policy,
  consumed by both policy/ and cli.ts via core.ts barrel.

core.ts is now ~300 lines (auth + daemon I/O only).
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: restore dev branch to push trigger

CI should run on direct pushes to dev (merge commits, dependency
bumps, etc.), not just on PRs. Flagged by two independent code
review passes on the coverage PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add timeouts to execSync and spawnSync to prevent CI hangs

- doctor command: add timeout:3000 to execSync('which node9') and
  execSync('git --version') — on slow CI machines these can block
  indefinitely and cause the 5000ms vitest test timeout to fire
- runDoctor test helper: add timeout:15000 to spawnSync so the subprocess
  has enough headroom on slow CI without hitting the vitest timeout
- removefrom test loop: increase spawnSync timeout 5000→15000 and add
  result.error assertion for better failure diagnostics on CI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract src/auth/ from core.ts

Split the authorization race engine out of core.ts into 4 focused modules:
- auth/state.ts  — pause, trust sessions, persistent decisions
- auth/daemon.ts — daemon PID check, entry registration, long-polling
- auth/cloud.ts  — SaaS handshake, poller, resolver, local-allow audit
- auth/orchestrator.ts — multi-channel race engine (authorizeHeadless)

core.ts is now a 40-line backwards-compat barrel. 509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — coverage thresholds + undici vuln

- vitest.config.mts: add coverage thresholds at current baseline (68%
  stmts, 58% branches, 66% funcs, 70% lines) so CI blocks regressions.
  Add json-summary reporter for CI integration. Exclude core.ts (barrel,
  no executable code) and ui/native.ts (OS UI, untestable in CI).
- package.json: pin undici to ^7.24.0 via overrides to resolve 6 high
  severity vulnerabilities in dev deps (@semantic-release, @actions).
  Remaining 7 vulns are in npm-bundled packages (not fixable without
  upgrading npm itself) and dev-only tooling (eslint, handlebars).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: enforce coverage thresholds in CI pipeline

Add coverage step to CI workflow that runs vitest --coverage on
ubuntu/Node 22 only (avoids matrix cost duplication). Thresholds
configured in vitest.config.mts will fail the build if coverage drops
below baseline, closing the gap flagged in code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: fix double test run — merge coverage into single test step

Replace the two-step (npm test + npm run test:coverage) pattern with a
single conditional: ubuntu/Node 22 runs test:coverage (enforces
thresholds), all other matrix cells run npm test. No behaviour change,
half the execution time on the primary matrix cell.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract proxy, negotiation, and duration from cli.ts

- src/proxy/index.ts — runProxy() MCP/JSON-RPC stdio interception
- src/policy/negotiation.ts — buildNegotiationMessage() AI block messages
- src/utils/duration.ts — parseDuration() human duration string parser
- cli.ts: 2088 → 1870 lines, now imports from focused modules

509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract shield, check, log commands into focused modules

Moves registerShieldCommand, registerConfigShowCommand, registerCheckCommand,
and registerLogCommand into src/cli/commands/. Extracts autoStartDaemonAndWait
and openBrowserLocal into src/cli/daemon-starter.ts.

cli.ts drops from ~1870 to ~1120 lines. Unused imports removed. Spawn
Windows regression test updated to cover the moved autoStartDaemonAndWait
call site in daemon-starter.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): use string comparison for matrix.node and document coverage intent

GHA expression matrix values are strings; matrix.node == 22 (integer) silently
fails, so coverage never ran on any cell. Fixed to matrix.node == '22'.

Added comments to ci.yml explaining the intentional single-cell threshold
enforcement (branch protection must require the ubuntu/Node 22 job), and
to vitest.config.mts explaining the baseline date and target trajectory.

Also confirmed: npm ls undici shows 7.24.6 everywhere — no conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): replace fragile matrix ternary with dedicated coverage job

Removes the matrix.node == '22' ternary from the test matrix. Coverage now
runs in a standalone 'coverage' job (ubuntu/Node 22 only) that can be
required by name in branch protection — no risk of the job name drifting
or the selector silently failing.

Also adds a comment to tsup.config.ts documenting why devDependency coverage
tooling (@vitest/coverage-v8, @rolldown/*) cannot leak into the production
bundle (tree-shaking — nothing in src/ imports them).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step and NODE9_TESTING=1 to coverage job; bump to v1.2.0

Coverage job was missing npm run build, causing integration tests to fail
with "dist/cli.js not found". Also adds NODE9_TESTING=1 env var to prevent
native popup dialogs and daemon auto-start during coverage runs in CI.

Version bumped to 1.2.0 to reflect the completed modular refactor
(core.ts + cli.ts split into focused single-responsibility modules).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add npm audit --omit=dev --audit-level=high to test job

Audits production deps on every CI run. Scoped to --omit=dev because
known CVEs in flatted (eslint chain) and handlebars (semantic-release chain)
are devDep-only and never ship in the production bundle. Production tree
currently shows 0 vulnerabilities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract doctor, audit, status, daemon, watch, undo commands

Moves 6 remaining large commands into src/cli/commands/:
  doctor.ts    — health check (165 lines, owns pass/fail/warn helpers)
  audit.ts     — audit log viewer with formatRelativeTime
  status.ts    — current mode/policy/pause display
  daemon-cmd.ts — daemon start/stop/openui/background/watch
  watch.ts     — watch mode subprocess runner
  undo.ts      — snapshot diff + revert UI

cli.ts: 1,141 → 582 lines. Unused imports (execSync, spawnSync, undo funcs,
getCredentials, DAEMON_PORT/HOST) removed. spawn-windows regression test
updated to cover the new module locations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(daemon): split 1026-line index.ts into state, server, barrel

daemon/state.ts  (360 lines) — all shared mutable state, types, utility
  functions, SSE broadcast, Flight Recorder Unix socket, and the
  abandonPending / hadBrowserClient / abandonTimer accessors needed to
  avoid direct ES module let-export mutation across file boundaries.

daemon/server.ts (668 lines) — startDaemon() HTTP server and all route
  handlers (/check, /wait, /decision, /events, /settings, /shields, etc.).
  Imports everything it needs from state.ts; no circular dependencies.

daemon/index.ts  (58 lines) — thin barrel: re-exports public API
  (startDaemon, stopDaemon, daemonStatus, DAEMON_PORT, DAEMON_HOST,
  DAEMON_PID_FILE, DECISIONS_FILE, AUDIT_LOG_FILE, hasInteractiveClient).

Also fixes two startup console.log → console.error (stdout must stay
clean for MCP/JSON-RPC per CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): handle ENOTEMPTY in cleanupDir on Windows CI

Windows creates system junctions (AppData\Local\Microsoft\Windows)
inside any directory set as USERPROFILE, making rmSync fail with
ENOTEMPTY even after recursive deletion. These junctions are harmless
to leak from a temp dir; treat them the same as EBUSY.
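
A minimal sketch of the cleanup behaviour described above, assuming a helper shaped roughly like this. The name `isIgnorableCleanupError` and the boolean return value are illustrative assumptions, not the project's actual code.

```typescript
import { rmSync } from "node:fs";

// Error codes treated as harmless when removing a temp dir (per the commit text).
const IGNORABLE_CODES = new Set(["EBUSY", "ENOTEMPTY"]);

// Hypothetical helper: true when the error carries one of the ignorable codes.
function isIgnorableCleanupError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && IGNORABLE_CODES.has(code);
}

// Returns true when the directory was removed (or never existed),
// false when it was deliberately leaked, and rethrows anything else.
function cleanupDir(dir: string): boolean {
  try {
    rmSync(dir, { recursive: true, force: true });
    return true;
  } catch (err) {
    if (isIgnorableCleanupError(err)) return false;
    throw err;
  }
}
```

Rethrowing unknown codes keeps real failures (for example a permissions error) visible instead of hiding them behind the Windows workaround.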

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add NODE9_TESTING=1 to test job for consistency with coverage

Without it, spawned child processes in the test matrix could trigger
native popups or daemon auto-start. Matches the coverage job which
already set this env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): raise npm audit threshold to moderate

Node9 sits on the critical path of every agent tool call — a
moderate-severity prod vuln (e.g. regex DoS in a request parser)
is still exploitable in this context. 0 vulns at moderate level
confirmed before raising the bar.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add missing coverage for auth/state, timeout racer, and daemon unknown-ID

- auth-state.test.ts (new): 18 tests covering checkPause (all branches
  including expired file auto-delete and indefinite expiry), pauseNode9,
  resumeNode9, getActiveTrustSession (wildcard, prune, malformed JSON),
  writeTrustSession (create, replace, prune expired entries)
- core.test.ts: timeout racer test — approvalTimeoutMs:50 fires before any
  other channel, returns approved:false with blockedBy:'timeout'
- daemon.integration.test.ts: POST /decision with unknown UUID → 404
- vitest.config.mts: raise thresholds to match new baseline
  (statements 68→70, branches 58→60, functions 66→70, lines 70→71)

auth/state.ts coverage: 30% → 96% statements, 28% → 89% branches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin @vitest/coverage-v8 to exact version 4.1.2

RC transitive deps (@rolldown/binding-* at 1.0.0-rc.12) are pulled in
via coverage-v8. Pinning prevents silent drift to a newer RC that could
change instrumentation behaviour or introduce new RC-stage transitive deps.

Also verified: obug@2.1.1 is a legitimate MIT-licensed debug utility
from the @vitest/sxzz ecosystem — not a typosquat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add needs: [test] to coverage job

Prevents coverage from producing a misleading green check when the test
matrix fails. Coverage now only runs after all test jobs pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): raise Vitest timeouts for slow CI tests

- cloud denies: approvalTimeoutMs:3000 means the check process runs ~3s
  before the mock cloud responds; default 5s Vitest limit was too tight.
  Raised to 15s.
- doctor 'All checks passed': spawns a subprocess that runs `ss` for
  port detection — slow on CI runners. Raised to 20s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin vitest to 4.1.2 to match @vitest/coverage-v8

Both packages must stay in sync — a peer version mismatch causes silent
instrumentation failures. Pinning both to the same exact version prevents
drift when ^ would otherwise allow vitest to bump independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MCP gateway — transparent stdio proxy for any MCP server

Adds `node9 mcp-gateway --upstream <cmd>` which wraps any MCP server
as a transparent stdio proxy. Every tools/call is intercepted and run
through the full authorization engine (DLP, smart rules, shields,
human approval) before being forwarded to the upstream server.

Key implementation details:
- Deferred exit: authPending flag prevents process.exit() while auth
  is in flight, so blocked-tool responses are always flushed first
- Deferred stdin end: mirrors the same pattern for child.stdin.end()
  so approved messages are written before stdin is closed
- Approved writes happen inside the try block, before finally runs
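
The deferred-exit idea above can be sketched as a small gate around the exit call. This is an illustration under assumptions: `createExitGate` and its method names are hypothetical, and the real gateway may track the `authPending` flag differently.

```typescript
// Hypothetical gate: exit requests that arrive while authorization is in
// flight are remembered and replayed once the auth step finishes.
function createExitGate(exit: (code: number) => void) {
  let authPending = false;
  let deferredCode: number | null = null;

  return {
    beginAuth(): void {
      authPending = true;
    },
    endAuth(): void {
      authPending = false;
      if (deferredCode !== null) {
        const code = deferredCode;
        deferredCode = null;
        exit(code); // the response has been written by now, so exiting is safe
      }
    },
    requestExit(code: number): void {
      if (authPending) {
        deferredCode = code; // do not exit yet, a response is still pending
        return;
      }
      exit(code);
    },
  };
}
```

In real use `exit` would be `process.exit`, and `endAuth()` would be called from a `finally` block after the response is written.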

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address code review feedback

- Explicit ignoredTools in allowed-tool test (no implicit default dep)
- Assert result.status === 0 in all success-case tests (null = timeout)
- Throw result.error in runGateway helper so timeout-killed process fails
- Catch ENOTEMPTY in cleanupDir alongside EBUSY (Windows junctions)
- Document parseCommandString is shell-split only, not shell execution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): id validation, resilience tests, review fixes

- Validate JSON-RPC id is string|number|null; return -32600 for object/array ids
- Add resilience tests: invalid upstream JSON forwarded as-is, upstream crash
- Fix runGateway() to accept optional upstreamScript param
- Add status assertions to all blocked-tool tests
- Document parseCommandString safety in mcp-gateway source
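
A minimal sketch of the id check described in the first bullet, assuming the gateway answers with a standard JSON-RPC 2.0 "Invalid Request" error. The function names and the error message text are illustrative assumptions.

```typescript
type JsonRpcId = string | number | null;

interface JsonRpcErrorResponse {
  jsonrpc: "2.0";
  id: null;
  error: { code: number; message: string };
}

// JSON-RPC 2.0 allows a string, a number, or null as the request id.
function isValidJsonRpcId(id: unknown): id is JsonRpcId {
  return id === null || typeof id === "string" || typeof id === "number";
}

// Returns an error response for object/array ids, or null when the message
// is acceptable. A message without an id is a notification and passes through.
function validateRequestId(message: { id?: unknown }): JsonRpcErrorResponse | null {
  if (!("id" in message) || isValidJsonRpcId(message.id)) return null;
  return {
    jsonrpc: "2.0",
    id: null, // the offending id cannot be echoed back safely
    error: { code: -32600, message: "Invalid Request: id must be a string, number, or null" },
  };
}
```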

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): fix shell tokenizer to handle quoted paths with spaces

Replace execa's parseCommandString (which did not handle shell quoting)
with a custom tokenizer that strips double-quotes and respects backslash
escapes. Adds review-driven test improvements: mock upstream silently
drops notifications, runGateway guards killed-by-signal status, shadowed
variable renamed, DLP test builds credential at runtime, and a new test for
an upstream path containing spaces.
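
A tokenizer with the two properties named above (double quotes are stripped, a backslash escapes the next character) could look roughly like the sketch below. `tokenizeCommand` is a hypothetical name, and the handling of unterminated quotes is an assumption rather than the project's documented behaviour.

```typescript
// Splits a command string into argv-style tokens. It only splits:
// nothing here is passed to a shell or executed.
function tokenizeCommand(input: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inQuotes = false;
  let started = false; // true once the current token has begun (allows "" tokens)

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === "\\" && i + 1 < input.length) {
      current += input.charAt(i + 1); // take the escaped character literally
      i++;
      started = true;
    } else if (ch === '"') {
      inQuotes = !inQuotes; // quotes delimit, they are not part of the token
      started = true;
    } else if (!inQuotes && /\s/.test(ch)) {
      if (started) {
        tokens.push(current);
        current = "";
        started = false;
      }
    } else {
      current += ch;
      started = true;
    }
  }
  if (inQuotes) throw new Error("Unterminated double quote in command string");
  if (started) tokens.push(current);
  return tokens;
}
```

One consequence of treating backslash as an escape is that unquoted Windows-style paths would need doubled backslashes in this sketch.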

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #4 — hermetic env, null-status guard, new tests

- Filter all NODE9_* env vars in runGateway so local developer config
  (NODE9_MODE, NODE9_API_KEY, NODE9_PAUSED) cannot pollute test isolation
- Fix status===null guard: no longer requires result.signal to also be set;
  a killed/timed-out process always throws, preventing silent false passes
- Extract parseResponses() helper to eliminate repeated JSON.parse cast pattern
- Reduce default runGateway timeout from 8s → 5s for fast-path tests
- Add test: malformed (non-JSON) input is forwarded to upstream unchanged
- Add test: tools/call notification (no id) is forwarded without gateway response
- Add comments: mock upstream uses unbuffered stdout.write; /tmp/test.txt path
  is safe because mock never reads disk; NODE9_TESTING only disables UI approvers
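
The env filtering in the first bullet can be sketched as a pure function. `hermeticEnv` is a hypothetical name; the assumption is that the test helper strips every `NODE9_*` variable and then re-adds only the ones a test sets explicitly.

```typescript
type Env = Record<string, string | undefined>;

// Copies `base` without any NODE9_* variables, then applies explicit overrides.
function hermeticEnv(base: Env, overrides: Record<string, string> = {}): Env {
  const env: Env = {};
  for (const [key, value] of Object.entries(base)) {
    if (!key.startsWith("NODE9_")) env[key] = value;
  }
  return { ...env, ...overrides };
}
```

A test helper would then pass something like `hermeticEnv(process.env, { NODE9_TESTING: "1" })` as the child's `env`, so a developer's local `NODE9_MODE` or `NODE9_PAUSED` cannot change the outcome.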

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #5 — isolation, typing, timeouts, README note

- afterAll: log cleanup failures to stderr instead of silently swallowing them
- runGateway: document PATH is safe (all spawns use absolute paths); expand
  NODE9_TESTING comment to reference exact source location of what it suppresses
- Replace /tmp/test.txt with /nonexistent/node9-test-only so intent is unambiguous
- Tighten blocked-tool test timeout: 5000ms → 2000ms (approvalTimeoutMs=100ms,
  so a hung auth engine now surfaces as a clear failure rather than a late pass)
- GatewayResponse.result: add explicit tools/ok fields so Array.isArray assertion
  has accurate static type information
- README: add note clarifying --upstream takes a single command string (tokenizer
  splits it); explain double-quoted paths for paths with spaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #6 — diagnostics, type safety, error handling

- Timeout error now includes partial stdout/stderr so hung gateway failures
  are diagnosable instead of silently discarding the output buffer
- Mock upstream catch block writes to stderr instead of empty catch {} so
  JSON-RPC parse errors surface in test output rather than causing a hang
- parseResponses wraps JSON.parse in try/catch and rethrows with the
  offending line, replacing cryptic map-thrown errors with useful context
- GatewayResponse.result: replace redundant Record<string,unknown> & {.…
node9ai added a commit that referenced this pull request Mar 29, 2026
* fix: address code review — Slack regex bound, remove redundant parser, notMatchesGlob consistency, applyUndo empty-set guard

- dlp: cap Slack token regex at {1,100} to prevent unbounded scan on crafted input
- core: remove 40-line manual paren/bracket parser from validateRegex — redundant
  with the final new RegExp() compile check, which catches the same errors more cleanly
- core: fix notMatchesGlob — absent field returns true (vacuously not matching),
  consistent with notContains; missing cond.value still fails closed
- undo: guard applyUndo against ls-tree failure returning empty set, which would
  cause every file in the working tree to be deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — compile-before-saferegex, Slack lower bound, ls-tree guard logging, missing tests

- core: move new RegExp() compile check BEFORE safe-regex2 so structurally invalid
  patterns (unbalanced parens/brackets) are rejected before reaching NFA analysis
- dlp: tighten Slack token lower bound from {1,100} to {20,100} to reduce false
  positives on short, truncated token-like strings
- undo: add NODE9_DEBUG log before early return in applyUndo ls-tree guard for
  observability into silent failures
- test(core): add 'structurally malformed patterns still rejected' regression test
  confirming compile-check order after manual parser removal
- test(core): add notMatchesGlob absent-field test with security comment documenting
  the vacuous-true behaviour and how to guard against it
- test(undo): add applyUndo ls-tree non-zero exit test confirming no files deleted
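
The compile-first ordering in the first bullet can be sketched as follows. The safety check is injected as a parameter because safe-regex2 is an external dependency; `naiveNestedQuantifierCheck` is a crude stand-in written for this example and is not what safe-regex2 does.

```typescript
type SafetyCheck = (pattern: string) => boolean;

// Returns an error message, or null when the pattern is accepted.
function validateRegex(pattern: string, isSafe: SafetyCheck): string | null {
  // Step 1: compile. Structurally invalid patterns stop here and never
  // reach the (more expensive) safety analysis.
  try {
    new RegExp(pattern);
  } catch (err) {
    return `Invalid regex: ${(err as Error).message}`;
  }
  // Step 2: reject patterns that compile but may backtrack catastrophically.
  if (!isSafe(pattern)) return "Regex rejected: potential catastrophic backtracking";
  return null;
}

// Stand-in only: flags a quantified group that itself ends in a quantifier,
// such as (a+)+. A real analyser covers far more cases.
const NESTED_QUANTIFIER = /\([^()]*[+*]\)[+*]/;
function naiveNestedQuantifierCheck(pattern: string): boolean {
  return !NESTED_QUANTIFIER.test(pattern);
}
```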

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): swap spawnResult args order — stdout first, status second

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in undo.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: revert notMatchesGlob to fail-closed, warn on ls-tree failure, document empty-stdout gap

- core: revert notMatchesGlob absent-field to fail-closed (false) — an attacker
  omitting a field must not satisfy a notMatchesGlob allow rule; rule authors
  needing pass-when-absent must pair with an explicit 'notExists' condition
- undo: log ls-tree failure unconditionally to stderr (not just NODE9_DEBUG) since
  this is an unexpected git error, not normal flow — silent false is undebuggable
- dlp: add comment on Slack token bound rationale (real tokens ~50–80 chars)
- test(core): fix notMatchesGlob fail-closed test — use delete_file (dangerous word)
  so the allow rule actually matters; write was allowed by default regardless
- test(undo): add test documenting the known gap where ls-tree exits 0 with empty
  stdout still produces an empty snapshotFiles set
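
A minimal sketch of the fail-closed behaviour in the first bullet. The condition shape, the function names, and the tiny glob matcher (only `*` is a wildcard) are assumptions made for illustration; the real engine may use a full glob library.

```typescript
interface GlobCondition {
  field: string;
  value?: string;
}

// Illustrative glob support: `*` matches any run of characters, everything
// else is matched literally.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Fail closed: a missing pattern or an absent/non-string field never
// satisfies the condition, so omitting a field cannot unlock an allow rule.
function notMatchesGlob(cond: GlobCondition, args: Record<string, unknown>): boolean {
  if (typeof cond.value !== "string") return false;
  const actual = args[cond.field];
  if (typeof actual !== "string") return false;
  return !globToRegExp(cond.value).test(actual);
}
```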

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(undo): guard against ls-tree status-0 empty-stdout mass-delete footgun

Add snapshotFiles.size === 0 check after the non-zero exit guard. When ls-tree
exits 0 but produces no output, snapshotFiles would be empty and every tracked
file in the working tree would be deleted. Abort and warn unconditionally instead.

Also convert the 'known gap' documentation test into a real regression test that
asserts false return and no unlinkSync calls.
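
The two guards can be sketched as one function that either yields a usable snapshot file set or aborts. The names, the result shape, and the warning text are assumptions for this example; only the two conditions (non-zero exit, empty set) come from the commits above.

```typescript
interface LsTreeResult {
  status: number | null;
  stdout: string;
}

// Returns the snapshot's file set, or null when it is unsafe to continue.
function snapshotFilesOrAbort(result: LsTreeResult): Set<string> | null {
  if (result.status !== 0) {
    process.stderr.write(`[undo] git ls-tree failed (status ${result.status}); aborting\n`);
    return null;
  }
  const files = new Set(
    result.stdout
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
  );
  if (files.size === 0) {
    // With an empty set every working-tree file would look "created after
    // the snapshot" and be deleted, so refuse to proceed.
    process.stderr.write("[undo] snapshot lists no files; aborting\n");
    return null;
  }
  return files;
}

// Files present now but absent from the snapshot are the deletion candidates.
function filesCreatedAfterSnapshot(workingTree: string[], snapshot: Set<string>): string[] {
  return workingTree.filter((file) => !snapshot.has(file));
}
```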

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(undo): assert stderr warning in ls-tree failure tests

Add vi.spyOn(process.stderr, 'write') assertions to both new applyUndo tests
to verify the observability messages are actually emitted on failure paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: banner to stderr for MCP stdio compat; log command cwd handling and error visibility

Two bugs from issue #33:

1. runProxy banner went to stdout via console.log, corrupting the JSON-RPC stream
   for stdio-based MCP servers. Fixed: console.error so stdout stays clean.

2. 'node9 log' PostToolUse hook was silently swallowing all errors (catch {})
   and not changing to payload.cwd before getConfig() — unlike the 'check'
   command which does both. If getConfig() loaded the wrong project config,
   shouldSnapshot() could throw on a missing snapshot policy key, silently
   killing the audit.log write with no diagnostic output.
   Fixed: add cwd + _resetConfigCache() mirroring 'check'; surface errors to
   hook-debug.log when enableHookLogDebug is set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate process.chdir race condition in hook commands

Pass payload.cwd directly to getConfig(cwd?) instead of calling
process.chdir() which mutates process-global state and would race
with concurrent hook invocations.

- getConfig() gains optional cwd param: bypasses cache read/write
  when an explicit project dir is provided, so per-project config
  lookups don't pollute the ambient interactive-CLI cache
- check and log commands: remove process.chdir + _resetConfigCache
  blocks; pass payload.cwd directly to getConfig()
- log command catch block: remove getConfig() re-call (could re-throw
  if getConfig() was the original error source); use NODE9_DEBUG only
- Remove now-unused _resetConfigCache import from cli.ts
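
A minimal sketch of the cache-bypass rule in the first bullet. The loader is injected so the example stays self-contained; `createGetConfig`, `Config`, and the loader signature are hypothetical and not the project's API.

```typescript
interface Config {
  source: string;
}

// `load` reads config for a directory; `ambientDir` supplies the default dir.
function createGetConfig(load: (dir: string) => Config, ambientDir: () => string) {
  let cache: Config | null = null;

  return function getConfig(cwd?: string): Config {
    // Explicit project dir: no cache read and no cache write, so hook
    // invocations for other projects cannot pollute the ambient cache.
    if (cwd) return load(cwd);
    if (cache === null) cache = load(ambientDir());
    return cache;
  };
}
```

Because nothing process-global is mutated, two hook invocations with different `cwd` values cannot interfere with each other the way `process.chdir()` calls could.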

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always write LOG_ERROR to hook-debug.log; clarify ReDoS test intent

- log catch block: remove NODE9_DEBUG guard — this catch guards the
  audit trail so errors must always be written to hook-debug.log,
  not only when NODE9_DEBUG=1
- validateRegex test: rename and expand the safe-regex2 NFA test to
  explicitly assert that (a+)+ compiles successfully (passes the
  compile-first step) yet is still rejected by safe-regex2, confirming
  the reorder did not break ReDoS protection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(mcp): integration tests for #33 regression coverage

Add mcp.integration.test.ts with 4 tests covering both bugs from #33:

1. Proxy stdout cleanliness (2 tests):
   - banner goes to stderr; stdout contains only child process output
   - stdout stays valid JSON when child writes JSON-RPC — banner does not corrupt stream

2. Log command cross-cwd audit write (2 tests):
   - writes to audit.log when payload.cwd differs from process.cwd() (the actual #33 bug)
   - writes to audit.log when no cwd in payload (backward compat)

These tests would have caught both regressions at PR time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — cwd guard, test assertions, exit-0 comment

- getConfig(payload.cwd || undefined): use || instead of ?? to also
  guard against empty string "" which path.join would silently treat
  as relative-to-cwd (same behaviour as the fallback, but explicit)
- log catch block: add comment documenting the intentional exit(0)-on-
  audit-failure tradeoff — non-zero would incorrectly signal tool failure
  to Claude/Gemini since the tool already executed
- mcp.integration.test.ts: assert result.error and result.status on
  every spawnSync call so spawn failures surface loudly instead of
  silently matching stdout === '' checks
- mcp.integration.test.ts: add expect(result.stdout.trim()).toBeTruthy()
  before JSON.parse for clearer diagnostic on stdout-empty failures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add CLAUDE.md rules and pre-commit enforcement hook

CLAUDE.md: documents PR checklist, test rules, and code rules that
Claude Code reads automatically at the start of every session:
- PR checklist (tests, typecheck, format, no console.log in hooks)
- Integration test requirements for subprocess/stdio/filesystem code
- Architecture notes (getConfig(cwd?), audit trail, DLP, fail-closed)

.git/hooks/pre-commit: enforces the checklist on every commit:
- Blocks console.log in src/cli, src/core, src/daemon
- Runs npm run typecheck
- Runs npm run format:check
- Runs npm test when src/ implementation files are changed
- Emergency bypass: git commit --no-verify

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — test isolation, stderr on audit gap, nonexistent cwd

- mcp.integration.test.ts: replace module-scoped tempDirs with per-describe
  beforeEach/afterEach and try/finally — eliminates shared-array interleave
  risk if tests ever run with parallelism
- mcp.integration.test.ts: add test for nonexistent payload.cwd — verifies
  getConfig falls back to global config gracefully instead of throwing
- cli.ts log catch: emit [Node9] audit log error to stderr so audit gaps
  surface in the tool output stream without requiring hook-debug.log checks
- core.ts getConfig: add comment documenting intentional nonexistent-cwd
  fallback behavior (tryLoadConfig returns null → global config used)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: audit write before config load; validate cwd; test corrupt-config gap

Two blocking issues from review:

1. getConfig() was called BEFORE appendFileSync — a config load failure
   (corrupt JSON, permissions error) would throw and skip the audit write,
   reintroducing the original silent audit gap. Fixed by moving the audit
   write unconditionally before the config load.

2. payload.cwd was passed to getConfig() unsanitized — a crafted hook
   payload with a relative or traversal path could influence which
   node9.config.json gets loaded. Fixed with path.isAbsolute() guard;
   non-absolute cwd falls back to ambient process.cwd().
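
Both fixes can be sketched together: the audit line is written first, and `payload.cwd` is only trusted when it is an absolute path. `safeHookCwd`, `handleLogHook`, and the injected `deps` object are hypothetical names used to keep the example self-contained.

```typescript
import { isAbsolute } from "node:path";

// Returns the cwd only when it is a non-empty absolute path; otherwise
// undefined, which lets the caller fall back to the ambient directory.
function safeHookCwd(cwd: unknown): string | undefined {
  if (typeof cwd !== "string" || cwd === "") return undefined;
  return isAbsolute(cwd) ? cwd : undefined;
}

interface LogDeps {
  appendAudit: (line: string) => void;
  loadConfig: (cwd?: string) => unknown;
}

function handleLogHook(auditLine: string, payloadCwd: unknown, deps: LogDeps): void {
  // 1. Audit write happens unconditionally, before anything that can throw.
  deps.appendAudit(auditLine);
  // 2. Config load may fail (corrupt JSON, permissions); the audit entry
  //    is already on disk by then.
  try {
    deps.loadConfig(safeHookCwd(payloadCwd));
  } catch (err) {
    process.stderr.write(`[log] config load failed: ${String(err)}\n`);
  }
}
```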

Also:
- Add integration test proving audit.log is written even when global
  config.json is corrupt JSON (regression test for the ordering fix)
- Add comment on echo tests noting Linux/macOS assumption

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): add audit-write ordering and path validation rules

* test: skip echo proxy tests on Windows; clarify exit-0 contract

- itUnix = it.skipIf(process.platform === 'win32') applied to both proxy
  echo tests — Windows echo is a shell builtin and cannot be spawned
  directly, so these tests would fail with a spawn error instead of
  skipping cleanly
- corrupt-config test: add comment documenting that exit(0) is the
  correct exit code even on config error — the log command always exits 0
  so Claude/Gemini do not treat an already-completed tool call as failed;
  the audit write precedes getConfig() so it succeeds regardless

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: CODEOWNERS for CLAUDE.md; parseAuditLog helper; getConfig unit tests

- .github/CODEOWNERS: require @node9-ai/maintainers review on CLAUDE.md
  and security-critical source files — prevents untrusted PRs from
  silently weakening AI instruction rules or security invariants
- mcp.integration.test.ts: replace inline JSON.parse().map() with
  parseAuditLog() helper that throws a descriptive error when a log line
  is not valid JSON (e.g. a debug line or partial write), instead of an
  opaque SyntaxError with no context
- mcp.integration.test.ts: itUnix declaration moved after imports for
  correct ordering
- core.test.ts: add getConfig unit tests verifying that a nonexistent
  explicit cwd does not throw (tryLoadConfig fallback), and that
  getConfig(cwd) does not pollute the ambient no-arg cache
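
The `parseAuditLog()` helper described above might look like this sketch, assuming one JSON object per line. The error wording is illustrative.

```typescript
// Parses a newline-delimited JSON log. A line that is not valid JSON raises
// an error that names the entry and shows the offending text.
function parseAuditLog(content: string): Record<string, unknown>[] {
  return content
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line, index) => {
      try {
        return JSON.parse(line) as Record<string, unknown>;
      } catch {
        throw new Error(`audit.log entry ${index + 1} is not valid JSON: ${line}`);
      }
    });
}
```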

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): add npm run lint to PR checklist and pre-commit hook

Adds ESLint step to CLAUDE.md checklist and .git/hooks/pre-commit so
require()-style imports and other lint errors are caught before push.
Also fixes the require('path')/require('os') inline calls in core.test.ts
that triggered @typescript-eslint/no-require-imports in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: emit shields-status on SSE connect — dashboard no longer stuck on Loading

The shields-status event was only broadcast on toggle (POST /shields/toggle).
A freshly connected dashboard never received the current shields state and
displayed "Loading…" indefinitely.

Fix: send shields-status in the GET /events initial payload alongside init
and decisions, using the same payload shape as the toggle handler.

Regression test: daemon.integration.test.ts starts a real daemon with an
isolated HOME, connects to /events, and asserts shields-status is present
with the correct active state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — shared SSE snapshot, ctx.skip() for visible skips

- Capture SSE stream once in beforeAll and share across all three tests
  instead of opening 3 separate 1.5s connections (~4.5s → ~1.5s wall time)
- Replace early return with ctx.skip() so port-conflict skips are visible
  in the Vitest report rather than silently passing
- Add comment explaining why it.skipIf cannot be used here (condition
  depends on async beforeAll, evaluated after test collection)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — bump SSE timeout, guard payload undefined, structural shield check

- Bump readSseStream timeout 1500ms → 3000ms for slow CI headroom
- Assert payload defined before accessing .shields — gives a clear failure
  message if shields-status is absent rather than a TypeError on .shields
- Replace hardcoded postgres check with structural loop over all shields
  so the test survives adding or renaming shields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: log last waitForDaemon error to stderr for CI diagnostics

Silent catch{} meant a crashed daemon (e.g. EACCES on port) produced only
"did not start within 6s" with no hint of the root cause. Now the last
caught error is written to stderr so CI logs show the actual failure reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass flags through to wrapped command — prevent Commander from consuming -y, --config etc.

Commander parsed flags like -y and --config as node9 options and errored
with "unknown option" before the proxy action handler ran. This broke all
MCP server configurations that pass flags to the wrapped binary (npx -y,
binaries with --nexus-url, etc.).

Fix: before program.parse(), detect proxy mode (first arg is not a known
node9 subcommand and doesn't start with '-') and inject '--' into process.argv.
This causes Commander to stop option-parsing and pass everything — including
flags — through to the variadic [command...] action handler intact.
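
The detection rule above can be sketched as a pure function over `argv`. `injectProxySeparator` is a hypothetical name, and the subcommand set is passed in rather than read from Commander so the example stays self-contained.

```typescript
// Returns a copy of argv with "--" inserted before the wrapped command when
// the first user argument is neither a flag nor a known subcommand.
function injectProxySeparator(
  argv: readonly string[],
  knownSubcommands: ReadonlySet<string>
): string[] {
  const first = argv[2]; // argv[0] is the runtime, argv[1] is the script
  if (first === undefined || first.startsWith("-") || knownSubcommands.has(first)) {
    return [...argv];
  }
  return [...argv.slice(0, 2), "--", ...argv.slice(2)];
}
```

For example, `node9 npx -y some-server` becomes `node9 -- npx -y some-server`, so `-y` is no longer parsed as a node9 option.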

The user-visible '--' workaround still works and is now redundant but harmless.

Regression tests: two new itUnix cases in mcp.integration.test.ts verify
that -n is not consumed as a node9 flag, and that --version reaches the
wrapped command rather than printing node9's own version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: derive proxy subcommand set from program.commands; harden test assertions

- Replace hand-maintained KNOWN_SUBCOMMANDS allowlist with a set derived
  from program.commands.map(c => c.name()) — stays in sync automatically
  when new subcommands are added, eliminating the latent sync bug
- Remove fragile echo stdout assertion in flag pass-through test — echo -n
  and echo --version behaviour varies across platforms (GNU vs macOS);
  the regression being tested is node9's parser, not echo's output
- Add try/finally in daemon.integration.test.ts beforeAll so tmpHome is
  always cleaned up even if daemon startup throws

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard against double '--' injection; strengthen --version test assertion

- Skip '--' injection if process.argv[2] is already '--' to avoid
  producing ['--', '--', ...] when user explicitly passes the separator
- Add toBeTruthy() assertion on stdout in --version test so the check
  fails if echo exits non-zero with empty output rather than silently passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — alias gap comment, res error-after-destroy guard, echo comment

- cli.ts: document alias gap (no aliases currently, but note how to extend)
- daemon.integration.test.ts: settled flag prevents res 'error' firing reject
  after Promise already resolved via req.destroy() timeout path
- mcp.integration.test.ts: fix comment — /bin/echo handles --version, not GNU echo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent daemon crash on unhandled rejection — node9 tail disconnect with 2 agents

Two concurrent Claude instances fire overlapping hook calls. Any unhandled
rejection in the async request handler crashes the daemon (Node 15+ default),
which closes all SSE connections and exits node9 tail with "Daemon disconnected".

- Add process.on('unhandledRejection') so a single bad request never kills the daemon
- Wrap GET /settings and GET /slack-status getGlobalSettings() calls in try/catch
  (were the only routes missing error guards in the async handler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — return in catch blocks, log errors, guard unhandledRejection registration

- GET /settings and /slack-status catch blocks now return after writeHead(500)
  to prevent fall-through to subsequent route handlers (write-after-end risk)
- Log the actual error to stderr in both catch blocks — silent swallow is
  dangerous in a security daemon
- Guard unhandledRejection registration with listenerCount === 0 to prevent
  double-registration if startDaemon() is called more than once (tests/restarts)
- Move handler registration before server.listen() for clearer startup ordering

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): require manual diff review before every commit

Automated checks (lint, typecheck, tests) don't catch logical correctness
issues like missing return after res.end(), silent catch blocks, or
double event-listener registration. Explicitly require git diff review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — 500 responses, module-level rejection flag, override cli.ts exit handler

- Separate res.writeHead(500) and res.end() calls (non-idiomatic chaining)
- Add Content-Type: application/json and JSON body to 500 responses
- Replace listenerCount guard with module-level boolean flag (race-safe)
- Call process.removeAllListeners('unhandledRejection') before registering
  daemon handler — cli.ts registers a handler that calls process.exit(1),
  which was the actual crash source; this overrides it for the daemon process
- Document that critical approval path (POST /check) has its own try/catch
  and is not relying on this safety net
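
The module-level flag from the third bullet can be sketched like this. `installRejectionHandler` and the log format are assumptions; the point is only that the handler is registered at most once per module instance and logs instead of exiting.

```typescript
// Module-level flag: set synchronously before registration, so repeated
// startDaemon()-style calls cannot register a second listener.
let rejectionHandlerInstalled = false;

function installRejectionHandler(
  log: (message: string) => void = (m) => process.stderr.write(m + "\n")
): boolean {
  if (rejectionHandlerInstalled) return false;
  rejectionHandlerInstalled = true;

  process.on("unhandledRejection", (reason: unknown) => {
    // Log and keep running: one bad request should not take the daemon down.
    const detail = reason instanceof Error ? (reason.stack ?? reason.message) : String(reason);
    log(`[daemon] unhandled rejection: ${detail}`);
  });
  return true;
}
```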

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove removeAllListeners — use isDaemon guard in cli.ts handler instead

removeAllListeners('unhandledRejection') was a blunt instrument that could
strip handlers registered by third-party deps. The correct fix:
- cli.ts handler now returns early (no-op) when process.argv[2] === 'daemon',
  leaving the rejection to the daemon's own keep-alive handler
- daemon/index.ts no longer needs removeAllListeners
- daemon handler now logs stack trace so systematic failures are visible

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify unhandledRejection listener interaction — both handlers fire independently

The previous comment implied listener-chain semantics (one handler deferring
to the next). Node.js fires all registered listeners independently. The
isDaemon no-op return in cli.ts is what prevents process.exit(1), not any
chain mechanism. Clarify this so future maintainers don't break it by
restructuring the handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate unhandledRejection ordering dependency — skip cli.ts handler for daemon mode

Instead of relying on listener registration order (fragile), skip registering
the cli.ts exit-on-rejection handler entirely when process.argv[2] === 'daemon'.
The daemon's own keep-alive handler in startDaemon() is then the only handler
in the process — no ordering dependency, no removeAllListeners needed.

Also update stale comment in daemon/index.ts that still described the old
"we must replace the cli.ts handler" approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: address review comments — argv load-time note, hung-connection limit, stack trace caveat

- cli.ts: note that process.argv[2] check fires at module load time intentionally
- daemon/index.ts: document hung-connection limitation of last-resort rejection handler
- daemon/index.ts: note stack trace may include user input fragments (acceptable
  for localhost-only stderr logging)
- daemon/index.ts: clarify jest.resetModules() behavior with the module-level flag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Safe by Default — advisory SQL rules block destructive ops without config

Adds review-drop-table-sql, review-truncate-sql, and review-drop-column-sql
to ADVISORY_SMART_RULES so DROP TABLE, TRUNCATE TABLE, and DROP COLUMN in
the `sql` field are gated by human approval out-of-the-box, with no shield
or config required. The postgres shield correctly upgrades these from review
→ block since shield rules are inserted before advisory rules in getConfig().

Includes 7 new tests: 4 verifying advisory review fires with no config, 3
verifying the postgres shield overrides to block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: shield set/unset — per-rule verdict overrides + config show

- `node9 shield set <shield> <rule> <verdict>` — override any shield rule's
  verdict without touching config.json. Stored in shields.json under an
  `overrides` key, applied at runtime in getConfig(). Accepts full rule
  name, short name, or operation name (e.g. "drop-table" resolves to
  "shield:postgres:block-drop-table").

- `node9 shield unset <shield> <rule>` — remove an override, restoring
  the shield default.

- `node9 shield status` — now shows each rule's verdict individually,
  with override annotations ("← overridden (was: block)").

- `node9 config show` — new command: full effective runtime config
  including active shields with per-rule verdicts, built-in rules,
  advisory rules, and dangerous words.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — allow verdict guard, null assertion, test reliability

- shield set allow now requires --force to prevent silent rule silencing;
  exits 1 with a clear warning and the exact re-run command otherwise
- Remove getShield(name)! non-null assertion in error branch
- Fix mockReturnValue → mockReturnValueOnce to prevent test state leak
- Add missing tests: shield set allow guard (integration), unset no-op,
  mixed-case SQL matching (DROP table, drop TABLE, TRUNCATE table)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — shield override security hardening

- Add isShieldVerdict() type guard; replace manual triple-comparison in
  CLI set command and remove unsafe `verdict as ShieldVerdict` cast
- Add validateOverrides() to sanitize shields.json on read — tampered
  disk content with non-ShieldVerdict values is silently dropped before
  reaching the policy engine
- Fix clearShieldOverride() to be a true no-op (skip disk write) when
  the rule has no existing override
- Add comment to resolveShieldRule() documenting first-match behavior
  for operation-suffix lookup to warn against future naming conflicts
- Tests: fix no-op assertion (assert not written), add isShieldVerdict
  suite, add schema validation tests for tampered overrides, add
  authorizeHeadless test for shield-overridden allow verdict
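
A rough sketch of the type guard and the read-time sanitizer, assuming the three verdicts are 'allow', 'review', and 'block' and that overrides are stored as shield → rule → verdict. The function names come from this commit; the bodies and the storage shape are assumptions.

```typescript
const SHIELD_VERDICTS = ["allow", "review", "block"] as const; // assumed verdict set
type ShieldVerdict = (typeof SHIELD_VERDICTS)[number];

function isShieldVerdict(value: unknown): value is ShieldVerdict {
  return typeof value === "string" && (SHIELD_VERDICTS as readonly string[]).includes(value);
}

type Overrides = Record<string, Record<string, ShieldVerdict>>;

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Keep only well-formed entries; anything else read from disk is dropped.
function validateOverrides(raw: unknown): Overrides {
  // Null-prototype objects, so a crafted "__proto__" key cannot reach Object.prototype.
  const clean: Overrides = Object.create(null);
  if (!isPlainObject(raw)) return clean;
  for (const [shield, rules] of Object.entries(raw)) {
    if (!isPlainObject(rules)) continue;
    for (const [rule, verdict] of Object.entries(rules)) {
      if (!isShieldVerdict(verdict)) continue; // tampered or unknown verdict
      if (!(shield in clean)) clean[shield] = Object.create(null);
      clean[shield][rule] = verdict;
    }
  }
  return clean;
}
```

Validating on read means the policy engine only ever sees values that pass the same guard the CLI uses on write.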

Note: issue #5 (shield status stdout vs stderr) cannot be fixed here —
the pre-commit hook enforces no new console.log in cli.ts to keep stdout
clean for the JSON-RPC/MCP hook code paths in the same file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — audit trail, tamper warning, trust boundary

- Export appendConfigAudit() from core.ts; call it from CLI when an allow
  override is written with --force so silenced rules appear in audit.log
- validateOverrides() now emits a stderr warning (with shield/rule detail)
  when an invalid verdict is dropped, making tampering visible to the user
- Add JSDoc to writeShieldOverride() documenting the trust boundary: it is
  a raw storage primitive with no allow guard; callers outside the CLI must
  validate rule names via resolveShieldRule() first; daemon does not expose
  this endpoint
- Tests: add stderr-warning test for tampered verdicts; add cache-
  invalidation test verifying _resetConfigCache() causes allow overrides
  to be re-read from disk (mock) on the next evaluatePolicy() call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: close remaining review gaps — first-match, allow-no-guard, TOCTOU

- Issue 5: add test proving resolveShieldRule first-match-wins behavior
  when two rules share an operation suffix; uses a temporary SHIELDS
  mutation (restored in finally) to simulate the ambiguous catalog case
- Issue 6: add explicit test documenting that writeShieldOverride accepts
  allow verdict without any guard — storage primitive contract, CLI is
  the gatekeeper
- Issue 8: add TOCTOU characterization test showing that concurrent
  writeShieldOverride calls with a stale read lose the first write; makes
  the known file-lock limitation explicit and regression-testable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: spawn daemon via process.execPath to fix ENOENT on Windows (#41)

spawn('node9', ...) fails on Windows because npm installs a .cmd shim,
not a bare executable. Node.js child_process.spawn without { shell: true }
cannot resolve .cmd/.ps1 wrappers.

Replace all three bare spawn('node9', ['daemon'], ...) call sites in
cli.ts with spawn(process.execPath, [process.argv[1], 'daemon'], ...),
consistent with the pattern already used in src/tui/tail.ts:
  - autoStartDaemonAndWait()
  - daemon --openui handler
  - daemon --background handler
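
The replacement pattern, sketched under assumptions: daemonSpawnArgs and spawnDaemon are hypothetical helper names, and the detached/stdio options are illustrative rather than the options the three call sites actually pass.

```typescript
import { spawn, ChildProcess } from "node:child_process";

// Build the command and arguments without relying on a PATH lookup for "node9".
// execPath is the absolute path of the running Node binary; scriptPath is the CLI script.
function daemonSpawnArgs(execPath: string, scriptPath: string): [string, string[]] {
  return [execPath, [scriptPath, "daemon"]];
}

// Hypothetical wrapper; the spawn options here are assumptions.
function spawnDaemon(): ChildProcess {
  const [cmd, args] = daemonSpawnArgs(process.execPath, process.argv[1]);
  const child = spawn(cmd, args, { detached: true, stdio: "ignore" });
  child.unref(); // let the parent exit without waiting for the daemon
  return child;
}
```

Since the command is the Node executable itself, no .cmd shim needs to be resolved, which is why the same call behaves identically on Windows and Unix.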

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(ci): regression guard + Windows CI for spawn fix (#41)

- Add spawn-windows.test.ts: two static source-guard tests that read
  cli.ts and assert (a) no bare spawn('node9'...) pattern exists and
  (b) exactly 3 spawn(process.execPath, ...) daemon call sites exist.
  Prevents the ENOENT regression from silently reappearing.

- Add .github/workflows/ci.yml: runs typecheck, lint, and npm test on
  both ubuntu-latest and windows-latest on every push/PR to main and dev.
  The Windows runner will catch any spawn('node9'...) regression
  immediately since it would throw ENOENT in integration tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step before tests — integration tests require dist/cli.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): remove NODE_ENV=test prefix from npm scripts — Windows compat

'NODE_ENV=test cmd' syntax is Unix-only and fails on Windows with
'not recognized as an internal or external command'.

Vitest sets NODE_ENV=test automatically when the variable is not already
set (and separately sets process.env.VITEST), making the prefix redundant.
Remove it from the test, test:watch, and test:ui scripts so they work on
all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): use cross-platform path assertions in undo.test.ts

Replace hardcoded Unix path separators with path.join() and regex
/[/\\]\.git[/\\]/ so assertions pass on Windows CI.
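
For illustration, the separator-agnostic regex from this commit behaves as follows. The isInsideGitDir wrapper is a hypothetical name used only for this sketch.

```typescript
// Matches a ".git" path segment delimited by either "/" or "\".
const GIT_DIR_SEGMENT = /[/\\]\.git[/\\]/;

function isInsideGitDir(filePath: string): boolean {
  return GIT_DIR_SEGMENT.test(filePath);
}
```

Requiring a separator on both sides keeps look-alike directories such as .github from matching.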

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): cross-platform path and HOME fixes for Windows CI

setup.test.ts: replace hardcoded /mock/home/... constants with
path.join(os.homedir(), ...) so path comparisons match on Windows.
doctor.test.ts: set USERPROFILE=homeDir alongside HOME so
os.homedir() resolves the isolated test directory on Windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): Windows HOME/USERPROFILE and EBUSY fixes

mcp.integration.test.ts: add makeEnv() helper that sets both HOME
and USERPROFILE so spawned node9 processes resolve os.homedir() to
the isolated test directory on Windows. Add EBUSY guard in cleanupDir
for Windows temp file locking after spawnSync.

protect.test.ts: use path.join(os.homedir(), ...) for mock paths in
setPersistentDecision so existsSpy matches on Windows backslash paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): propagate HOME as USERPROFILE in check integration tests

runCheck/runCheckAsync now set USERPROFILE=HOME so spawned node9
processes resolve os.homedir() to the isolated test directory on
Windows. Apply the same fix to standalone spawnSync calls using
minimalEnv. Add EBUSY guard in cleanupHome for Windows temp locking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests,dlp): four Windows CI fixes

mcp.integration.test.ts: use list_directory instead of write_file for
the no-cwd backward-compat test. write_file triggers git add -A on
os.tmpdir(), which can index thousands of files on Windows and fail with
ETIMEDOUT.

gemini_integration.test.ts: add path import; replace hardcoded
/mock/home/... paths with path.join(os.homedir(), ...) so existsSpy
matches on Windows backslash paths.

daemon.integration.test.ts: add USERPROFILE=tmpHome to daemon spawn
env so os.homedir() resolves to the isolated shields.json. Add EBUSY
guard in cleanupDir.

dlp.ts: broaden /etc/passwd|shadow|sudoers patterns to
^(?:[a-zA-Z]:)?\/etc\/... so they match Windows-normalized paths like
C:/etc/passwd in addition to Unix /etc/passwd.
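
A sketch of how the broadened pattern might behave. The commit elides the tail of the regex, so the `(?:passwd|shadow|sudoers)$` ending below is an assumption, as are the helper name and the backslash normalization step.

```typescript
// Optional drive-letter prefix, then /etc/<file>. The exact tail is assumed.
const SENSITIVE_ETC_FILE = /^(?:[a-zA-Z]:)?\/etc\/(?:passwd|shadow|sudoers)$/;

// Hypothetical helper: normalize Windows separators before matching.
function isSensitiveEtcPath(filePath: string): boolean {
  const normalized = filePath.replace(/\\/g, "/");
  return SENSITIVE_ETC_FILE.test(normalized);
}
```

The optional `(?:[a-zA-Z]:)?` group is what lets C:/etc/passwd match while the `^` anchor still rejects paths that merely contain /etc/passwd further down.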

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): address code review findings

ci.yml: add format:check step and Node 22 to matrix (package.json
declares >=18 — both LTS versions should be covered).

check/mcp/daemon integration tests: add makeEnv() helpers for
consistent HOME+USERPROFILE isolation; add console.warn on EBUSY
so temp dir leaks are visible rather than silent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): enforce LF line endings so Prettier passes on Windows

Add endOfLine: lf to .prettierrc so Prettier always checks/writes LF
regardless of OS. Add .gitattributes with eol=lf so Git does not
convert line endings on Windows checkout. Without these, format:check
fails on every file on Windows CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): align makeEnv signatures and add dist verification

check.integration.test.ts: makeEnv now spreads process.env (same as
mcp and daemon helpers) so PATH, NODE_ENV=test (set by Vitest), and
other inherited vars reach spawned child processes. Standalone
spawnSync calls simplified to makeEnv(tmpHome, {NODE9_TESTING:'1'}).
Remove unused minimalEnv from shield describe block.
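
A minimal sketch of a makeEnv() helper with the signature used above. The body is an assumption based on the description in this and the earlier test commits (spread process.env, set both HOME and USERPROFILE).

```typescript
// Build a child-process environment whose home directory points at an isolated
// test directory on both Unix (HOME) and Windows (USERPROFILE).
function makeEnv(
  home: string,
  extra: Record<string, string> = {}
): Record<string, string | undefined> {
  return {
    ...process.env, // keep PATH and other inherited variables
    HOME: home,
    USERPROFILE: home,
    ...extra, // caller overrides win, e.g. { NODE9_TESTING: "1" }
  };
}
```

A test would then pass the result as the env option, for example spawnSync(cmd, args, { env: makeEnv(tmpHome, { NODE9_TESTING: '1' }) }).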

ci.yml: add Verify dist artifacts step after build to fail fast with
a clear message if dist/cli.js or dist/index.js are missing. Add
comment explaining NODE_ENV=test / NODE9_TESTING guard coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: interactive terminal approval via /dev/tty (SSE + [A]/[D])

Replaces the broken @inquirer/prompts stdin racer with a /dev/tty-based
approval prompt that works as a Claude Code PreToolUse subprocess:

- New src/ui/terminal-approval.ts: opens /dev/tty for raw keypress I/O,
  acquires CSRF token from daemon SSE, renders ANSI approval card, reads
  [A]/[D], posts decision via POST /decision/{id}. Handles abort (another
  racer won) with cursor/card cleanup and SIGTERM/exit guard.

- Daemon entry shared between browser (GET /wait) and terminal (POST /decision)
  racers: extract registerDaemonEntry() + waitForDaemonDecision() from the
  old askDaemon() so both racers operate on the same pending entry ID.

- POST /decision idempotency: first write wins; second call returns 409
  with the existing decision. Prevents race between browser and terminal
  racers from corrupting state.

- CSRF token emitted on every SSE connection (re-emit existing token, never
  regenerate). Terminal racer acquires it by opening /events and reading
  the first csrf event.

- approvalTimeoutSeconds user-facing config alias (converts to ms);
  raises default timeout from 30s to 120s. Daemon auto-deny timer and
  browser countdown now use the config value instead of a hardcoded constant.

- isTTYAvailable() probe: tries /dev/tty open(); disabled on Windows
  (native popup racer covers that path). NODE9_FORCE_TERMINAL_APPROVAL=1
  bypasses the probe for tmux/screen users.

- Integration tests: CSRF re-emit across two connections, POST /decision
  idempotency (both allow-first and deny-first cases).
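
The "first write wins" behavior of POST /decision can be sketched with an in-memory map. The types, the postDecision name, and the 404 branch are assumptions for illustration; only the 409-with-existing-decision behavior comes from this commit.

```typescript
type Decision = "allow" | "deny";

interface PendingEntry {
  id: string;
  decision?: Decision;
}

interface DecisionResponse {
  status: 200 | 404 | 409;
  decision?: Decision;
}

const pending = new Map<string, PendingEntry>();

// First write wins; a later write gets 409 plus the decision already recorded.
function postDecision(id: string, decision: Decision): DecisionResponse {
  const entry = pending.get(id);
  if (!entry) return { status: 404 }; // assumed behavior for unknown ids
  if (entry.decision !== undefined) {
    return { status: 409, decision: entry.decision };
  }
  entry.decision = decision;
  return { status: 200, decision };
}
```

Whichever racer posts second can treat the 409 as "someone else already decided" and read the winning decision from the response body.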

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Smart Router — node9 tail as interactive approval terminal

Implements a multi-phase Smart Router architecture so `node9 tail` can
serve as a full approval channel alongside the browser dashboard and
native popup.

Phase 1 — Daemon capability tracking (daemon/index.ts):
- SseClient interface tracks { res, capabilities[] } per SSE connection
- /events parses ?capabilities=input from URL; stored on each client
- broadcast() updated to use client.res.write()
- hasInteractiveClient() exported — true when any tail session is live
- broadcast('add') now fires when terminal approver is enabled and an
  interactive client is connected, not only when browser is enabled

Phase 2 — Interactive approvals in tail (tui/tail.ts):
- Connects with ?capabilities=input so daemon identifies it as interactive
- Captures CSRF token from the 'csrf' SSE event
- Handles init.requests (approvals pending before tail connected)
- Handles add/remove SSE events; maintains an approval queue
- Shows one ANSI card at a time ([A] Allow / [D] Deny) using
  tty.ReadStream raw-mode keypress on fd 0
- POSTs decisions via /decision/{id} with source:'terminal'; a 409 response is not treated as an error
- Cards clear themselves; next queued request shown automatically

Phase 3 — Racer 3 widened (core.ts):
- Racer 3 guard changed from approvers.browser to
  (approvers.browser || approvers.terminal) so tail participates in the
  race via the same waitForDaemonDecision mechanism as the browser
- Guidance printed to stderr when browser is off:
  "Run `node9 tail` in another terminal to approve."

Phase 4 — node9 watch command (cli.ts):
- New `watch <command> [args...]` starts daemon in NODE9_WATCH_MODE=1
  (no idle timeout), prints a tip about node9 tail, then runs the
  wrapped command via spawnSync

Decision source tracking (all layers):
- POST /decision now accepts optional source field ('browser'|'terminal')
- Daemon stores decisionSource on PendingEntry; GET /wait returns it
- waitForDaemonDecision returns { decision, source }
- Racer 3 label uses actual source instead of guessing from config:
  "User Decision (Terminal (node9 tail))" vs "User Decision (Browser Dashboard)"
- Browser UI sends source:'browser'; tail sends source:'terminal'

Tests:
- daemon.integration.test.ts: 3 new tests for source tracking round-trip
  (terminal, browser, and omitted source)
- spawn-windows.test.ts: updated count from 3 to 4 spawn call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: enable /dev/tty approval card in Claude terminal (hook path)

The check command was passing allowTerminalFallback=false to
authorizeHeadless, which disabled Racer 4 (/dev/tty) in the hook path.
This meant the approval card only appeared in the node9 tail terminal,
requiring the user to switch focus to respond.

Change both call sites (initial + daemon-retry) to true so Racer 4 runs
alongside Racer 3. The [A]/[D] card now appears in the Claude terminal
as well — the user can respond from either terminal, whichever has focus.
The 409 idempotency already handles the race correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent background authorizeHeadless from overwriting POST /decision

When POST /decision arrives before GET /wait connects, it sets
earlyDecision on the PendingEntry. The background authorizeHeadless
call (which runs concurrently) could then overwrite that decision in
its .then() handler — visible as the idempotency test getting
'allow' back instead of the posted 'deny'.

Guard: after noApprovalMechanism check, return early if earlyDecision
is already set. First write wins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route sendBlock terminal output to /dev/tty instead of stderr

Claude Code treats any stderr output from a PreToolUse hook as a hook
error and fails open — the tool proceeds even when the hook writes a
valid permissionDecision:deny JSON to stdout. This meant git push and
other blocked commands were silently allowed through.

Fix: replace all console.error calls in the block/deny path with
writes to /dev/tty, an out-of-band channel that bypasses Claude Code's
stderr pipe monitoring. /dev/tty failures are caught silently so CI
and non-interactive environments are unaffected.

Add a writeTty() helper in core.ts used for all status messages in
the hook execution path (cloud error, waiting-for-approval banners,
cloud result). Update two integration tests that previously asserted
block messages appeared on stderr — they now assert stderr is empty,
which is the regression guard for this bug.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: don't auto-resolve daemon entries in audit mode

In audit mode, background authorizeHeadless resolves immediately with
checkedBy:'audit'. The .then() handler was setting earlyDecision='allow'
before POST /decision could arrive from browser/tail, causing subsequent
POST /decision calls to get 409 and GET /wait to return 'allow' regardless
of what the user posted.

Audit mode means the hook auto-approves — it doesn't mean the daemon
dashboard should also auto-resolve. Leave the entry alive so browser/tail
can still interact with it (or the auto-deny timer fires).

Fixes source-tracking integration test failures on CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: close /dev/tty fd in finally block to prevent leak on write error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove Racer 4 (/dev/tty card in Claude terminal)

Racer 4 interrupted the AI's own terminal with an approval prompt,
which is wrong on multiple levels:
- The AI terminal belongs to the AI agent, not the human approver
- Different AI clients (Gemini CLI, Cursor, etc.) handle terminals
  differently — /dev/tty tricks are fragile across environments
- It created duplicate prompts when node9 tail was also running

Approval channels should all be out-of-band from the AI terminal:
  1. Cloud/SaaS (Slack, mission control)
  2. Native OS popup
  3. Browser dashboard
  4. node9 tail (dedicated approval terminal)

Remove: Racer 4 block in core.ts, allowTerminalFallback parameter
from authorizeHeadless/_authorizeHeadlessCore and all callers,
isTTYAvailable/askTerminalApproval imports, terminal-approval.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: make hook completely silent — remove all writeTty calls from core.ts

The hook must produce zero terminal output in the Claude terminal.
All writeTty status messages (shadow mode, cloud handshake failure,
waiting for approval, approved/denied via cloud) have been removed.
Also removed the now-unused chalk import and writeTty helper function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source allowlist, CSRF 403 tests, watch error handling

- daemon/index.ts: validate POST /decision source field against allowlist
  ('terminal' | 'browser' | 'native') — silently drop invalid values to
  prevent audit log injection
- daemon.integration.test.ts: add CSRF 403 test (missing token), CSRF 403
  test (wrong token), and invalid source value test — the three most
  important negative tests flagged by code review
- cli.ts: check result.error in node9 watch so ENOENT exits non-zero
  instead of silently exiting 0
- test helper: use fixed string 'echo register-label' instead of
  interpolated echo ${label} (shell injection hygiene in test code)
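
A sketch of the source allowlist check. The allowed values are the three listed above and VALID_SOURCES is the name used in a later commit; sanitizeSource is a hypothetical helper name.

```typescript
const VALID_SOURCES = ["terminal", "browser", "native"] as const;
type DecisionSource = (typeof VALID_SOURCES)[number];

// Return the source only if it is an allowlisted string; otherwise drop it.
function sanitizeSource(value: unknown): DecisionSource | undefined {
  return typeof value === "string" && (VALID_SOURCES as readonly string[]).includes(value)
    ? (value as DecisionSource)
    : undefined;
}
```

Dropping unknown values instead of echoing them back means attacker-controlled text from the request body never reaches the audit log.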

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove stderr write from askNativePopup; drop sendDesktopNotification

- native.ts: process.stderr.write in askNativePopup caused Claude Code to
  treat the hook as an error and fail open — removed entirely
- core.ts: sendDesktopNotification called notify-send which routes through
  Firefox on Linux (D-Bus handler), causing spurious browser popups —
  removed the audit-mode notification call and unused import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass cwd through hook→daemon so project config controls browser open

Root cause: daemon called getConfig() without cwd, reading the global
config. If ~/.node9/node9.config.json doesn't exist, approvers default
to true — so browser:false in a project config was silently ignored,
causing the daemon to open Firefox on every pending approval.

Fix:
- cli.ts: pass cwd from hook payload into authorizeHeadless options
- core.ts: propagate cwd through _authorizeHeadlessCore → registerDaemonEntry
  → POST /check body; use getConfig(options.cwd) so project config is read
- daemon/index.ts: extract cwd from POST /check, call getConfig(cwd)
  for browserEnabled/terminalEnabled checks
- native.ts: remove process.stderr.write from askNativePopup (fail-open bug)
- core.ts: remove sendDesktopNotification (notify-send routes through Firefox
  on Linux via D-Bus, causing spurious browser notifications)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always broadcast 'add' when terminalEnabled — restore tail visibility

After the cwd fix, browserEnabled correctly became false when browser:false
is set in project config. But the broadcast condition gated on
hasInteractiveClient(), which returns false if tail isn't connected at the
exact moment the check arrives — silently dropping entries from tail.

Fix: broadcast whenever browserEnabled OR terminalEnabled, regardless of
client connection state. Tail sees pending entries via the SSE stream's
initial state when it connects, so timing of connection doesn't matter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: hybrid security model — local UI always wins the race

- Remove isRemoteLocked: terminal/browser/native racers always participate
  even when approvers.cloud is enabled; cloud is now audit-only unless it
  responds first (headless VM fallback)
- Add decisionSource field to AuthResult so resolveNode9SaaS can report
  which channel decided (native/terminal/browser) as decidedBy in the PATCH
- Fix resolveNode9SaaS: log errors to hook-debug.log instead of silent catch
- Fix tail [A]/[D] keypresses: switch from raw 'data' buffer to readline
  emitKeypressEvents + 'keypress' events — fixes unresponsive cards
- Fix tail card clear: SAVE/RESTORE cursor instead of fragile MOVE_UP(n)
- Add cancelActiveCard so 'remove' SSE event properly dismisses active card
- Fix daemon duplicate browser tab: browserOpened flag + NODE9_BROWSER_OPENED
  env so auto-started daemon and node9 tail don't both open a tab
- Fix slackDelegated: skip background authorizeHeadless to prevent duplicate
  cloud request that never resolves in Mission Control
- Add interactive field to SSE 'add' event so browser-only configs don't
  render a terminal card
- Add dev:tail script that parses JSON PID file correctly
- Add req.on('close') cleanup for abandoned long-poll entries
- Add regression tests for all three bugs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear CI env var in test to unblock native racer on GitHub Actions

Also make the poll fetch mock respond to AbortSignal so the cloud poll
racer exits cleanly when native wins, preventing test timeout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source injection tests, dev:tail safety, CI env guard

- Add explicit source boundary tests: null/number/object are all rejected
  by the VALID_SOURCES allowlist (implementation was already correct)
- Replace kill \$(...) shell expansion in dev:tail with process.kill() inside
  Node.js — removes \$() substitution vulnerability if pid file were crafted
- Add afterEach safety net in core.test.ts to restore VITEST/CI/NODE_ENV
  in case the test crashes before the try/finally block restores them
- Increase slackDelegated timing wait from 200ms to 500ms for slower CI
- Fix section numbering gap: 10 → 11 was left after removing a test (now 10)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use early return in it.each — Vitest does not pass context to it.each callbacks

Context injection via { skip } works in plain it() but not in it.each(),
where the third argument is undefined. Switch to early return, which is
equivalent since the entire describe block skips when portWasFree is false.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): correct YAML indentation in ci.yml — job properties were siblings not children

name/runs-on/strategy/steps were indented 2 spaces (sibling to `test:`)
instead of 4 spaces (properties of the `test:` job). GitHub Actions was
ignoring the custom name template, so checks were reported without the
Node version suffix and the required branch-protection check
"CI / Test (ubuntu-latest, Node 20)" was stuck as "Expected" forever.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add 15s timeout to daemon beforeAll hooks — prevents CI timeout

waitForDaemon(6s) + readSseStream(3s) = 9s minimum; the default Vitest
hookTimeout of 10s is too tight on slow CI runners (Ubuntu, Windows).
All three daemon describe-block beforeAll hooks now declare an explicit
15_000ms timeout to give CI sufficient headroom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add daemonProc.kill() fallback in afterAll cleanup

If `node9 daemon stop` fails or times out, the spawned daemon process
would leak. Added daemonProc?.kill() as a defensive fallback after
spawnSync in all three daemon describe-block afterAll hooks.

The CSRF 403 tests (missing/wrong token) already exist at lines 574-598
and were flagged as absent only because the bot's diff was truncated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): address code review — vi.stubEnv, runtime shape checks, abandon-timer comment

- core.test.ts: replace manual env save/delete/restore with vi.stubEnv +
  vi.unstubAllEnvs() in afterEach. Eliminates the fragile try/finally and
  the risk of coercing undefined to the string "undefined". Adds a KEEP IN
  SYNC comment so future isTestEnv additions are caught immediately.

- daemon.integration.test.ts: replace unchecked `as { ... }` casts in
  idempotency tests with `unknown` + toMatchObject — gives a clear failure
  message if the response shape is wrong instead of silently passing.

- daemon.integration.test.ts: add comment explaining why idempotency tests
  do not need a /wait consumer — the abandon timer only fires when an SSE
  connection closes with pending items; no SSE client connects during
  these tests so entries are safe from eviction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): guard daemonProc.kill() with exitCode check — avoid spurious SIGTERM

Calling daemonProc.kill() unconditionally after a successful `daemon stop`
sends SIGTERM to an already-dead process, which can produce a spurious error
log on some platforms. Only kill if exitCode === null (process still running).
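
The guard can be sketched as a small helper. killIfRunning and the KillableProcess interface are hypothetical names; the interface is a structural subset of Node's ChildProcess so that fakes can be used in tests.

```typescript
interface KillableProcess {
  exitCode: number | null;
  kill(): boolean;
}

// Send the default signal only when the process has not reported an exit code.
// Returns true if a kill was attempted and the signal was delivered.
function killIfRunning(proc: KillableProcess | undefined): boolean {
  if (!proc || proc.exitCode !== null) return false;
  return proc.kill();
}
```

In Node.js, exitCode also stays null for a process that was terminated by a signal, so a stricter guard might check signalCode as well.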

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add @vitest/coverage-v8 — baseline coverage report (PR #0)

Installs @vitest/coverage-v8 and configures coverage in vitest.config.mts.
Adds `npm run test:coverage` script.

Baseline (instrumentable files only — cli.ts and daemon/index.ts are
subprocess-only and cannot be instrumented by v8):

  File     Stmts    Branches
  Overall  67.68%   58.74%
  core.ts  62.02%   54.13%   ← primary refactor target
  undo.ts  87.01%   80.00%
  shields  97.46%   94.64%
  dlp.ts   94.82%   92.85%
  setup    93.67%   80.92%

This baseline will be used to verify coverage improves (or holds) after
each incremental refactor PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr1): extract src/audit/ and src/config/ from core.ts

Move audit helpers (redactSecrets, appendToLog, appendHookDebug,
appendLocalAudit, appendConfigAudit) to src/audit/index.ts and
move all config types, constants, and loading logic (Config,
SmartRule, DANGEROUS_WORDS, DEFAULT_CONFIG, getConfig, getCredentials,
getGlobalSettings, hasSlack, listCredentialProfiles) to
src/config/index.ts.

core.ts kept as barrel re-exporting from the new modules so all
existing importers (cli.ts, daemon/index.ts, tests) are unchanged.
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: remove trailing blank lines in core.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr2): extract src/policy/ and src/utils/regex from core.ts

Move the entire policy engine to src/policy/index.ts:
  evaluatePolicy, explainPolicy, shouldSnapshot, evaluateSmartConditions,
  checkDangerousSql, isIgnoredTool, matchesPattern and all private
  helpers (tokenize, getNestedValue, extractShellCommand, analyzeShellCommand).

Move ReDoS-safe regex utilities to src/utils/regex.ts:
  validateRegex, getCompiledRegex — no deps on config or policy,
  consumed by both policy/ and cli.ts via core.ts barrel.

core.ts is now ~300 lines (auth + daemon I/O only).
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: restore dev branch to push trigger

CI should run on direct pushes to dev (merge commits, dependency
bumps, etc.), not just on PRs. Flagged by two independent code
review passes on the coverage PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add timeouts to execSync and spawnSync to prevent CI hangs

- doctor command: add timeout:3000 to execSync('which node9') and
  execSync('git --version') — on slow CI machines these can block
  indefinitely and cause the 5000ms vitest test timeout to fire
- runDoctor test helper: add timeout:15000 to spawnSync so the subprocess
  has enough headroom on slow CI without hitting the vitest timeout
- removefrom test loop: increase spawnSync timeout 5000→15000 and add
  result.error assertion for better failure diagnostics on CI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract src/auth/ from core.ts

Split the authorization race engine out of core.ts into 4 focused modules:
- auth/state.ts  — pause, trust sessions, persistent decisions
- auth/daemon.ts — daemon PID check, entry registration, long-polling
- auth/cloud.ts  — SaaS handshake, poller, resolver, local-allow audit
- auth/orchestrator.ts — multi-channel race engine (authorizeHeadless)

core.ts is now a 40-line backwards-compat barrel. 509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — coverage thresholds + undici vuln

- vitest.config.mts: add coverage thresholds at current baseline (68%
  stmts, 58% branches, 66% funcs, 70% lines) so CI blocks regressions.
  Add json-summary reporter for CI integration. Exclude core.ts (barrel,
  no executable code) and ui/native.ts (OS UI, untestable in CI).
- package.json: force undici to ^7.24.0 via overrides to resolve 6
  high-severity vulnerabilities in dev deps (@semantic-release, @actions).
  Remaining 7 vulns are in npm-bundled packages (not fixable without
  upgrading npm itself) and dev-only tooling (eslint, handlebars).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: enforce coverage thresholds in CI pipeline

Add coverage step to CI workflow that runs vitest --coverage on
ubuntu/Node 22 only (avoids matrix cost duplication). Thresholds
configured in vitest.config.mts will fail the build if coverage drops
below baseline, closing the gap flagged in code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: fix double test run — merge coverage into single test step

Replace the two-step (npm test + npm run test:coverage) pattern with a
single conditional: ubuntu/Node 22 runs test:coverage (enforces
thresholds), all other matrix cells run npm test. No behaviour change,
half the execution time on the primary matrix cell.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract proxy, negotiation, and duration from cli.ts

- src/proxy/index.ts — runProxy() MCP/JSON-RPC stdio interception
- src/policy/negotiation.ts — buildNegotiationMessage() AI block messages
- src/utils/duration.ts — parseDuration() human duration string parser
- cli.ts: 2088 → 1870 lines, now imports from focused modules

509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract shield, check, log commands into focused modules

Moves registerShieldCommand, registerConfigShowCommand, registerCheckCommand,
and registerLogCommand into src/cli/commands/. Extracts autoStartDaemonAndWait
and openBrowserLocal into src/cli/daemon-starter.ts.

cli.ts drops from ~1870 to ~1120 lines. Unused imports removed. Spawn
Windows regression test updated to cover the moved autoStartDaemonAndWait
call site in daemon-starter.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): use string comparison for matrix.node and document coverage intent

GHA expression matrix values are strings; matrix.node == 22 (integer) silently
fails, so coverage never ran on any cell. Fixed to matrix.node == '22'.

Added comments to ci.yml explaining the intentional single-cell threshold
enforcement (branch protection must require the ubuntu/Node 22 job), and
to vitest.config.mts explaining the baseline date and target trajectory.

Also confirmed: npm ls undici shows 7.24.6 everywhere — no conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): replace fragile matrix ternary with dedicated coverage job

Removes the matrix.node == '22' ternary from the test matrix. Coverage now
runs in a standalone 'coverage' job (ubuntu/Node 22 only) that can be
required by name in branch protection — no risk of the job name drifting
or the selector silently failing.

Also adds a comment to tsup.config.ts documenting why devDependency coverage
tooling (@vitest/coverage-v8, @rolldown/*) cannot leak into the production
bundle (tree-shaking — nothing in src/ imports them).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step and NODE9_TESTING=1 to coverage job; bump to v1.2.0

Coverage job was missing npm run build, causing integration tests to fail
with "dist/cli.js not found". Also adds NODE9_TESTING=1 env var to prevent
native popup dialogs and daemon auto-start during coverage runs in CI.

Version bumped to 1.2.0 to reflect the completed modular refactor
(core.ts + cli.ts split into focused single-responsibility modules).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add npm audit --omit=dev --audit-level=high to test job

Audits production deps on every CI run. Scoped to --omit=dev because
known CVEs in flatted (eslint chain) and handlebars (semantic-release chain)
are devDep-only and never ship in the production bundle. Production tree
currently shows 0 vulnerabilities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract doctor, audit, status, daemon, watch, undo commands

Moves 6 remaining large commands into src/cli/commands/:
  doctor.ts    — health check (165 lines, owns pass/fail/warn helpers)
  audit.ts     — audit log viewer with formatRelativeTime
  status.ts    — current mode/policy/pause display
  daemon-cmd.ts — daemon start/stop/openui/background/watch
  watch.ts     — watch mode subprocess runner
  undo.ts      — snapshot diff + revert UI

cli.ts: 1,141 → 582 lines. Unused imports (execSync, spawnSync, undo funcs,
getCredentials, DAEMON_PORT/HOST) removed. spawn-windows regression test
updated to cover the new module locations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(daemon): split 1026-line index.ts into state, server, barrel

daemon/state.ts  (360 lines) — all shared mutable state, types, utility
  functions, SSE broadcast, Flight Recorder Unix socket, and the
  abandonPending / hadBrowserClient / abandonTimer accessors needed to
  avoid direct ES module let-export mutation across file boundaries.

daemon/server.ts (668 lines) — startDaemon() HTTP server and all route
  handlers (/check, /wait, /decision, /events, /settings, /shields, etc.).
  Imports everything it needs from state.ts; no circular dependencies.

daemon/index.ts  (58 lines) — thin barrel: re-exports public API
  (startDaemon, stopDaemon, daemonStatus, DAEMON_PORT, DAEMON_HOST,
  DAEMON_PID_FILE, DECISIONS_FILE, AUDIT_LOG_FILE, hasInteractiveClient).

Also fixes two startup console.log → console.error (stdout must stay
clean for MCP/JSON-RPC per CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): handle ENOTEMPTY in cleanupDir on Windows CI

Windows creates system junctions (AppData\Local\Microsoft\Windows)
inside any directory set as USERPROFILE, making rmSync fail with
ENOTEMPTY even after recursive deletion. These junctions are harmless
to leak from a temp dir; treat them the same as EBUSY.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
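
The ENOTEMPTY handling described in the cleanupDir commit above can be sketched roughly as follows. This is a minimal illustration, not the project's test helper: the `remove` parameter, the boolean return value, and the `TOLERATED_CODES` name are assumptions introduced so the behaviour can be exercised without a real Windows junction.

```typescript
import * as fs from "node:fs";

// Error codes the commit treats as harmless when a temp dir cannot be fully removed.
const TOLERATED_CODES = new Set(["EBUSY", "ENOTEMPTY"]);

// Hypothetical injection point so the error path can be tested without Windows.
type RemoveFn = (dir: string) => void;

// Assumed default: recursive, forced removal via fs.rmSync.
const removeRecursively: RemoveFn = (dir) => {
  fs.rmSync(dir, { recursive: true, force: true });
};

// Returns true when the directory was removed, false when a tolerated
// error (EBUSY / ENOTEMPTY) left it behind. Any other error is rethrown.
function cleanupDir(dir: string, remove: RemoveFn = removeRecursively): boolean {
  try {
    remove(dir);
    return true;
  } catch (err) {
    const code = (err as { code?: string }).code;
    if (code !== undefined && TOLERATED_CODES.has(code)) return false;
    throw err;
  }
}
```

Rethrowing everything except the two tolerated codes keeps genuine cleanup failures (for example EACCES) visible in CI.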

* fix(ci): add NODE9_TESTING=1 to test job for consistency with coverage

Without it, spawned child processes in the test matrix could trigger
native popups or daemon auto-start. Matches the coverage job which
already set this env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): raise npm audit threshold to moderate

Node9 sits on the critical path of every agent tool call — a
moderate-severity prod vuln (e.g. regex DoS in a request parser)
is still exploitable in this context. 0 vulns at moderate level
confirmed before raising the bar.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add missing coverage for auth/state, timeout racer, and daemon unknown-ID

- auth-state.test.ts (new): 18 tests covering checkPause (all branches
  including expired file auto-delete and indefinite expiry), pauseNode9,
  resumeNode9, getActiveTrustSession (wildcard, prune, malformed JSON),
  writeTrustSession (create, replace, prune expired entries)
- core.test.ts: timeout racer test — approvalTimeoutMs:50 fires before any
  other channel, returns approved:false with blockedBy:'timeout'
- daemon.integration.test.ts: POST /decision with unknown UUID → 404
- vitest.config.mts: raise thresholds to match new baseline
  (statements 68→70, branches 58→60, functions 66→70, lines 70→71)

auth/state.ts coverage: 30% → 96% statements, 28% → 89% branches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin @vitest/coverage-v8 to exact version 4.1.2

RC transitive deps (@rolldown/binding-* at 1.0.0-rc.12) are pulled in
via coverage-v8. Pinning prevents silent drift to a newer RC that could
change instrumentation behaviour or introduce new RC-stage transitive deps.

Also verified: obug@2.1.1 is a legitimate MIT-licensed debug utility
from the @vitest/sxzz ecosystem — not a typosquat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add needs: [test] to coverage job

Prevents coverage from producing a misleading green check when the test
matrix fails. Coverage now only runs after all test jobs pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): raise Vitest timeouts for slow CI tests

- cloud denies: approvalTimeoutMs:3000 means the check process runs ~3s
  before the mock cloud responds; default 5s Vitest limit was too tight.
  Raised to 15s.
- doctor 'All checks passed': spawns a subprocess that runs `ss` for
  port detection — slow on CI runners. Raised to 20s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin vitest to 4.1.2 to match @vitest/coverage-v8

Both packages must stay in sync — a peer version mismatch causes silent
instrumentation failures. Pinning both to the same exact version prevents
drift when ^ would otherwise allow vitest to bump independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MCP gateway — transparent stdio proxy for any MCP server

Adds `node9 mcp-gateway --upstream <cmd>` which wraps any MCP server
as a transparent stdio proxy. Every tools/call is intercepted and run
through the full authorization engine (DLP, smart rules, shields,
human approval) before being forwarded to the upstream server.

Key implementation details:
- Deferred exit: authPending flag prevents process.exit() while auth
  is in flight, so blocked-tool responses are always flushed first
- Deferred stdin end: mirrors the same pattern for child.stdin.end()
  so approved messages are written before stdin is closed
- Approved writes happen inside the try block, before finally runs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address code review feedback

- Explicit ignoredTools in allowed-tool test (no implicit default dep)
- Assert result.status === 0 in all success-case tests (null = timeout)
- Throw result.error in runGateway helper so timeout-killed process fails
- Catch ENOTEMPTY in cleanupDir alongside EBUSY (Windows junctions)
- Document parseCommandString is shell-split only, not shell execution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): id validation, resilience tests, review fixes

- Validate JSON-RPC id is string|number|null; return -32600 for object/array ids
- Add resilience tests: invalid upstream JSON forwarded as-is, upstream crash
- Fix runGateway() to accept optional upstreamScript param
- Add status assertions to all blocked-tool tests
- Document parseCommandString safety in mcp-gateway source

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
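
The id validation in the commit above might look something like the sketch below. The function names and the error message text are hypothetical; the -32600 code and the string|number|null rule come from the commit message, and answering with `id: null` when the id is unusable follows the JSON-RPC 2.0 convention for ids that cannot be determined.

```typescript
type JsonRpcId = string | number | null;

// JSON-RPC 2.0 allows only string, number, or null ids.
function isValidJsonRpcId(id: unknown): id is JsonRpcId {
  return id === null || typeof id === "string" || typeof id === "number";
}

interface JsonRpcErrorResponse {
  jsonrpc: "2.0";
  id: JsonRpcId;
  error: { code: number; message: string };
}

// Returns an Invalid Request response for object/array (or other) ids,
// or null when the message has a valid id or is a notification (no id).
function rejectInvalidId(message: { id?: unknown }): JsonRpcErrorResponse | null {
  if (message.id === undefined || isValidJsonRpcId(message.id)) return null;
  return {
    jsonrpc: "2.0",
    id: null, // the offending id cannot be echoed back safely
    error: { code: -32600, message: "Invalid Request: id must be a string, number, or null" },
  };
}
```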

* fix(mcp-gateway): fix shell tokenizer to handle quoted paths with spaces

Replace execa's parseCommandString (which did not handle shell quoting)
with a custom tokenizer that strips double-quotes and respects backslash
escapes. Adds 4 review-driven test improvements (mock upstream silently
drops notifications, runGateway guards killed-by-signal status, shadowed
variable renamed, DLP test builds credential at runtime) plus a new test
for an upstream path with spaces.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
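
A tokenizer with the two properties named in the commit above (double quotes are stripped, backslash escapes are respected) could be sketched like this. The function name and the unterminated-quote error are assumptions for illustration; the project's actual tokenizer may differ in edge cases.

```typescript
// Splits a command string on unquoted whitespace. No shell is involved:
// this only tokenizes, it never expands variables or globs.
function tokenizeCommand(input: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inQuotes = false;
  let hasToken = false; // lets "" produce an empty token

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === "\\" && i + 1 < input.length) {
      i += 1;
      current += input.charAt(i); // next character is taken literally
      hasToken = true;
    } else if (ch === '"') {
      inQuotes = !inQuotes; // the quote itself is dropped
      hasToken = true;
    } else if (!inQuotes && /\s/.test(ch)) {
      if (hasToken) {
        tokens.push(current);
        current = "";
        hasToken = false;
      }
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (inQuotes) throw new Error("Unterminated double quote in command string");
  if (hasToken) tokens.push(current);
  return tokens;
}
```

One consequence of honouring backslash escapes in this sketch is that a bare Windows path such as `C:\tools\server.js` would lose its backslashes unless they are doubled or the separators are written as forward slashes.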

* fix(mcp-gateway): address review #4 — hermetic env, null-status guard, new tests

- Filter all NODE9_* env vars in runGateway so local developer config
  (NODE9_MODE, NODE9_API_KEY, NODE9_PAUSED) cannot pollute test isolation
- Fix status===null guard: no longer requires result.signal to also be set;
  a killed/timed-out process always throws, preventing silent false passes
- Extract parseResponses() helper to eliminate repeated JSON.parse cast pattern
- Reduce default runGateway timeout from 8s → 5s for fast-path tests
- Add test: malformed (non-JSON) input is forwarded to upstream unchanged
- Add test: tools/call notification (no id) is forwarded without gateway response
- Add comments: mock upstream uses unbuffered stdout.write; /tmp/test.txt path
  is safe because mock never reads disk; NODE9_TESTING only disables UI approvers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
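
The hermetic-environment idea from the first bullet above can be illustrated with a small helper. `buildHermeticEnv` and its `overrides` parameter are hypothetical names; the only behaviour taken from the commit is that every `NODE9_*` variable inherited from the developer's shell is dropped before the test sets its own.

```typescript
// Copies an environment, dropping every inherited NODE9_* variable, then
// applies explicit per-test overrides (which may themselves be NODE9_* keys).
function buildHermeticEnv(
  inherited: Record<string, string | undefined>,
  overrides: Record<string, string> = {},
): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const [key, value] of Object.entries(inherited)) {
    if (value === undefined || key.startsWith("NODE9_")) continue;
    clean[key] = value;
  }
  return { ...clean, ...overrides };
}
```

A test helper would typically call this as `buildHermeticEnv(process.env, { NODE9_TESTING: "1" })` and pass the result as the child's `env`.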

* fix(mcp-gateway): address review #5 — isolation, typing, timeouts, README note

- afterAll: log cleanup failures to stderr instead of silently swallowing them
- runGateway: document PATH is safe (all spawns use absolute paths); expand
  NODE9_TESTING comment to reference exact source location of what it suppresses
- Replace /tmp/test.txt with /nonexistent/node9-test-only so intent is unambiguous
- Tighten blocked-tool test timeout: 5000ms → 2000ms (approvalTimeoutMs=100ms,
  so a hung auth engine now surfaces as a clear failure rather than a late pass)
- GatewayResponse.result: add explicit tools/ok fields so Array.isArray assertion
  has accurate static type information
- README: add note clarifying --upstream takes a single command string (tokenizer
  splits it); explain double-quoted paths for paths with spaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #6 — diagnostics, type safety, error handling

- Timeout error now includes partial stdout/stderr so hung gateway failures
  are diagnosable instead of silently discarding the output buffer
- Mock upstream catch block writes to stderr instead of empty catch {} so
  JSON-RPC parse errors surface in test output rather than causing a hang
- parseResponses wraps JSON.parse in try/catch and rethrows with the
  offending line, replacing cryptic map-thrown errors with useful context
- GatewayResponse.result: replace redundant Record<string,unknown> & {.…
node9ai added a commit that referenced this pull request Mar 30, 2026
* fix: address code review — Slack regex bound, remove redundant parser, notMatchesGlob consistency, applyUndo empty-set guard

- dlp: cap Slack token regex at {1,100} to prevent unbounded scan on crafted input
- core: remove 40-line manual paren/bracket parser from validateRegex; it is redundant
  with the final new RegExp() compile check, which catches the same errors more cleanly
- core: fix notMatchesGlob — absent field returns true (vacuously not matching),
  consistent with notContains; missing cond.value still fails closed
- undo: guard applyUndo against ls-tree failure returning empty set, which would
  cause every file in the working tree to be deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — compile-before-saferegex, Slack lower bound, ls-tree guard logging, missing tests

- core: move new RegExp() compile check BEFORE safe-regex2 so structurally invalid
  patterns (unbalanced parens/brackets) are rejected before reaching NFA analysis
- dlp: tighten Slack token lower bound from {1,100} to {20,100} to reduce false
  positives on truncated or very short token-like strings
- undo: add NODE9_DEBUG log before early return in applyUndo ls-tree guard for
  observability into silent failures
- test(core): add 'structurally malformed patterns still rejected' regression test
  confirming compile-check order after manual parser removal
- test(core): add notMatchesGlob absent-field test with security comment documenting
  the vacuous-true behaviour and how to guard against it
- test(undo): add applyUndo ls-tree non-zero exit test confirming no files deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
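
The compile-before-analysis ordering described above can be sketched as follows. The ReDoS analyser is injected as a plain function (`isSafe`) so the example stays self-contained; in the project that role is played by safe-regex2, whose exact call shape is not shown in this log. The return convention (null for accepted, a string reason otherwise) is an assumption.

```typescript
// Stand-in for a ReDoS analyser; returns true when the pattern is considered safe.
type SafetyCheck = (pattern: string) => boolean;

// Returns null when the pattern is accepted, otherwise a rejection reason.
function validateRegex(pattern: string, isSafe: SafetyCheck): string | null {
  // Step 1: compile. Unbalanced parens/brackets throw a SyntaxError here,
  // so malformed patterns never reach the analyser.
  try {
    new RegExp(pattern);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return `invalid regex: ${reason}`;
  }
  // Step 2: structurally valid patterns are still subject to the safety check.
  if (!isSafe(pattern)) return "rejected: potential catastrophic backtracking";
  return null;
}
```

A pattern such as `(a+)+` passes step 1, since it is syntactically valid, and is expected to be rejected only in step 2.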

* fix(test): swap spawnResult args order — stdout first, status second

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in undo.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: revert notMatchesGlob to fail-closed, warn on ls-tree failure, document empty-stdout gap

- core: revert notMatchesGlob absent-field to fail-closed (false) — an attacker
  omitting a field must not satisfy a notMatchesGlob allow rule; rule authors
  needing pass-when-absent must pair with an explicit 'notExists' condition
- undo: log ls-tree failure unconditionally to stderr (not just NODE9_DEBUG) since
  this is an unexpected git error, not normal flow — silent false is undebuggable
- dlp: add comment on Slack token bound rationale (real tokens ~50–80 chars)
- test(core): fix notMatchesGlob fail-closed test — use delete_file (dangerous word)
  so the allow rule actually matters; write was allowed by default regardless
- test(undo): add test documenting the known gap where ls-tree exits 0 with empty
  stdout still produces an empty snapshotFiles set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
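
The fail-closed behaviour of notMatchesGlob described above can be illustrated with a simplified condition evaluator. Everything here is a sketch under assumptions: the signature is invented, and the glob support is reduced to a single `*` wildcard rather than whatever matcher the project uses.

```typescript
// Simplified glob: only '*' is a wildcard; everything else is literal.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// True only when the field is present, is a string, and does NOT match the glob.
function notMatchesGlob(
  args: Record<string, unknown>,
  field: string,
  glob: string | undefined,
): boolean {
  if (glob === undefined) return false; // missing cond.value fails closed
  const value = args[field];
  if (typeof value !== "string") return false; // absent field fails closed
  return !globToRegExp(glob).test(value);
}
```

With this shape, a caller that omits the field cannot satisfy an allow rule built on notMatchesGlob, which is the property the revert restores.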

* fix(undo): guard against ls-tree status-0 empty-stdout mass-delete footgun

Add snapshotFiles.size === 0 check after the non-zero exit guard. When ls-tree
exits 0 but produces no output, snapshotFiles would be empty and every tracked
file in the working tree would be deleted. Abort and warn unconditionally instead.

Also convert the 'known gap' documentation test into a real regression test that
asserts false return and no unlinkSync calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
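
The two ls-tree guards (non-zero exit, then empty output) can be sketched as a pure function that decides which files an undo would delete. The names `LsTreeResult` and `filesCreatedAfterSnapshot`, the null return, and the one-path-per-line output format are assumptions; per the commits above, the real applyUndo returns false and warns on stderr.

```typescript
interface LsTreeResult {
  status: number | null; // exit code of the git ls-tree call
  stdout: string; // assumed: one tracked path per line
}

// Returns the files that exist now but were absent from the snapshot,
// or null when the undo must abort because the snapshot listing is unusable.
function filesCreatedAfterSnapshot(
  lsTree: LsTreeResult,
  workingTreeFiles: string[],
  warn: (message: string) => void = (m) => console.error(m),
): string[] | null {
  if (lsTree.status !== 0) {
    warn("[Node9] undo aborted: git ls-tree failed");
    return null;
  }
  const snapshotFiles = new Set(
    lsTree.stdout
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0),
  );
  if (snapshotFiles.size === 0) {
    // Without this guard every working-tree file would look "new" and be deleted.
    warn("[Node9] undo aborted: snapshot listing is empty");
    return null;
  }
  return workingTreeFiles.filter((file) => !snapshotFiles.has(file));
}
```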

* test(undo): assert stderr warning in ls-tree failure tests

Add vi.spyOn(process.stderr, 'write') assertions to both new applyUndo tests
to verify the observability messages are actually emitted on failure paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: banner to stderr for MCP stdio compat; log command cwd handling and error visibility

Two bugs from issue #33:

1. runProxy banner went to stdout via console.log, corrupting the JSON-RPC stream
   for stdio-based MCP servers. Fixed: console.error so stdout stays clean.

2. 'node9 log' PostToolUse hook was silently swallowing all errors (catch {})
   and not changing to payload.cwd before getConfig() — unlike the 'check'
   command which does both. If getConfig() loaded the wrong project config,
   shouldSnapshot() could throw on a missing snapshot policy key, silently
   killing the audit.log write with no diagnostic output.
   Fixed: add cwd + _resetConfigCache() mirroring 'check'; surface errors to
   hook-debug.log when enableHookLogDebug is set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate process.chdir race condition in hook commands

Pass payload.cwd directly to getConfig(cwd?) instead of calling
process.chdir() which mutates process-global state and would race
with concurrent hook invocations.

- getConfig() gains optional cwd param: bypasses cache read/write
  when an explicit project dir is provided, so per-project config
  lookups don't pollute the ambient interactive-CLI cache
- check and log commands: remove process.chdir + _resetConfigCache
  blocks; pass payload.cwd directly to getConfig()
- log command catch block: remove getConfig() re-call (could re-throw
  if getConfig() was the original error source); use NODE9_DEBUG only
- Remove now-unused _resetConfigCache import from cli.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
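
The cache-bypass behaviour of getConfig(cwd?) described above might be sketched like this. The factory function, the `Config` shape, and the injected loader are hypothetical, chosen so the caching rule can be shown without touching the filesystem or process.chdir().

```typescript
interface Config {
  loadedFrom: string; // hypothetical field, only used to show which dir was read
}

type Loader = (dir: string) => Config;

// Builds a getConfig(cwd?) where an explicit cwd neither reads nor writes
// the ambient cache used by interactive CLI calls.
function createGetConfig(load: Loader, ambientDir: () => string): (cwd?: string) => Config {
  let cached: Config | null = null;
  return function getConfig(cwd?: string): Config {
    if (cwd) return load(cwd); // per-project lookup: always fresh, never cached
    if (cached === null) cached = load(ambientDir());
    return cached;
  };
}
```

Because no process-global state is mutated, two hook invocations with different `payload.cwd` values cannot observe each other's working directory.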

* fix: always write LOG_ERROR to hook-debug.log; clarify ReDoS test intent

- log catch block: remove NODE9_DEBUG guard — this catch guards the
  audit trail so errors must always be written to hook-debug.log,
  not only when NODE9_DEBUG=1
- validateRegex test: rename and expand the safe-regex2 NFA test to
  explicitly assert that (a+)+ compiles successfully (passes the
  compile-first step) yet is still rejected by safe-regex2, confirming
  the reorder did not break ReDoS protection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(mcp): integration tests for #33 regression coverage

Add mcp.integration.test.ts with 4 tests covering both bugs from #33:

1. Proxy stdout cleanliness (2 tests):
   - banner goes to stderr; stdout contains only child process output
   - stdout stays valid JSON when child writes JSON-RPC — banner does not corrupt stream

2. Log command cross-cwd audit write (2 tests):
   - writes to audit.log when payload.cwd differs from process.cwd() (the actual #33 bug)
   - writes to audit.log when no cwd in payload (backward compat)

These tests would have caught both regressions at PR time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — cwd guard, test assertions, exit-0 comment

- getConfig(payload.cwd || undefined): use || instead of ?? to also
  guard against empty string "" which path.join would silently treat
  as relative-to-cwd (same behaviour as the fallback, but explicit)
- log catch block: add comment documenting the intentional exit(0)-on-
  audit-failure tradeoff — non-zero would incorrectly signal tool failure
  to Claude/Gemini since the tool already executed
- mcp.integration.test.ts: assert result.error and result.status on
  every spawnSync call so spawn failures surface loudly instead of
  silently matching stdout === '' checks
- mcp.integration.test.ts: add expect(result.stdout.trim()).toBeTruthy()
  before JSON.parse for clearer diagnostic on stdout-empty failures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add CLAUDE.md rules and pre-commit enforcement hook

CLAUDE.md: documents PR checklist, test rules, and code rules that
Claude Code reads automatically at the start of every session:
- PR checklist (tests, typecheck, format, no console.log in hooks)
- Integration test requirements for subprocess/stdio/filesystem code
- Architecture notes (getConfig(cwd?), audit trail, DLP, fail-closed)

.git/hooks/pre-commit: enforces the checklist on every commit:
- Blocks console.log in src/cli, src/core, src/daemon
- Runs npm run typecheck
- Runs npm run format:check
- Runs npm test when src/ implementation files are changed
- Emergency bypass: git commit --no-verify

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — test isolation, stderr on audit gap, nonexistent cwd

- mcp.integration.test.ts: replace module-scoped tempDirs with per-describe
  beforeEach/afterEach and try/finally — eliminates shared-array interleave
  risk if tests ever run with parallelism
- mcp.integration.test.ts: add test for nonexistent payload.cwd — verifies
  getConfig falls back to global config gracefully instead of throwing
- cli.ts log catch: emit [Node9] audit log error to stderr so audit gaps
  surface in the tool output stream without requiring hook-debug.log checks
- core.ts getConfig: add comment documenting intentional nonexistent-cwd
  fallback behavior (tryLoadConfig returns null → global config used)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: audit write before config load; validate cwd; test corrupt-config gap

Two blocking issues from review:

1. getConfig() was called BEFORE appendFileSync — a config load failure
   (corrupt JSON, permissions error) would throw and skip the audit write,
   reintroducing the original silent audit gap. Fixed by moving the audit
   write unconditionally before the config load.

2. payload.cwd was passed to getConfig() unsanitized — a crafted hook
   payload with a relative or traversal path could influence which
   node9.config.json gets loaded. Fixed with path.isAbsolute() guard;
   non-absolute cwd falls back to ambient process.cwd().

Also:
- Add integration test proving audit.log is written even when global
  config.json is corrupt JSON (regression test for the ordering fix)
- Add comment on echo tests noting Linux/macOS assumption

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
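
Both fixes in the commit above (audit write first, absolute-path guard on cwd) can be sketched together. `safeHookCwd`, `runLogHook`, and the `LogHookDeps` interface are hypothetical names; the audit writer and config loader are injected so the ordering can be demonstrated without real files.

```typescript
import * as path from "node:path";

interface LogHookDeps {
  appendAudit: (line: string) => void; // stand-in for the audit.log append
  loadConfig: (cwd?: string) => unknown; // stand-in for getConfig(cwd?)
}

// Only a non-empty absolute path is trusted; anything else falls back to
// the ambient working directory by returning undefined.
function safeHookCwd(payloadCwd: unknown): string | undefined {
  if (typeof payloadCwd !== "string" || payloadCwd === "") return undefined;
  return path.isAbsolute(payloadCwd) ? payloadCwd : undefined;
}

// Returns true when the config loaded, false when it failed. Either way
// the audit entry has already been written.
function runLogHook(payload: { cwd?: unknown }, deps: LogHookDeps): boolean {
  deps.appendAudit(JSON.stringify(payload)); // 1. audit write, unconditionally
  try {
    deps.loadConfig(safeHookCwd(payload.cwd)); // 2. config load, may throw
    return true;
  } catch {
    return false; // a corrupt config can no longer skip the audit entry
  }
}
```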

* docs(CLAUDE.md): add audit-write ordering and path validation rules

* test: skip echo proxy tests on Windows; clarify exit-0 contract

- itUnix = it.skipIf(process.platform === 'win32') applied to both proxy
  echo tests — Windows echo is a shell builtin and cannot be spawned
  directly, so these tests would fail with a spawn error instead of
  skipping cleanly
- corrupt-config test: add comment documenting that exit(0) is the
  correct exit code even on config error — the log command always exits 0
  so Claude/Gemini do not treat an already-completed tool call as failed;
  the audit write precedes getConfig() so it succeeds regardless

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: CODEOWNERS for CLAUDE.md; parseAuditLog helper; getConfig unit tests

- .github/CODEOWNERS: require @node9-ai/maintainers review on CLAUDE.md
  and security-critical source files — prevents untrusted PRs from
  silently weakening AI instruction rules or security invariants
- mcp.integration.test.ts: replace inline JSON.parse().map() with
  parseAuditLog() helper that throws a descriptive error when a log line
  is not valid JSON (e.g. a debug line or partial write), instead of an
  opaque SyntaxError with no context
- mcp.integration.test.ts: itUnix declaration moved after imports for
  correct ordering
- core.test.ts: add getConfig unit tests verifying that a nonexistent
  explicit cwd does not throw (tryLoadConfig fallback), and that
  getConfig(cwd) does not pollute the ambient no-arg cache

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
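
A parseAuditLog-style helper, as described in the second bullet above, could look roughly like this. The exact signature and error wording are assumptions; the point is that a non-JSON line produces an error naming the offending text rather than a bare SyntaxError.

```typescript
// Parses a newline-delimited JSON log. Blank lines are skipped; any other
// line that is not valid JSON raises an error that quotes the line.
function parseAuditLog(raw: string): Record<string, unknown>[] {
  return raw
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line, index) => {
      try {
        return JSON.parse(line) as Record<string, unknown>;
      } catch {
        throw new Error(`audit.log entry ${index + 1} is not valid JSON: ${line}`);
      }
    });
}
```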

* fix(lint): add npm run lint to PR checklist and pre-commit hook

Adds ESLint step to CLAUDE.md checklist and .git/hooks/pre-commit so
require()-style imports and other lint errors are caught before push.
Also fixes the require('path')/require('os') inline calls in core.test.ts
that triggered @typescript-eslint/no-require-imports in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: emit shields-status on SSE connect — dashboard no longer stuck on Loading

The shields-status event was only broadcast on toggle (POST /shields/toggle).
A freshly connected dashboard never received the current shields state and
displayed "Loading…" indefinitely.

Fix: send shields-status in the GET /events initial payload alongside init
and decisions, using the same payload shape as the toggle handler.

Regression test: daemon.integration.test.ts starts a real daemon with an
isolated HOME, connects to /events, and asserts shields-status is present
with the correct active state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
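
The fix above amounts to including shields-status in the first batch of server-sent events. A minimal sketch follows; the event names (init, decisions, shields-status) come from the commit message, while the `DaemonSnapshot` shape and the payload field names are assumptions for illustration.

```typescript
// Formats one server-sent event. JSON.stringify output has no raw newlines,
// so a single "data:" line is sufficient.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

interface DaemonSnapshot {
  pending: unknown[]; // hypothetical: requests awaiting approval
  decisions: Record<string, string>; // hypothetical: persisted decisions
  shields: { name: string; active: boolean }[]; // hypothetical shape
}

// Everything a freshly connected dashboard needs, sent once on GET /events.
function buildInitialEvents(snapshot: DaemonSnapshot): string {
  return [
    formatSseEvent("init", { requests: snapshot.pending }),
    formatSseEvent("decisions", snapshot.decisions),
    formatSseEvent("shields-status", { shields: snapshot.shields }),
  ].join("");
}
```

Reusing the same payload builder in the toggle handler and in the connect path keeps the two shapes from drifting apart.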

* test: address review — shared SSE snapshot, ctx.skip() for visible skips

- Capture SSE stream once in beforeAll and share across all three tests
  instead of opening 3 separate 1.5s connections (~4.5s → ~1.5s wall time)
- Replace early return with ctx.skip() so port-conflict skips are visible
  in the Vitest report rather than silently passing
- Add comment explaining why it.skipIf cannot be used here (condition
  depends on async beforeAll, evaluated after test collection)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — bump SSE timeout, guard payload undefined, structural shield check

- Bump readSseStream timeout 1500ms → 3000ms for slow CI headroom
- Assert payload defined before accessing .shields — gives a clear failure
  message if shields-status is absent rather than a TypeError on .shields
- Replace hardcoded postgres check with structural loop over all shields
  so the test survives adding or renaming shields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: log last waitForDaemon error to stderr for CI diagnostics

Silent catch{} meant a crashed daemon (e.g. EACCES on port) produced only
"did not start within 6s" with no hint of the root cause. Now the last
caught error is written to stderr so CI logs show the actual failure reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass flags through to wrapped command — prevent Commander from consuming -y, --config etc.

Commander parsed flags like -y and --config as node9 options and errored
with "unknown option" before the proxy action handler ran. This broke all
MCP server configurations that pass flags to the wrapped binary (npx -y,
binaries with --nexus-url, etc.).

Fix: before program.parse(), detect proxy mode (first arg is not a known
node9 subcommand and doesn't start with '-') and inject '--' into process.argv.
This causes Commander to stop option-parsing and pass everything — including
flags — through to the variadic [command...] action handler intact.

The user-visible '--' workaround still works and is now redundant but harmless.

Regression tests: two new itUnix cases in mcp.integration.test.ts verify
that -n is not consumed as a node9 flag, and that --version reaches the
wrapped command rather than printing node9's own version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: derive proxy subcommand set from program.commands; harden test assertions

- Replace hand-maintained KNOWN_SUBCOMMANDS allowlist with a set derived
  from program.commands.map(c => c.name()) — stays in sync automatically
  when new subcommands are added, eliminating the latent sync bug
- Remove fragile echo stdout assertion in flag pass-through test — echo -n
  and echo --version behaviour varies across platforms (GNU vs macOS);
  the regression being tested is node9's parser, not echo's output
- Add try/finally in daemon.integration.test.ts beforeAll so tmpHome is
  always cleaned up even if daemon startup throws

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard against double '--' injection; strengthen --version test assertion

- Skip '--' injection if process.argv[2] is already '--' to avoid
  producing ['--', '--', ...] when user explicitly passes the separator
- Add toBeTruthy() assertion on stdout in --version test so the check
  fails if echo exits non-zero with empty output rather than silently passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
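
The proxy-mode detection from the three commits above (inject '--', derive the subcommand set, guard against a double '--') can be sketched as a pure function over argv. The function name is hypothetical, and the subcommand set is passed in rather than derived from a Commander program so the example stays self-contained.

```typescript
// Returns argv with "--" inserted before the wrapped command when node9 is
// invoked in proxy mode; otherwise returns argv unchanged.
function injectProxySeparator(argv: string[], knownSubcommands: Set<string>): string[] {
  const first = argv[2]; // argv[0] = node, argv[1] = script
  if (first === undefined) return argv; // no arguments at all
  if (first === "--") return argv; // user already passed the separator
  if (first.startsWith("-")) return argv; // one of node9's own options
  if (knownSubcommands.has(first)) return argv; // a real node9 subcommand
  return [...argv.slice(0, 2), "--", ...argv.slice(2)];
}
```

In the real CLI the set would be built from `program.commands.map((c) => c.name())` before calling `program.parse()`, as the second commit describes.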

* fix: address review — alias gap comment, res error-after-destroy guard, echo comment

- cli.ts: document alias gap (no aliases currently, but note how to extend)
- daemon.integration.test.ts: settled flag prevents res 'error' firing reject
  after Promise already resolved via req.destroy() timeout path
- mcp.integration.test.ts: fix comment — /bin/echo handles --version, not GNU echo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent daemon crash on unhandled rejection — node9 tail disconnect with 2 agents

Two concurrent Claude instances fire overlapping hook calls. Any unhandled
rejection in the async request handler crashes the daemon (Node 15+ default),
which closes all SSE connections and exits node9 tail with "Daemon disconnected".

- Add process.on('unhandledRejection') so a single bad request never kills the daemon
- Wrap GET /settings and GET /slack-status getGlobalSettings() calls in try/catch
  (were the only routes missing error guards in the async handler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — return in catch blocks, log errors, guard unhandledRejection registration

- GET /settings and /slack-status catch blocks now return after writeHead(500)
  to prevent fall-through to subsequent route handlers (write-after-end risk)
- Log the actual error to stderr in both catch blocks — silent swallow is
  dangerous in a security daemon
- Guard unhandledRejection registration with listenerCount === 0 to prevent
  double-registration if startDaemon() is called more than once (tests/restarts)
- Move handler registration before server.listen() for clearer startup ordering

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): require manual diff review before every commit

Automated checks (lint, typecheck, tests) don't catch logical correctness
issues like missing return after res.end(), silent catch blocks, or
double event-listener registration. Explicitly require git diff review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — 500 responses, module-level rejection flag, override cli.ts exit handler

- Separate the chained res.writeHead(500) and res.end() calls (chaining them is non-idiomatic)
- Add Content-Type: application/json and JSON body to 500 responses
- Replace listenerCount guard with module-level boolean flag (race-safe)
- Call process.removeAllListeners('unhandledRejection') before registering
  daemon handler — cli.ts registers a handler that calls process.exit(1),
  which was the actual crash source; this overrides it for the daemon process
- Document that critical approval path (POST /check) has its own try/catch
  and is not relying on this safety net

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove removeAllListeners — use isDaemon guard in cli.ts handler instead

removeAllListeners('unhandledRejection') was a blunt instrument that could
strip handlers registered by third-party deps. The correct fix:
- cli.ts handler now returns early (no-op) when process.argv[2] === 'daemon',
  leaving the rejection to the daemon's own keep-alive handler
- daemon/index.ts no longer needs removeAllListeners
- daemon handler now logs stack trace so systematic failures are visible

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify unhandledRejection listener interaction — both handlers fire independently

The previous comment implied listener-chain semantics (one handler deferring
to the next). Node.js fires all registered listeners independently. The
isDaemon no-op return in cli.ts is what prevents process.exit(1), not any
chain mechanism. Clarify this so future maintainers don't break it by
restructuring the handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate unhandledRejection ordering dependency — skip cli.ts handler for daemon mode

Instead of relying on listener registration order (fragile), skip registering
the cli.ts exit-on-rejection handler entirely when process.argv[2] === 'daemon'.
The daemon's own keep-alive handler in startDaemon() is then the only handler
in the process — no ordering dependency, no removeAllListeners needed.

Also update stale comment in daemon/index.ts that still described the old
"we must replace the cli.ts handler" approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
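
The end state of the unhandledRejection series above (a module-level registration flag in the daemon, and no exit-on-rejection handler in cli.ts when running as the daemon) can be sketched as follows. The `RejectionEmitter` interface, both function names, and the log text are assumptions; in the real code the emitter is `process`.

```typescript
// Minimal slice of the process API the sketch needs.
interface RejectionEmitter {
  on(event: "unhandledRejection", listener: (reason: unknown) => void): unknown;
}

// Module-level flag: survives repeated startDaemon() calls in one process.
let daemonRejectionHandlerRegistered = false;

// Registers a keep-alive handler at most once. Returns true if it registered.
function registerDaemonRejectionHandler(
  emitter: RejectionEmitter,
  log: (message: string) => void = (m) => console.error(m),
): boolean {
  if (daemonRejectionHandlerRegistered) return false;
  daemonRejectionHandlerRegistered = true;
  emitter.on("unhandledRejection", (reason) => {
    const detail = reason instanceof Error ? (reason.stack ?? reason.message) : String(reason);
    log(`[Node9] unhandled rejection (daemon kept alive): ${detail}`);
  });
  return true;
}

// cli.ts side: the exit-on-rejection handler is not registered in daemon mode,
// so the daemon's handler is the only listener and ordering does not matter.
function shouldRegisterCliExitHandler(argv: string[]): boolean {
  return argv[2] !== "daemon";
}
```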

* docs: address review comments — argv load-time note, hung-connection limit, stack trace caveat

- cli.ts: note that process.argv[2] check fires at module load time intentionally
- daemon/index.ts: document hung-connection limitation of last-resort rejection handler
- daemon/index.ts: note stack trace may include user input fragments (acceptable
  for localhost-only stderr logging)
- daemon/index.ts: clarify jest.resetModules() behavior with the module-level flag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Safe by Default — advisory SQL rules block destructive ops without config

Adds review-drop-table-sql, review-truncate-sql, and review-drop-column-sql
to ADVISORY_SMART_RULES so DROP TABLE, TRUNCATE TABLE, and DROP COLUMN in
the `sql` field are gated by human approval out-of-the-box, with no shield
or config required. The postgres shield correctly upgrades these from review
→ block since shield rules are inserted before advisory rules in getConfig().

Includes 7 new tests: 4 verifying advisory review fires with no config, 3
verifying the postgres shield overrides to block.
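
A minimal sketch of how such advisory rules could match the `sql` field case-insensitively. The rule names come from the message above; the regexes and the `matchAdvisoryRule` helper are hypothetical and not taken from the node9 source.

```typescript
// Hypothetical sketch: rule names are from the commit message,
// the patterns and helper below are illustrative assumptions.
interface AdvisoryRule {
  name: string;
  pattern: RegExp;
  verdict: "review";
}

const ADVISORY_SQL_RULES: AdvisoryRule[] = [
  { name: "review-drop-table-sql", pattern: /\bDROP\s+TABLE\b/i, verdict: "review" },
  { name: "review-truncate-sql", pattern: /\bTRUNCATE\s+TABLE\b/i, verdict: "review" },
  { name: "review-drop-column-sql", pattern: /\bDROP\s+COLUMN\b/i, verdict: "review" },
];

// Returns the first advisory rule whose pattern matches the `sql` argument,
// or undefined when no rule applies or no sql string is present.
function matchAdvisoryRule(args: { sql?: unknown }): AdvisoryRule | undefined {
  const sql = args.sql;
  if (typeof sql !== "string") return undefined;
  for (const rule of ADVISORY_SQL_RULES) {
    if (rule.pattern.test(sql)) return rule;
  }
  return undefined;
}
```

The patterns carry the `i` flag but not `g`, so `test()` keeps no state between calls and mixed-case input such as `drop TABLE` still matches.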

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: shield set/unset — per-rule verdict overrides + config show

- `node9 shield set <shield> <rule> <verdict>` — override any shield rule's
  verdict without touching config.json. Stored in shields.json under an
  `overrides` key, applied at runtime in getConfig(). Accepts full rule
  name, short name, or operation name (e.g. "drop-table" resolves to
  "shield:postgres:block-drop-table").

- `node9 shield unset <shield> <rule>` — remove an override, restoring
  the shield default.

- `node9 shield status` — now shows each rule's verdict individually,
  with override annotations ("← overridden (was: block)").

- `node9 config show` — new command: full effective runtime config
  including active shields with per-rule verdicts, built-in rules,
  advisory rules, and dangerous words.
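
The three accepted rule spellings (full name, short name, operation name) could be resolved roughly as below. This is a sketch under stated assumptions: only `shield:postgres:block-drop-table` appears in the message above, while the second catalog entry and the `resolveRule` helper are hypothetical.

```typescript
// Hypothetical catalog: only the first rule name is taken from the commit message.
const POSTGRES_RULES: string[] = [
  "shield:postgres:block-drop-table",
  "shield:postgres:block-truncate",
];

// Resolves user input to a full rule name, trying the most specific form first.
function resolveRule(shield: string, rules: string[], input: string): string | undefined {
  const prefix = `shield:${shield}:`;
  // 1. Full rule name, e.g. "shield:postgres:block-drop-table".
  for (const rule of rules) {
    if (rule === input) return rule;
  }
  // 2. Short name, e.g. "block-drop-table".
  for (const rule of rules) {
    if (rule === prefix + input) return rule;
  }
  // 3. Operation name, e.g. "drop-table". The first matching rule wins.
  for (const rule of rules) {
    if (rule.indexOf(prefix) === 0 && rule.slice(-(input.length + 1)) === `-${input}`) {
      return rule;
    }
  }
  return undefined;
}
```

Because step 3 returns the first match, two rules sharing an operation suffix would make the lookup depend on catalog order.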

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — allow verdict guard, null assertion, test reliability

- shield set allow now requires --force to prevent silent rule silencing;
  exits 1 with a clear warning and the exact re-run command otherwise
- Remove getShield(name)! non-null assertion in error branch
- Fix mockReturnValue → mockReturnValueOnce to prevent test state leak
- Add missing tests: shield set allow guard (integration), unset no-op,
  mixed-case SQL matching (DROP table, drop TABLE, TRUNCATE table)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — shield override security hardening

- Add isShieldVerdict() type guard; replace manual triple-comparison in
  CLI set command and remove unsafe `verdict as ShieldVerdict` cast
- Add validateOverrides() to sanitize shields.json on read — tampered
  disk content with non-ShieldVerdict values is silently dropped before
  reaching the policy engine
- Fix clearShieldOverride() to be a true no-op (skip disk write) when
  the rule has no existing override
- Add comment to resolveShieldRule() documenting first-match behavior
  for operation-suffix lookup to warn against future naming conflicts
- Tests: fix no-op assertion (assert not written), add isShieldVerdict
  suite, add schema validation tests for tampered overrides, add
  authorizeHeadless test for shield-overridden allow verdict
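
A sketch of what a verdict type guard and an overrides sanitizer might look like. The three verdict values and the nested `{ shield: { rule: verdict } }` shape are assumptions; the function names mirror those in the message above, but the bodies are illustrative only.

```typescript
// Assumption: the three verdicts are "allow", "review" and "block".
type ShieldVerdict = "allow" | "review" | "block";
const VERDICTS: readonly string[] = ["allow", "review", "block"];

// Type guard that replaces a manual triple comparison and an unsafe cast.
function isShieldVerdict(value: unknown): value is ShieldVerdict {
  return typeof value === "string" && VERDICTS.indexOf(value) !== -1;
}

type Overrides = Record<string, Record<string, ShieldVerdict>>;

// Keeps only overrides whose verdict passes the guard. Anything else
// (for example hand-edited disk content) is dropped before use.
function validateOverrides(raw: unknown): Overrides {
  const clean: Overrides = {};
  if (typeof raw !== "object" || raw === null) return clean;
  const shields = raw as Record<string, unknown>;
  for (const shield of Object.keys(shields)) {
    const rules = shields[shield];
    if (typeof rules !== "object" || rules === null) continue;
    const ruleMap = rules as Record<string, unknown>;
    for (const rule of Object.keys(ruleMap)) {
      const verdict = ruleMap[rule];
      if (!isShieldVerdict(verdict)) continue;
      if (!clean[shield]) clean[shield] = {};
      clean[shield][rule] = verdict;
    }
  }
  return clean;
}
```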

Note: issue #5 (shield status stdout vs stderr) cannot be fixed here —
the pre-commit hook enforces no new console.log in cli.ts to keep stdout
clean for the JSON-RPC/MCP hook code paths in the same file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — audit trail, tamper warning, trust boundary

- Export appendConfigAudit() from core.ts; call it from CLI when an allow
  override is written with --force so silenced rules appear in audit.log
- validateOverrides() now emits a stderr warning (with shield/rule detail)
  when an invalid verdict is dropped, making tampering visible to the user
- Add JSDoc to writeShieldOverride() documenting the trust boundary: it is
  a raw storage primitive with no allow guard; callers outside the CLI must
  validate rule names via resolveShieldRule() first; daemon does not expose
  this endpoint
- Tests: add stderr-warning test for tampered verdicts; add cache-
  invalidation test verifying _resetConfigCache() causes allow overrides
  to be re-read from disk (mock) on the next evaluatePolicy() call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: close remaining review gaps — first-match, allow-no-guard, TOCTOU

- Issue 5: add test proving resolveShieldRule first-match-wins behavior
  when two rules share an operation suffix; uses a temporary SHIELDS
  mutation (restored in finally) to simulate the ambiguous catalog case
- Issue 6: add explicit test documenting that writeShieldOverride accepts
  allow verdict without any guard — storage primitive contract, CLI is
  the gatekeeper
- Issue 8: add TOCTOU characterization test showing that concurrent
  writeShieldOverride calls with a stale read lose the first write; makes
  the known file-lock limitation explicit and regression-testable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: spawn daemon via process.execPath to fix ENOENT on Windows (#41)

spawn('node9', ...) fails on Windows because npm installs a .cmd shim,
not a bare executable. Node.js child_process.spawn without { shell: true }
cannot resolve .cmd/.ps1 wrappers.

Replace all three bare spawn('node9', ['daemon'], ...) call sites in
cli.ts with spawn(process.execPath, [process.argv[1], 'daemon'], ...),
consistent with the pattern already used in src/tui/tail.ts:
  - autoStartDaemonAndWait()
  - daemon --openui handler
  - daemon --background handler

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(ci): regression guard + Windows CI for spawn fix (#41)

- Add spawn-windows.test.ts: two static source-guard tests that read
  cli.ts and assert (a) no bare spawn('node9'...) pattern exists and
  (b) exactly 3 spawn(process.execPath, ...) daemon call sites exist.
  Prevents the ENOENT regression from silently reappearing.

- Add .github/workflows/ci.yml: runs typecheck, lint, and npm test on
  both ubuntu-latest and windows-latest on every push/PR to main and dev.
  The Windows runner will catch any spawn('node9'...) regression
  immediately since it would throw ENOENT in integration tests.
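
A static source guard of this kind can be approximated by counting call sites in the file contents. The `auditSpawnCalls` helper and its regexes below are hypothetical, shown only to illustrate the idea; the real test reads cli.ts from disk.

```typescript
// Hypothetical helper: counts daemon spawn call sites in a source string.
function auditSpawnCalls(source: string): { bare: number; viaExecPath: number } {
  // Bare spawn('node9', ...) or spawn("node9", ...) fails on Windows (.cmd shim).
  const bare = (source.match(/spawn\(\s*['"]node9['"]/g) || []).length;
  // spawn(process.execPath, ...) works on every platform.
  const viaExecPath = (source.match(/spawn\(\s*process\.execPath\s*,/g) || []).length;
  return { bare, viaExecPath };
}
```

A test would then assert `bare === 0` and that `viaExecPath` equals the expected number of call sites.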

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step before tests — integration tests require dist/cli.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): remove NODE_ENV=test prefix from npm scripts — Windows compat

'NODE_ENV=test cmd' syntax is Unix-only and fails on Windows with
'not recognized as an internal or external command'.

Vitest sets NODE_ENV=test automatically when running in test mode
(via process.env.VITEST), making the prefix redundant. Remove it from
test, test:watch, and test:ui scripts so they work on all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): use cross-platform path assertions in undo.test.ts

Replace hardcoded Unix path separators with path.join() and regex
/[/\\]\.git[/\\]/ so assertions pass on Windows CI.
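
For illustration, the separator-agnostic regex from the message above behaves as follows. The `isInsideGitDir` wrapper is a hypothetical name introduced only for this sketch.

```typescript
// Matches a ".git" path segment delimited by either "/" or "\".
const GIT_DIR_SEGMENT = /[/\\]\.git[/\\]/;

// Hypothetical wrapper around the pattern.
function isInsideGitDir(filePath: string): boolean {
  return GIT_DIR_SEGMENT.test(filePath);
}
```

Requiring a separator on both sides keeps look-alike directories such as `.github` from matching.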

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): cross-platform path and HOME fixes for Windows CI

setup.test.ts: replace hardcoded /mock/home/... constants with
path.join(os.homedir(), ...) so path comparisons match on Windows.
doctor.test.ts: set USERPROFILE=homeDir alongside HOME so
os.homedir() resolves the isolated test directory on Windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): Windows HOME/USERPROFILE and EBUSY fixes

mcp.integration.test.ts: add makeEnv() helper that sets both HOME
and USERPROFILE so spawned node9 processes resolve os.homedir() to
the isolated test directory on Windows. Add EBUSY guard in cleanupDir
for Windows temp file locking after spawnSync.

protect.test.ts: use path.join(os.homedir(), ...) for mock paths in
setPersistentDecision so existsSpy matches on Windows backslash paths.
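
A sketch of a `makeEnv`-style helper. The signature is an assumption: the base environment is passed in explicitly here so the example stays self-contained, whereas the real helpers may read `process.env` directly.

```typescript
type Env = Record<string, string | undefined>;

// Builds a child-process environment whose home directory points at an
// isolated test directory on both Unix (HOME) and Windows (USERPROFILE).
function makeEnv(base: Env, homeDir: string, extra: Env = {}): Env {
  return { ...base, HOME: homeDir, USERPROFILE: homeDir, ...extra };
}
```

Setting both variables matters because `os.homedir()` consults `USERPROFILE` on Windows and `HOME` on POSIX systems.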

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): propagate HOME as USERPROFILE in check integration tests

runCheck/runCheckAsync now set USERPROFILE=HOME so spawned node9
processes resolve os.homedir() to the isolated test directory on
Windows. Apply the same fix to standalone spawnSync calls using
minimalEnv. Add EBUSY guard in cleanupHome for Windows temp locking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests,dlp): four Windows CI fixes

mcp.integration.test.ts: use list_directory instead of write_file for
the no-cwd backward-compat test — write_file triggers git add -A on
os.tmpdir(), which can index thousands of files on Windows and fail with ETIMEDOUT.

gemini_integration.test.ts: add path import; replace hardcoded
/mock/home/... paths with path.join(os.homedir(), ...) so existsSpy
matches on Windows backslash paths.

daemon.integration.test.ts: add USERPROFILE=tmpHome to daemon spawn
env so os.homedir() resolves to the isolated shields.json. Add EBUSY
guard in cleanupDir.

dlp.ts: broaden /etc/passwd|shadow|sudoers patterns to
^(?:[a-zA-Z]:)?\/etc\/... so they match Windows-normalized paths like
C:/etc/passwd in addition to Unix /etc/passwd.
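
A minimal sketch of the broadened pattern. The optional drive-letter prefix is from the message above; the exact alternation, the trailing anchor, and the `isSensitiveEtcPath` helper are assumptions rather than the code in dlp.ts.

```typescript
// Optional "C:"-style drive prefix, then /etc/<sensitive file>.
const SENSITIVE_ETC = /^(?:[a-zA-Z]:)?\/etc\/(?:passwd|shadow|sudoers)$/;

// Hypothetical helper: normalizes backslashes to forward slashes, then tests.
function isSensitiveEtcPath(filePath: string): boolean {
  const normalized = filePath.replace(/\\/g, "/");
  return SENSITIVE_ETC.test(normalized);
}
```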

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): address code review findings

ci.yml: add format:check step and Node 22 to matrix (package.json
declares >=18 — both LTS versions should be covered).

check/mcp/daemon integration tests: add makeEnv() helpers for
consistent HOME+USERPROFILE isolation; add console.warn on EBUSY
so temp dir leaks are visible rather than silent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): enforce LF line endings so Prettier passes on Windows

Add endOfLine: lf to .prettierrc so Prettier always checks/writes LF
regardless of OS. Add .gitattributes with eol=lf so Git does not
convert line endings on Windows checkout. Without these, format:check
fails on every file on Windows CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): align makeEnv signatures and add dist verification

check.integration.test.ts: makeEnv now spreads process.env (same as
mcp and daemon helpers) so PATH, NODE_ENV=test (set by Vitest), and
other inherited vars reach spawned child processes. Standalone
spawnSync calls simplified to makeEnv(tmpHome, {NODE9_TESTING:'1'}).
Remove unused minimalEnv from shield describe block.

ci.yml: add Verify dist artifacts step after build to fail fast with
a clear message if dist/cli.js or dist/index.js are missing. Add
comment explaining NODE_ENV=test / NODE9_TESTING guard coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: interactive terminal approval via /dev/tty (SSE + [A]/[D])

Replaces the broken @inquirer/prompts stdin racer with a /dev/tty-based
approval prompt that works as a Claude Code PreToolUse subprocess:

- New src/ui/terminal-approval.ts: opens /dev/tty for raw keypress I/O,
  acquires CSRF token from daemon SSE, renders ANSI approval card, reads
  [A]/[D], posts decision via POST /decision/{id}. Handles abort (another
  racer won) with cursor/card cleanup and SIGTERM/exit guard.

- Daemon entry shared between browser (GET /wait) and terminal (POST /decision)
  racers: extract registerDaemonEntry() + waitForDaemonDecision() from the
  old askDaemon() so both racers operate on the same pending entry ID.

- POST /decision idempotency: first write wins; second call returns 409
  with the existing decision. Prevents race between browser and terminal
  racers from corrupting state.

- CSRF token emitted on every SSE connection (re-emit existing token, never
  regenerate). Terminal racer acquires it by opening /events and reading
  the first csrf event.

- approvalTimeoutSeconds user-facing config alias (converts to ms);
  raises default timeout from 30s to 120s. Daemon auto-deny timer and
  browser countdown now use the config value instead of a hardcoded constant.

- isTTYAvailable() probe: tries /dev/tty open(); disabled on Windows
  (native popup racer covers that path). NODE9_FORCE_TERMINAL_APPROVAL=1
  bypasses the probe for tmux/screen users.

- Integration tests: CSRF re-emit across two connections, POST /decision
  idempotency (both allow-first and deny-first cases).
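
The first-write-wins behaviour of POST /decision can be sketched as a pure function over the pending-entry map. The types, the `postDecision` name, and the 404 for unknown ids are assumptions for illustration; only the 409-with-existing-decision rule comes from the message above.

```typescript
type Decision = "allow" | "deny";

interface PendingEntry {
  id: string;
  decision?: Decision;
}

interface DecisionResponse {
  status: 200 | 404 | 409;
  decision?: Decision;
}

// Records a decision once. A later call for the same id is rejected with
// 409 and echoes the decision that was recorded first.
function postDecision(
  pending: Map<string, PendingEntry>,
  id: string,
  decision: Decision,
): DecisionResponse {
  const entry = pending.get(id);
  if (!entry) return { status: 404 }; // assumed behaviour for unknown ids
  if (entry.decision !== undefined) {
    return { status: 409, decision: entry.decision };
  }
  entry.decision = decision;
  return { status: 200, decision };
}
```

With this shape, whichever racer posts second can treat 409 as a non-error and read the winning decision from the response.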

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Smart Router — node9 tail as interactive approval terminal

Implements a multi-phase Smart Router architecture so `node9 tail` can
serve as a full approval channel alongside the browser dashboard and
native popup.

Phase 1 — Daemon capability tracking (daemon/index.ts):
- SseClient interface tracks { res, capabilities[] } per SSE connection
- /events parses ?capabilities=input from URL; stored on each client
- broadcast() updated to use client.res.write()
- hasInteractiveClient() exported — true when any tail session is live
- broadcast('add') now fires when terminal approver is enabled and an
  interactive client is connected, not only when browser is enabled

Phase 2 — Interactive approvals in tail (tui/tail.ts):
- Connects with ?capabilities=input so daemon identifies it as interactive
- Captures CSRF token from the 'csrf' SSE event
- Handles init.requests (approvals pending before tail connected)
- Handles add/remove SSE events; maintains an approval queue
- Shows one ANSI card at a time ([A] Allow / [D] Deny) using
  tty.ReadStream raw-mode keypress on fd 0
- POSTs decisions via /decision/{id} with source:'terminal'; 409 is non-error
- Cards clear themselves; next queued request shown automatically

Phase 3 — Racer 3 widened (core.ts):
- Racer 3 guard changed from approvers.browser to
  (approvers.browser || approvers.terminal) so tail participates in the
  race via the same waitForDaemonDecision mechanism as the browser
- Guidance printed to stderr when browser is off:
  "Run `node9 tail` in another terminal to approve."

Phase 4 — node9 watch command (cli.ts):
- New `watch <command> [args...]` starts daemon in NODE9_WATCH_MODE=1
  (no idle timeout), prints a tip about node9 tail, then runs the
  wrapped command via spawnSync

Decision source tracking (all layers):
- POST /decision now accepts optional source field ('browser'|'terminal')
- Daemon stores decisionSource on PendingEntry; GET /wait returns it
- waitForDaemonDecision returns { decision, source }
- Racer 3 label uses actual source instead of guessing from config:
  "User Decision (Terminal (node9 tail))" vs "User Decision (Browser Dashboard)"
- Browser UI sends source:'browser'; tail sends source:'terminal'

Tests:
- daemon.integration.test.ts: 3 new tests for source tracking round-trip
  (terminal, browser, and omitted source)
- spawn-windows.test.ts: updated count from 3 to 4 spawn call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: enable /dev/tty approval card in Claude terminal (hook path)

The check command was passing allowTerminalFallback=false to
authorizeHeadless, which disabled Racer 4 (/dev/tty) in the hook path.
This meant the approval card only appeared in the node9 tail terminal,
requiring the user to switch focus to respond.

Change both call sites (initial + daemon-retry) to true so Racer 4 runs
alongside Racer 3. The [A]/[D] card now appears in the Claude terminal
as well — the user can respond from either terminal, whichever has focus.
The 409 idempotency already handles the race correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent background authorizeHeadless from overwriting POST /decision

When POST /decision arrives before GET /wait connects, it sets
earlyDecision on the PendingEntry. The background authorizeHeadless
call (which runs concurrently) could then overwrite that decision in
its .then() handler — visible as the idempotency test getting
'allow' back instead of the posted 'deny'.

Guard: after noApprovalMechanism check, return early if earlyDecision
is already set. First write wins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route sendBlock terminal output to /dev/tty instead of stderr

Claude Code treats any stderr output from a PreToolUse hook as a hook
error and fails open — the tool proceeds even when the hook writes a
valid permissionDecision:deny JSON to stdout. This meant git push and
other blocked commands were silently allowed through.

Fix: replace all console.error calls in the block/deny path with
writes to /dev/tty, an out-of-band channel that bypasses Claude Code's
stderr pipe monitoring. /dev/tty failures are caught silently so CI
and non-interactive environments are unaffected.

Add a writeTty() helper in core.ts used for all status messages in
the hook execution path (cloud error, waiting-for-approval banners,
cloud result). Update two integration tests that previously asserted
block messages appeared on stderr — they now assert stderr is empty,
which is the regression guard for this bug.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: don't auto-resolve daemon entries in audit mode

In audit mode, background authorizeHeadless resolves immediately with
checkedBy:'audit'. The .then() handler was setting earlyDecision='allow'
before POST /decision could arrive from browser/tail, causing subsequent
POST /decision calls to get 409 and GET /wait to return 'allow' regardless
of what the user posted.

Audit mode means the hook auto-approves — it doesn't mean the daemon
dashboard should also auto-resolve. Leave the entry alive so browser/tail
can still interact with it (or the auto-deny timer fires).

Fixes source-tracking integration test failures on CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: close /dev/tty fd in finally block to prevent leak on write error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove Racer 4 (/dev/tty card in Claude terminal)

Racer 4 interrupted the AI's own terminal with an approval prompt,
which is wrong on multiple levels:
- The AI terminal belongs to the AI agent, not the human approver
- Different AI clients (Gemini CLI, Cursor, etc.) handle terminals
  differently — /dev/tty tricks are fragile across environments
- It created duplicate prompts when node9 tail was also running

Approval channels should all be out-of-band from the AI terminal:
  1. Cloud/SaaS (Slack, mission control)
  2. Native OS popup
  3. Browser dashboard
  4. node9 tail (dedicated approval terminal)

Remove: Racer 4 block in core.ts, allowTerminalFallback parameter
from authorizeHeadless/_authorizeHeadlessCore and all callers,
isTTYAvailable/askTerminalApproval imports, terminal-approval.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: make hook completely silent — remove all writeTty calls from core.ts

The hook must produce zero terminal output in the Claude terminal.
All writeTty status messages (shadow mode, cloud handshake failure,
waiting for approval, approved/denied via cloud) have been removed.
Also removed the now-unused chalk import and writeTty helper function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source allowlist, CSRF 403 tests, watch error handling

- daemon/index.ts: validate POST /decision source field against allowlist
  ('terminal' | 'browser' | 'native') — silently drop invalid values to
  prevent audit log injection
- daemon.integration.test.ts: add CSRF 403 test (missing token), CSRF 403
  test (wrong token), and invalid source value test — the three most
  important negative tests flagged by code review
- cli.ts: check result.error in node9 watch so ENOENT exits non-zero
  instead of silently exiting 0
- test helper: use fixed string 'echo register-label' instead of
  interpolated echo ${label} (shell injection hygiene in test code)
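
An allowlist check for the `source` field might look like the sketch below. The three allowed values are from the message above; the `sanitizeSource` helper is a hypothetical name.

```typescript
const VALID_SOURCES = ["terminal", "browser", "native"] as const;
type DecisionSource = (typeof VALID_SOURCES)[number];

// Returns the source when it is on the allowlist, otherwise undefined so
// that arbitrary client-supplied values never reach the audit log.
function sanitizeSource(value: unknown): DecisionSource | undefined {
  for (const source of VALID_SOURCES) {
    if (source === value) return source;
  }
  return undefined;
}
```

Strict equality against string literals means non-string inputs (null, numbers, objects) fall through to `undefined` without any extra type checks.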

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove stderr write from askNativePopup; drop sendDesktopNotification

- native.ts: process.stderr.write in askNativePopup caused Claude Code to
  treat the hook as an error and fail open — removed entirely
- core.ts: sendDesktopNotification called notify-send which routes through
  Firefox on Linux (D-Bus handler), causing spurious browser popups —
  removed the audit-mode notification call and unused import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass cwd through hook→daemon so project config controls browser open

Root cause: daemon called getConfig() without cwd, reading the global
config. If ~/.node9/node9.config.json doesn't exist, approvers default
to true — so browser:false in a project config was silently ignored,
causing the daemon to open Firefox on every pending approval.

Fix:
- cli.ts: pass cwd from hook payload into authorizeHeadless options
- core.ts: propagate cwd through _authorizeHeadlessCore → registerDaemonEntry
  → POST /check body; use getConfig(options.cwd) so project config is read
- daemon/index.ts: extract cwd from POST /check, call getConfig(cwd)
  for browserEnabled/terminalEnabled checks
- native.ts: remove process.stderr.write from askNativePopup (fail-open bug)
- core.ts: remove sendDesktopNotification (notify-send routes through Firefox
  on Linux via D-Bus, causing spurious browser notifications)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always broadcast 'add' when terminalEnabled — restore tail visibility

After the cwd fix, browserEnabled correctly became false when browser:false
is set in project config. But the broadcast condition gated on
hasInteractiveClient(), which returns false if tail isn't connected at the
exact moment the check arrives — silently dropping entries from tail.

Fix: broadcast whenever browserEnabled OR terminalEnabled, regardless of
client connection state. Tail sees pending entries via the SSE stream's
initial state when it connects, so timing of connection doesn't matter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: hybrid security model — local UI always wins the race

- Remove isRemoteLocked: terminal/browser/native racers always participate
  even when approvers.cloud is enabled; cloud is now audit-only unless it
  responds first (headless VM fallback)
- Add decisionSource field to AuthResult so resolveNode9SaaS can report
  which channel decided (native/terminal/browser) as decidedBy in the PATCH
- Fix resolveNode9SaaS: log errors to hook-debug.log instead of silent catch
- Fix tail [A]/[D] keypresses: switch from raw 'data' buffer to readline
  emitKeypressEvents + 'keypress' events — fixes unresponsive cards
- Fix tail card clear: SAVE/RESTORE cursor instead of fragile MOVE_UP(n)
- Add cancelActiveCard so 'remove' SSE event properly dismisses active card
- Fix daemon duplicate browser tab: browserOpened flag + NODE9_BROWSER_OPENED
  env so auto-started daemon and node9 tail don't both open a tab
- Fix slackDelegated: skip background authorizeHeadless to prevent duplicate
  cloud request that never resolves in Mission Control
- Add interactive field to SSE 'add' event so browser-only configs don't
  render a terminal card
- Add dev:tail script that parses JSON PID file correctly
- Add req.on('close') cleanup for abandoned long-poll entries
- Add regression tests for all three bugs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear CI env var in test to unblock native racer on GitHub Actions

Also make the poll fetch mock respond to AbortSignal so the cloud poll
racer exits cleanly when native wins, preventing test timeout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source injection tests, dev:tail safety, CI env guard

- Add explicit source boundary tests: null/number/object are all rejected
  by the VALID_SOURCES allowlist (implementation was already correct)
- Replace kill $(...) shell expansion in dev:tail with process.kill() inside
  Node.js, which removes the $() substitution vulnerability if the pid file were crafted
- Add afterEach safety net in core.test.ts to restore VITEST/CI/NODE_ENV
  in case the test crashes before the try/finally block restores them
- Increase slackDelegated timing wait from 200ms to 500ms for slower CI
- Fix section numbering gap: 10 → 11 was left after removing a test (now 10)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use early return in it.each — Vitest does not pass context to it.each callbacks

Context injection via { skip } works in plain it() but not in it.each(),
where the third argument is undefined. Switch to early return, which is
equivalent since the entire describe block skips when portWasFree is false.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): correct YAML indentation in ci.yml — job properties were siblings not children

name/runs-on/strategy/steps were indented 2 spaces (sibling to `test:`)
instead of 4 spaces (properties of the `test:` job). GitHub Actions was
ignoring the custom name template, so checks were reported without the
Node version suffix and the required branch-protection check
"CI / Test (ubuntu-latest, Node 20)" was stuck as "Expected" forever.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add 15s timeout to daemon beforeAll hooks — prevents CI timeout

waitForDaemon(6s) + readSseStream(3s) = 9s minimum; the default Vitest
hookTimeout of 10s is too tight on slow CI runners (Ubuntu, Windows).
All three daemon describe-block beforeAll hooks now declare an explicit
15_000ms timeout to give CI sufficient headroom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add daemonProc.kill() fallback in afterAll cleanup

If `node9 daemon stop` fails or times out, the spawned daemon process
would leak. Added daemonProc?.kill() as a defensive fallback after
spawnSync in all three daemon describe-block afterAll hooks.

The CSRF 403 tests (missing/wrong token) already exist at lines 574-598
and were flagged as absent only because the bot's diff was truncated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): address code review — vi.stubEnv, runtime shape checks, abandon-timer comment

- core.test.ts: replace manual env save/delete/restore with vi.stubEnv +
  vi.unstubAllEnvs() in afterEach. Eliminates the fragile try/finally and
  the risk of coercing undefined to the string "undefined". Adds a KEEP IN
  SYNC comment so future isTestEnv additions are caught immediately.

- daemon.integration.test.ts: replace unchecked `as { ... }` casts in
  idempotency tests with `unknown` + toMatchObject — gives a clear failure
  message if the response shape is wrong instead of silently passing.

- daemon.integration.test.ts: add comment explaining why idempotency tests
  do not need a /wait consumer — the abandon timer only fires when an SSE
  connection closes with pending items; no SSE client connects during
  these tests so entries are safe from eviction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): guard daemonProc.kill() with exitCode check — avoid spurious SIGTERM

Calling daemonProc.kill() unconditionally after a successful `daemon stop`
sends SIGTERM to an already-dead process, which can produce a spurious error
log on some platforms. Only kill if exitCode === null (process still running).
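
The guard can be sketched as a small helper. `ChildLike` is a hypothetical stand-in for the subset of Node's `ChildProcess` used here, and `killIfRunning` is an illustrative name.

```typescript
// Minimal stand-in for the ChildProcess fields the guard relies on.
interface ChildLike {
  exitCode: number | null;
  kill(): boolean;
}

// Sends the kill signal only when the process has not exited yet.
// exitCode stays null while the child is still running.
function killIfRunning(proc: ChildLike | undefined): boolean {
  if (proc && proc.exitCode === null) {
    return proc.kill();
  }
  return false;
}
```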

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add @vitest/coverage-v8 — baseline coverage report (PR #0)

Installs @vitest/coverage-v8 and configures coverage in vitest.config.mts.
Adds `npm run test:coverage` script.

Baseline (instrumentable files only — cli.ts and daemon/index.ts are
subprocess-only and cannot be instrumented by v8):

  Overall  67.68% stmts  58.74% branches
  core.ts  62.02% stmts  54.13% branches  ← primary refactor target
  undo.ts  87.01%        80.00%
  shields  97.46%        94.64%
  dlp.ts   94.82%        92.85%
  setup    93.67%        80.92%

This baseline will be used to verify coverage improves (or holds) after
each incremental refactor PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr1): extract src/audit/ and src/config/ from core.ts

Move audit helpers (redactSecrets, appendToLog, appendHookDebug,
appendLocalAudit, appendConfigAudit) to src/audit/index.ts and
move all config types, constants, and loading logic (Config,
SmartRule, DANGEROUS_WORDS, DEFAULT_CONFIG, getConfig, getCredentials,
getGlobalSettings, hasSlack, listCredentialProfiles) to
src/config/index.ts.

core.ts kept as barrel re-exporting from the new modules so all
existing importers (cli.ts, daemon/index.ts, tests) are unchanged.
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: remove trailing blank lines in core.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr2): extract src/policy/ and src/utils/regex from core.ts

Move the entire policy engine to src/policy/index.ts:
  evaluatePolicy, explainPolicy, shouldSnapshot, evaluateSmartConditions,
  checkDangerousSql, isIgnoredTool, matchesPattern and all private
  helpers (tokenize, getNestedValue, extractShellCommand, analyzeShellCommand).

Move ReDoS-safe regex utilities to src/utils/regex.ts:
  validateRegex, getCompiledRegex — no deps on config or policy,
  consumed by both policy/ and cli.ts via core.ts barrel.

core.ts is now ~300 lines (auth + daemon I/O only).
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: restore dev branch to push trigger

CI should run on direct pushes to dev (merge commits, dependency
bumps, etc.), not just on PRs. Flagged by two independent code
review passes on the coverage PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add timeouts to execSync and spawnSync to prevent CI hangs

- doctor command: add timeout:3000 to execSync('which node9') and
  execSync('git --version') — on slow CI machines these can block
  indefinitely and cause the 5000ms vitest test timeout to fire
- runDoctor test helper: add timeout:15000 to spawnSync so the subprocess
  has enough headroom on slow CI without hitting the vitest timeout
- removefrom test loop: increase spawnSync timeout 5000→15000 and add
  result.error assertion for better failure diagnostics on CI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract src/auth/ from core.ts

Split the authorization race engine out of core.ts into 4 focused modules:
- auth/state.ts  — pause, trust sessions, persistent decisions
- auth/daemon.ts — daemon PID check, entry registration, long-polling
- auth/cloud.ts  — SaaS handshake, poller, resolver, local-allow audit
- auth/orchestrator.ts — multi-channel race engine (authorizeHeadless)

core.ts is now a 40-line backwards-compat barrel. 509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — coverage thresholds + undici vuln

- vitest.config.mts: add coverage thresholds at current baseline (68%
  stmts, 58% branches, 66% funcs, 70% lines) so CI blocks regressions.
  Add json-summary reporter for CI integration. Exclude core.ts (barrel,
  no executable code) and ui/native.ts (OS UI, untestable in CI).
- package.json: pin undici to ^7.24.0 via overrides to resolve 6 high
  severity vulnerabilities in dev deps (@semantic-release, @actions).
  Remaining 7 vulns are in npm-bundled packages (not fixable without
  upgrading npm itself) and dev-only tooling (eslint, handlebars).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: enforce coverage thresholds in CI pipeline

Add coverage step to CI workflow that runs vitest --coverage on
ubuntu/Node 22 only (avoids matrix cost duplication). Thresholds
configured in vitest.config.mts will fail the build if coverage drops
below baseline, closing the gap flagged in code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: fix double test run — merge coverage into single test step

Replace the two-step (npm test + npm run test:coverage) pattern with a
single conditional: ubuntu/Node 22 runs test:coverage (enforces
thresholds), all other matrix cells run npm test. No behaviour change,
half the execution time on the primary matrix cell.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract proxy, negotiation, and duration from cli.ts

- src/proxy/index.ts — runProxy() MCP/JSON-RPC stdio interception
- src/policy/negotiation.ts — buildNegotiationMessage() AI block messages
- src/utils/duration.ts — parseDuration() human duration string parser
- cli.ts: 2088 → 1870 lines, now imports from focused modules

509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract shield, check, log commands into focused modules

Moves registerShieldCommand, registerConfigShowCommand, registerCheckCommand,
and registerLogCommand into src/cli/commands/. Extracts autoStartDaemonAndWait
and openBrowserLocal into src/cli/daemon-starter.ts.

cli.ts drops from ~1870 to ~1120 lines. Unused imports removed. Spawn
Windows regression test updated to cover the moved autoStartDaemonAndWait
call site in daemon-starter.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): use string comparison for matrix.node and document coverage intent

GHA expression matrix values are strings; matrix.node == 22 (integer) silently
fails, so coverage never ran on any cell. Fixed to matrix.node == '22'.

Added comments to ci.yml explaining the intentional single-cell threshold
enforcement (branch protection must require the ubuntu/Node 22 job), and
to vitest.config.mts explaining the baseline date and target trajectory.

Also confirmed: npm ls undici shows 7.24.6 everywhere — no conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): replace fragile matrix ternary with dedicated coverage job

Removes the matrix.node == '22' ternary from the test matrix. Coverage now
runs in a standalone 'coverage' job (ubuntu/Node 22 only) that can be
required by name in branch protection — no risk of the job name drifting
or the selector silently failing.

Also adds a comment to tsup.config.ts documenting why devDependency coverage
tooling (@vitest/coverage-v8, @rolldown/*) cannot leak into the production
bundle (tree-shaking — nothing in src/ imports them).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step and NODE9_TESTING=1 to coverage job; bump to v1.2.0

Coverage job was missing npm run build, causing integration tests to fail
with "dist/cli.js not found". Also adds NODE9_TESTING=1 env var to prevent
native popup dialogs and daemon auto-start during coverage runs in CI.

Version bumped to 1.2.0 to reflect the completed modular refactor
(core.ts + cli.ts split into focused single-responsibility modules).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add npm audit --omit=dev --audit-level=high to test job

Audits production deps on every CI run. Scoped to --omit=dev because
known CVEs in flatted (eslint chain) and handlebars (semantic-release chain)
are devDep-only and never ship in the production bundle. Production tree
currently shows 0 vulnerabilities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract doctor, audit, status, daemon, watch, undo commands

Moves 6 remaining large commands into src/cli/commands/:
  doctor.ts    — health check (165 lines, owns pass/fail/warn helpers)
  audit.ts     — audit log viewer with formatRelativeTime
  status.ts    — current mode/policy/pause display
  daemon-cmd.ts — daemon start/stop/openui/background/watch
  watch.ts     — watch mode subprocess runner
  undo.ts      — snapshot diff + revert UI

cli.ts: 1,141 → 582 lines. Unused imports (execSync, spawnSync, undo funcs,
getCredentials, DAEMON_PORT/HOST) removed. spawn-windows regression test
updated to cover the new module locations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(daemon): split 1026-line index.ts into state, server, barrel

daemon/state.ts  (360 lines) — all shared mutable state, types, utility
  functions, SSE broadcast, Flight Recorder Unix socket, and the
  abandonPending / hadBrowserClient / abandonTimer accessors needed to
  avoid direct ES module let-export mutation across file boundaries.

daemon/server.ts (668 lines) — startDaemon() HTTP server and all route
  handlers (/check, /wait, /decision, /events, /settings, /shields, etc.).
  Imports everything it needs from state.ts; no circular dependencies.

daemon/index.ts  (58 lines) — thin barrel: re-exports public API
  (startDaemon, stopDaemon, daemonStatus, DAEMON_PORT, DAEMON_HOST,
  DAEMON_PID_FILE, DECISIONS_FILE, AUDIT_LOG_FILE, hasInteractiveClient).

Also fixes two startup console.log → console.error (stdout must stay
clean for MCP/JSON-RPC per CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): handle ENOTEMPTY in cleanupDir on Windows CI

Windows creates system junctions (AppData\Local\Microsoft\Windows)
inside any directory set as USERPROFILE, making rmSync fail with
ENOTEMPTY even after recursive deletion. These junctions are harmless
to leak from a temp dir; treat them the same as EBUSY.
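
A minimal sketch of the tolerant cleanup described above. The helper name `cleanupDir` comes from the commit title; its signature, return value, and the `TOLERATED_CODES` set are assumptions made for illustration, not the project's actual code.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Error codes treated as a harmless leak from a temp dir (assumed set, per the commit text).
const TOLERATED_CODES = new Set(['EBUSY', 'ENOTEMPTY']);

// Hypothetical helper: best-effort recursive delete for test temp dirs.
// Returns true when the directory was removed, false when it was leaked.
function cleanupDir(dir: string): boolean {
  try {
    fs.rmSync(dir, { recursive: true, force: true });
    return true;
  } catch (err) {
    const code = (err as NodeJS.ErrnoException).code;
    if (code !== undefined && TOLERATED_CODES.has(code)) return false;
    throw err; // anything else is a real failure and should surface
  }
}

// Usage: create and remove a scratch directory.
const scratch = fs.mkdtempSync(path.join(os.tmpdir(), 'node9-sketch-'));
const removed = cleanupDir(scratch);
```

Returning a boolean instead of swallowing silently lets a caller log leaked directories if it wants to.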

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add NODE9_TESTING=1 to test job for consistency with coverage

Without it, spawned child processes in the test matrix could trigger
native popups or daemon auto-start. Matches the coverage job which
already set this env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): raise npm audit threshold to moderate

Node9 sits on the critical path of every agent tool call — a
moderate-severity prod vuln (e.g. regex DoS in a request parser)
is still exploitable in this context. 0 vulns at moderate level
confirmed before raising the bar.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add missing coverage for auth/state, timeout racer, and daemon unknown-ID

- auth-state.test.ts (new): 18 tests covering checkPause (all branches
  including expired file auto-delete and indefinite expiry), pauseNode9,
  resumeNode9, getActiveTrustSession (wildcard, prune, malformed JSON),
  writeTrustSession (create, replace, prune expired entries)
- core.test.ts: timeout racer test — approvalTimeoutMs:50 fires before any
  other channel, returns approved:false with blockedBy:'timeout'
- daemon.integration.test.ts: POST /decision with unknown UUID → 404
- vitest.config.mts: raise thresholds to match new baseline
  (statements 68→70, branches 58→60, functions 66→70, lines 70→71)

auth/state.ts coverage: 30% → 96% statements, 28% → 89% branches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin @vitest/coverage-v8 to exact version 4.1.2

RC transitive deps (@rolldown/binding-* at 1.0.0-rc.12) are pulled in
via coverage-v8. Pinning prevents silent drift to a newer RC that could
change instrumentation behaviour or introduce new RC-stage transitive deps.

Also verified: obug@2.1.1 is a legitimate MIT-licensed debug utility
from the @vitest/sxzz ecosystem — not a typosquat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add needs: [test] to coverage job

Prevents coverage from producing a misleading green check when the test
matrix fails. Coverage now only runs after all test jobs pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): raise Vitest timeouts for slow CI tests

- cloud denies: approvalTimeoutMs:3000 means the check process runs ~3s
  before the mock cloud responds; default 5s Vitest limit was too tight.
  Raised to 15s.
- doctor 'All checks passed': spawns a subprocess that runs `ss` for
  port detection — slow on CI runners. Raised to 20s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin vitest to 4.1.2 to match @vitest/coverage-v8

Both packages must stay in sync — a peer version mismatch causes silent
instrumentation failures. Pinning both to the same exact version prevents
drift when ^ would otherwise allow vitest to bump independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MCP gateway — transparent stdio proxy for any MCP server

Adds `node9 mcp-gateway --upstream <cmd>` which wraps any MCP server
as a transparent stdio proxy. Every tools/call is intercepted and run
through the full authorization engine (DLP, smart rules, shields,
human approval) before being forwarded to the upstream server.

Key implementation details:
- Deferred exit: authPending flag prevents process.exit() while auth
  is in flight, so blocked-tool responses are always flushed first
- Deferred stdin end: mirrors the same pattern for child.stdin.end()
  so approved messages are written before stdin is closed
- Approved writes happen inside the try block, before finally runs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address code review feedback

- Explicit ignoredTools in allowed-tool test (no implicit default dep)
- Assert result.status === 0 in all success-case tests (null = timeout)
- Throw result.error in runGateway helper so timeout-killed process fails
- Catch ENOTEMPTY in cleanupDir alongside EBUSY (Windows junctions)
- Document parseCommandString is shell-split only, not shell execution
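
The status and error checks above could look roughly like this. `SpawnResultLike` and `assertExitedCleanly` are hypothetical names; the sketch only assumes the `status`, `signal`, and `error` fields that `child_process.spawnSync` returns.

```typescript
// Minimal shape of a spawnSync result (only the fields used here).
interface SpawnResultLike {
  status: number | null;
  signal: string | null;
  error?: Error;
}

// Hypothetical guard for a test helper such as runGateway.
function assertExitedCleanly(result: SpawnResultLike): void {
  // A spawn failure or timeout kill is reported via result.error.
  if (result.error) throw result.error;
  // status === null means the process never produced an exit code.
  if (result.status === null) {
    throw new Error(`process was killed before exiting (signal: ${result.signal ?? 'unknown'})`);
  }
  if (result.status !== 0) {
    throw new Error(`process exited with status ${result.status}`);
  }
}
```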

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): id validation, resilience tests, review fixes

- Validate JSON-RPC id is string|number|null; return -32600 for object/array ids
- Add resilience tests: invalid upstream JSON forwarded as-is, upstream crash
- Fix runGateway() to accept optional upstreamScript param
- Add status assertions to all blocked-tool tests
- Document parseCommandString safety in mcp-gateway source
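
A sketch of the id check, assuming a helper named `validateRequestId` (hypothetical). JSON-RPC 2.0 defines -32600 as "Invalid Request"; treating an absent id as a notification is this sketch's reading of the spec, not confirmed gateway behaviour.

```typescript
type JsonRpcId = string | number | null;

interface JsonRpcErrorResponse {
  jsonrpc: '2.0';
  id: null;
  error: { code: number; message: string };
}

function isValidJsonRpcId(id: unknown): id is JsonRpcId {
  return id === null || typeof id === 'string' || typeof id === 'number';
}

// Returns null when the id is acceptable (or absent, i.e. a notification),
// otherwise an Invalid Request error. The error id is null because an
// object or array id cannot be echoed back meaningfully.
function validateRequestId(message: { id?: unknown }): JsonRpcErrorResponse | null {
  if (!('id' in message) || isValidJsonRpcId(message.id)) return null;
  return { jsonrpc: '2.0', id: null, error: { code: -32600, message: 'Invalid Request' } };
}
```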

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): fix shell tokenizer to handle quoted paths with spaces

Replace execa's parseCommandString (which did not handle shell quoting)
with a custom tokenizer that strips double-quotes and respects backslash
escapes. Adds 4 review-driven test improvements (mock upstream silently
drops notifications, runGateway guards killed-by-signal status, shadowed
variable renamed, DLP test builds credential at runtime) plus a test for
an upstream path with spaces.
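
One way such a tokenizer could work, as a sketch under stated assumptions: the function name `tokenizeCommand` is hypothetical, only double quotes and backslash escapes are handled (as the commit describes), and rejecting an unterminated quote is this sketch's choice rather than known gateway behaviour.

```typescript
// Splits a command string into argv-style tokens without invoking a shell.
function tokenizeCommand(command: string): string[] {
  const tokens: string[] = [];
  let current = '';
  let inQuotes = false;
  let hasToken = false; // distinguishes an empty quoted token ("") from no token

  for (let i = 0; i < command.length; i++) {
    const ch = command.charAt(i);
    if (ch === '\\' && i + 1 < command.length) {
      // Backslash escape: take the next character literally.
      current += command.charAt(++i);
      hasToken = true;
    } else if (ch === '"') {
      // Quotes group characters and are stripped from the output.
      inQuotes = !inQuotes;
      hasToken = true;
    } else if (!inQuotes && /\s/.test(ch)) {
      // Unquoted whitespace ends the current token.
      if (hasToken) {
        tokens.push(current);
        current = '';
        hasToken = false;
      }
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (inQuotes) throw new Error('Unterminated double quote in command string');
  if (hasToken) tokens.push(current);
  return tokens;
}
```

Because backslash is treated as an escape, a Windows-style path would need its backslashes doubled or forward slashes used instead.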

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #4 — hermetic env, null-status guard, new tests

- Filter all NODE9_* env vars in runGateway so local developer config
  (NODE9_MODE, NODE9_API_KEY, NODE9_PAUSED) cannot pollute test isolation
- Fix status===null guard: no longer requires result.signal to also be set;
  a killed/timed-out process always throws, preventing silent false passes
- Extract parseResponses() helper to eliminate repeated JSON.parse cast pattern
- Reduce default runGateway timeout from 8s → 5s for fast-path tests
- Add test: malformed (non-JSON) input is forwarded to upstream unchanged
- Add test: tools/call notification (no id) is forwarded without gateway response
- Add comments: mock upstream uses unbuffered stdout.write; /tmp/test.txt path
  is safe because mock never reads disk; NODE9_TESTING only disables UI approvers
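
The env filtering in the first bullet might be sketched as follows. `hermeticEnv` is a hypothetical name; the only behaviour taken from the commit is that every `NODE9_*` variable is dropped before the test sets its own.

```typescript
// Builds a child-process environment with all NODE9_* variables removed,
// then layers explicit test overrides on top.
function hermeticEnv(
  base: Record<string, string | undefined>,
  overrides: Record<string, string> = {}
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(base)) {
    if (value === undefined) continue;
    if (key.startsWith('NODE9_')) continue; // developer config must not leak into tests
    env[key] = value;
  }
  return { ...env, ...overrides };
}
```

A caller would typically pass `process.env` as `base` and `{ NODE9_TESTING: '1' }` as `overrides`.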

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #5 — isolation, typing, timeouts, README note

- afterAll: log cleanup failures to stderr instead of silently swallowing them
- runGateway: document PATH is safe (all spawns use absolute paths); expand
  NODE9_TESTING comment to reference exact source location of what it suppresses
- Replace /tmp/test.txt with /nonexistent/node9-test-only so intent is unambiguous
- Tighten blocked-tool test timeout: 5000ms → 2000ms (approvalTimeoutMs=100ms,
  so a hung auth engine now surfaces as a clear failure rather than a late pass)
- GatewayResponse.result: add explicit tools/ok fields so Array.isArray assertion
  has accurate static type information
- README: add note clarifying --upstream takes a single command string (tokenizer
  splits it); explain double-quoted paths for paths with spaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #6 — diagnostics, type safety, error handling

- Timeout error now includes partial stdout/stderr so hung gateway failures
  are diagnosable instead of silently discarding the output buffer
- Mock upstream catch block writes to stderr instead of empty catch {} so
  JSON-RPC parse errors surface in test output rather than causing a hang
- parseResponses wraps JSON.parse in try/catch and rethrows with the
  offending line, replacing cryptic map-thrown errors with useful context
- GatewayResponse.result: replace redundant Record<string,unknown> & {..…
node9ai added a commit that referenced this pull request Mar 30, 2026
* fix: address code review — Slack regex bound, remove redundant parser, notMatchesGlob consistency, applyUndo empty-set guard

- dlp: cap Slack token regex at {1,100} to prevent unbounded scan on crafted input
- core: remove 40-line manual paren/bracket parser from validateRegex — redundant
  with the final new RegExp() compile check, which catches the same errors more cleanly
- core: fix notMatchesGlob — absent field returns true (vacuously not matching),
  consistent with notContains; missing cond.value still fails closed
- undo: guard applyUndo against ls-tree failure returning empty set, which would
  cause every file in the working tree to be deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — compile-before-saferegex, Slack lower bound, ls-tree guard logging, missing tests

- core: move new RegExp() compile check BEFORE safe-regex2 so structurally invalid
  patterns (unbalanced parens/brackets) are rejected before reaching NFA analysis
- dlp: tighten Slack token lower bound from {1,100} to {20,100} to reduce false
  positives on truncated tokens
- undo: add NODE9_DEBUG log before early return in applyUndo ls-tree guard for
  observability into silent failures
- test(core): add 'structurally malformed patterns still rejected' regression test
  confirming compile-check order after manual parser removal
- test(core): add notMatchesGlob absent-field test with security comment documenting
  the vacuous-true behaviour and how to guard against it
- test(undo): add applyUndo ls-tree non-zero exit test confirming no files deleted
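
The compile-first ordering can be illustrated with a small sketch. The safety analyser is injected as a plain function (`isSafe`) so the example has no dependencies; in the real code that role is played by safe-regex2, and the return type here is an assumption.

```typescript
// Stand-in for a ReDoS analyser such as safe-regex2 (injected, hypothetical signature).
type SafetyCheck = (pattern: string) => boolean;

// Returns null when the pattern is accepted, otherwise a reason string.
function validateRegex(pattern: string, isSafe: SafetyCheck): string | null {
  // 1. Compile first: rejects unbalanced parens/brackets and other syntax errors.
  try {
    new RegExp(pattern);
  } catch (err) {
    return `invalid regex: ${(err as Error).message}`;
  }
  // 2. Only syntactically valid patterns reach the safety analysis.
  if (!isSafe(pattern)) return 'pattern rejected as potentially catastrophic (ReDoS)';
  return null;
}
```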

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): swap spawnResult args order — stdout first, status second

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in undo.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: revert notMatchesGlob to fail-closed, warn on ls-tree failure, document empty-stdout gap

- core: revert notMatchesGlob absent-field to fail-closed (false) — an attacker
  omitting a field must not satisfy a notMatchesGlob allow rule; rule authors
  needing pass-when-absent must pair with an explicit 'notExists' condition
- undo: log ls-tree failure unconditionally to stderr (not just NODE9_DEBUG) since
  this is an unexpected git error, not normal flow — silent false is undebuggable
- dlp: add comment on Slack token bound rationale (real tokens ~50–80 chars)
- test(core): fix notMatchesGlob fail-closed test — use delete_file (dangerous word)
  so the allow rule actually matters; write was allowed by default regardless
- test(undo): add test documenting the known gap where ls-tree exits 0 with empty
  stdout still produces an empty snapshotFiles set
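
A sketch of the fail-closed behaviour for `notMatchesGlob`. The argument shape and the tiny `*`-only glob matcher are assumptions for illustration; the project's matcher is presumably richer.

```typescript
// Minimal glob support: only '*' is a wildcard (assumption for this sketch).
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
  return new RegExp(`^${escaped}$`);
}

// True only when the field is present, is a string, and does NOT match the glob.
function notMatchesGlob(
  args: Record<string, unknown>,
  field: string,
  pattern: string | undefined
): boolean {
  if (pattern === undefined) return false; // missing condition value fails closed
  const value = args[field];
  if (typeof value !== 'string') return false; // absent field fails closed
  return !globToRegExp(pattern).test(value);
}
```

With this shape, omitting the field can never satisfy an allow rule that relies on `notMatchesGlob`.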

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(undo): guard against ls-tree status-0 empty-stdout mass-delete footgun

Add snapshotFiles.size === 0 check after the non-zero exit guard. When ls-tree
exits 0 but produces no output, snapshotFiles would be empty and every tracked
file in the working tree would be deleted. Abort and warn unconditionally instead.

Also convert the 'known gap' documentation test into a real regression test that
asserts false return and no unlinkSync calls.
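
The two guards could be sketched like this. `filesToDeleteOnUndo` is a hypothetical pure helper (the real `applyUndo` returns false and touches the filesystem); returning `null` to signal an abort is this sketch's convention.

```typescript
interface LsTreeResult {
  status: number | null;
  stdout: string;
}

// Returns the files created after the snapshot, or null when undo must abort.
function filesToDeleteOnUndo(
  lsTree: LsTreeResult,
  workingTreeFiles: readonly string[],
  warn: (message: string) => void = (m) => { process.stderr.write(`${m}\n`); }
): string[] | null {
  // Guard 1: git ls-tree failed outright.
  if (lsTree.status !== 0) {
    warn(`[Node9] undo aborted: git ls-tree exited with status ${lsTree.status}`);
    return null;
  }
  const snapshotFiles = new Set(
    lsTree.stdout.split('\n').map((line) => line.trim()).filter((line) => line !== '')
  );
  // Guard 2: exit 0 but no output. An empty set would mark every file for deletion.
  if (snapshotFiles.size === 0) {
    warn('[Node9] undo aborted: snapshot file list is empty');
    return null;
  }
  return workingTreeFiles.filter((file) => !snapshotFiles.has(file));
}
```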

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(undo): assert stderr warning in ls-tree failure tests

Add vi.spyOn(process.stderr, 'write') assertions to both new applyUndo tests
to verify the observability messages are actually emitted on failure paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: banner to stderr for MCP stdio compat; log command cwd handling and error visibility

Two bugs from issue #33:

1. runProxy banner went to stdout via console.log, corrupting the JSON-RPC stream
   for stdio-based MCP servers. Fixed: console.error so stdout stays clean.

2. 'node9 log' PostToolUse hook was silently swallowing all errors (catch {})
   and not changing to payload.cwd before getConfig() — unlike the 'check'
   command which does both. If getConfig() loaded the wrong project config,
   shouldSnapshot() could throw on a missing snapshot policy key, silently
   killing the audit.log write with no diagnostic output.
   Fixed: add cwd + _resetConfigCache() mirroring 'check'; surface errors to
   hook-debug.log when enableHookLogDebug is set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate process.chdir race condition in hook commands

Pass payload.cwd directly to getConfig(cwd?) instead of calling
process.chdir() which mutates process-global state and would race
with concurrent hook invocations.

- getConfig() gains optional cwd param: bypasses cache read/write
  when an explicit project dir is provided, so per-project config
  lookups don't pollute the ambient interactive-CLI cache
- check and log commands: remove process.chdir + _resetConfigCache
  blocks; pass payload.cwd directly to getConfig()
- log command catch block: remove getConfig() re-call (could re-throw
  if getConfig() was the original error source); use NODE9_DEBUG only
- Remove now-unused _resetConfigCache import from cli.ts
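
A sketch of the cache-bypass idea, assuming a simplified `Config` type and a stub loader (`loadConfigFrom`, hypothetical). Only the rule itself comes from the commit: an explicit `cwd` neither reads nor writes the ambient cache.

```typescript
interface Config {
  projectDir: string;
}

let cachedConfig: Config | null = null;

// Stub loader for illustration; the real one would read config files from disk.
function loadConfigFrom(dir: string): Config {
  return { projectDir: dir };
}

function getConfig(cwd?: string): Config {
  // Explicit project dir: load fresh, never touch the ambient cache.
  if (cwd) return loadConfigFrom(cwd);
  // Ambient (interactive CLI) path: load once, then reuse.
  if (cachedConfig === null) cachedConfig = loadConfigFrom(process.cwd());
  return cachedConfig;
}
```

Since nothing calls `process.chdir()`, concurrent hook invocations for different projects cannot observe each other's working directory.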

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always write LOG_ERROR to hook-debug.log; clarify ReDoS test intent

- log catch block: remove NODE9_DEBUG guard — this catch guards the
  audit trail so errors must always be written to hook-debug.log,
  not only when NODE9_DEBUG=1
- validateRegex test: rename and expand the safe-regex2 NFA test to
  explicitly assert that (a+)+ compiles successfully (passes the
  compile-first step) yet is still rejected by safe-regex2, confirming
  the reorder did not break ReDoS protection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(mcp): integration tests for #33 regression coverage

Add mcp.integration.test.ts with 4 tests covering both bugs from #33:

1. Proxy stdout cleanliness (2 tests):
   - banner goes to stderr; stdout contains only child process output
   - stdout stays valid JSON when child writes JSON-RPC — banner does not corrupt stream

2. Log command cross-cwd audit write (2 tests):
   - writes to audit.log when payload.cwd differs from process.cwd() (the actual #33 bug)
   - writes to audit.log when no cwd in payload (backward compat)

These tests would have caught both regressions at PR time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — cwd guard, test assertions, exit-0 comment

- getConfig(payload.cwd || undefined): use || instead of ?? to also
  guard against empty string "" which path.join would silently treat
  as relative-to-cwd (same behaviour as the fallback, but explicit)
- log catch block: add comment documenting the intentional exit(0)-on-
  audit-failure tradeoff — non-zero would incorrectly signal tool failure
  to Claude/Gemini since the tool already executed
- mcp.integration.test.ts: assert result.error and result.status on
  every spawnSync call so spawn failures surface loudly instead of
  silently matching stdout === '' checks
- mcp.integration.test.ts: add expect(result.stdout.trim()).toBeTruthy()
  before JSON.parse for clearer diagnostic on stdout-empty failures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add CLAUDE.md rules and pre-commit enforcement hook

CLAUDE.md: documents PR checklist, test rules, and code rules that
Claude Code reads automatically at the start of every session:
- PR checklist (tests, typecheck, format, no console.log in hooks)
- Integration test requirements for subprocess/stdio/filesystem code
- Architecture notes (getConfig(cwd?), audit trail, DLP, fail-closed)

.git/hooks/pre-commit: enforces the checklist on every commit:
- Blocks console.log in src/cli, src/core, src/daemon
- Runs npm run typecheck
- Runs npm run format:check
- Runs npm test when src/ implementation files are changed
- Emergency bypass: git commit --no-verify

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — test isolation, stderr on audit gap, nonexistent cwd

- mcp.integration.test.ts: replace module-scoped tempDirs with per-describe
  beforeEach/afterEach and try/finally — eliminates shared-array interleave
  risk if tests ever run with parallelism
- mcp.integration.test.ts: add test for nonexistent payload.cwd — verifies
  getConfig falls back to global config gracefully instead of throwing
- cli.ts log catch: emit [Node9] audit log error to stderr so audit gaps
  surface in the tool output stream without requiring hook-debug.log checks
- core.ts getConfig: add comment documenting intentional nonexistent-cwd
  fallback behavior (tryLoadConfig returns null → global config used)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: audit write before config load; validate cwd; test corrupt-config gap

Two blocking issues from review:

1. getConfig() was called BEFORE appendFileSync — a config load failure
   (corrupt JSON, permissions error) would throw and skip the audit write,
   reintroducing the original silent audit gap. Fixed by moving the audit
   write unconditionally before the config load.

2. payload.cwd was passed to getConfig() unsanitized — a crafted hook
   payload with a relative or traversal path could influence which
   node9.config.json gets loaded. Fixed with path.isAbsolute() guard;
   non-absolute cwd falls back to ambient process.cwd().

Also:
- Add integration test proving audit.log is written even when global
  config.json is corrupt JSON (regression test for the ordering fix)
- Add comment on echo tests noting Linux/macOS assumption
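
Both fixes can be sketched together. `trustedCwd` and `handleLogHook` are hypothetical names, and the audit entry shape is an assumption; the ordering (audit write first) and the `path.isAbsolute()` guard are what the commit describes.

```typescript
import * as path from 'node:path';

interface HookPayload {
  tool: string;
  cwd?: unknown;
}

// Only an absolute string is accepted; anything else falls back to the ambient cwd.
function trustedCwd(cwd: unknown): string | undefined {
  return typeof cwd === 'string' && path.isAbsolute(cwd) ? cwd : undefined;
}

function handleLogHook(
  payload: HookPayload,
  appendAudit: (line: string) => void,
  loadConfig: (cwd?: string) => unknown,
  warn: (message: string) => void
): void {
  // 1. Audit write comes first so a config failure cannot skip it.
  appendAudit(JSON.stringify({ ts: new Date().toISOString(), tool: payload.tool }));
  // 2. Config load may throw (corrupt JSON, permissions); the audit line is already written.
  try {
    loadConfig(trustedCwd(payload.cwd));
  } catch (err) {
    warn(`[Node9] config load failed: ${(err as Error).message}`);
  }
}
```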

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): add audit-write ordering and path validation rules

* test: skip echo proxy tests on Windows; clarify exit-0 contract

- itUnix = it.skipIf(process.platform === 'win32') applied to both proxy
  echo tests — Windows echo is a shell builtin and cannot be spawned
  directly, so these tests would fail with a spawn error instead of
  skipping cleanly
- corrupt-config test: add comment documenting that exit(0) is the
  correct exit code even on config error — the log command always exits 0
  so Claude/Gemini do not treat an already-completed tool call as failed;
  the audit write precedes getConfig() so it succeeds regardless

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: CODEOWNERS for CLAUDE.md; parseAuditLog helper; getConfig unit tests

- .github/CODEOWNERS: require @node9-ai/maintainers review on CLAUDE.md
  and security-critical source files — prevents untrusted PRs from
  silently weakening AI instruction rules or security invariants
- mcp.integration.test.ts: replace inline JSON.parse().map() with
  parseAuditLog() helper that throws a descriptive error when a log line
  is not valid JSON (e.g. a debug line or partial write), instead of an
  opaque SyntaxError with no context
- mcp.integration.test.ts: itUnix declaration moved after imports for
  correct ordering
- core.test.ts: add getConfig unit tests verifying that a nonexistent
  explicit cwd does not throw (tryLoadConfig fallback), and that
  getConfig(cwd) does not pollute the ambient no-arg cache
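
A sketch of what a `parseAuditLog()` helper along these lines might look like. The return type and error wording are assumptions; the point is that a bad line is reported with its position and content.

```typescript
// Parses a JSON-lines audit log, failing with context instead of a bare SyntaxError.
function parseAuditLog(contents: string): Record<string, unknown>[] {
  return contents
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line, index) => {
      try {
        return JSON.parse(line) as Record<string, unknown>;
      } catch {
        throw new Error(`audit.log entry ${index + 1} is not valid JSON: ${line}`);
      }
    });
}
```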

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): add npm run lint to PR checklist and pre-commit hook

Adds ESLint step to CLAUDE.md checklist and .git/hooks/pre-commit so
require()-style imports and other lint errors are caught before push.
Also fixes the require('path')/require('os') inline calls in core.test.ts
that triggered @typescript-eslint/no-require-imports in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: emit shields-status on SSE connect — dashboard no longer stuck on Loading

The shields-status event was only broadcast on toggle (POST /shields/toggle).
A freshly connected dashboard never received the current shields state and
displayed "Loading…" indefinitely.

Fix: send shields-status in the GET /events initial payload alongside init
and decisions, using the same payload shape as the toggle handler.
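
A sketch of the initial frames a newly connected client might receive. The event names `init`, `decisions`, and `shields-status` come from the commit text; the payload shapes and the `DaemonSnapshot` type are assumptions.

```typescript
// Formats one Server-Sent Events frame: event name, JSON data, blank-line terminator.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Hypothetical snapshot of daemon state at connect time.
interface DaemonSnapshot {
  pending: unknown[];
  decisions: Record<string, string>;
  shields: { name: string; active: boolean }[];
}

// Everything a fresh dashboard needs, sent once on GET /events.
function initialSseFrames(state: DaemonSnapshot): string {
  return [
    formatSseEvent('init', { requests: state.pending }),
    formatSseEvent('decisions', state.decisions),
    formatSseEvent('shields-status', { shields: state.shields }),
  ].join('');
}
```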

Regression test: daemon.integration.test.ts starts a real daemon with an
isolated HOME, connects to /events, and asserts shields-status is present
with the correct active state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — shared SSE snapshot, ctx.skip() for visible skips

- Capture SSE stream once in beforeAll and share across all three tests
  instead of opening 3 separate 1.5s connections (~4.5s → ~1.5s wall time)
- Replace early return with ctx.skip() so port-conflict skips are visible
  in the Vitest report rather than silently passing
- Add comment explaining why it.skipIf cannot be used here (condition
  depends on async beforeAll, evaluated after test collection)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — bump SSE timeout, guard payload undefined, structural shield check

- Bump readSseStream timeout 1500ms → 3000ms for slow CI headroom
- Assert payload defined before accessing .shields — gives a clear failure
  message if shields-status is absent rather than a TypeError on .shields
- Replace hardcoded postgres check with structural loop over all shields
  so the test survives adding or renaming shields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: log last waitForDaemon error to stderr for CI diagnostics

Silent catch{} meant a crashed daemon (e.g. EACCES on port) produced only
"did not start within 6s" with no hint of the root cause. Now the last
caught error is written to stderr so CI logs show the actual failure reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass flags through to wrapped command — prevent Commander from consuming -y, --config etc.

Commander parsed flags like -y and --config as node9 options and errored
with "unknown option" before the proxy action handler ran. This broke all
MCP server configurations that pass flags to the wrapped binary (npx -y,
binaries with --nexus-url, etc.).

Fix: before program.parse(), detect proxy mode (first arg is not a known
node9 subcommand and doesn't start with '-') and inject '--' into process.argv.
This causes Commander to stop option-parsing and pass everything — including
flags — through to the variadic [command...] action handler intact.

The user-visible '--' workaround still works and is now redundant but harmless.

Regression tests: two new itUnix cases in mcp.integration.test.ts verify
that -n is not consumed as a node9 flag, and that --version reaches the
wrapped command rather than printing node9's own version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: derive proxy subcommand set from program.commands; harden test assertions

- Replace hand-maintained KNOWN_SUBCOMMANDS allowlist with a set derived
  from program.commands.map(c => c.name()) — stays in sync automatically
  when new subcommands are added, eliminating the latent sync bug
- Remove fragile echo stdout assertion in flag pass-through test — echo -n
  and echo --version behaviour varies across platforms (GNU vs macOS);
  the regression being tested is node9's parser, not echo's output
- Add try/finally in daemon.integration.test.ts beforeAll so tmpHome is
  always cleaned up even if daemon startup throws

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard against double '--' injection; strengthen --version test assertion

- Skip '--' injection if process.argv[2] is already '--' to avoid
  producing ['--', '--', ...] when user explicitly passes the separator
- Add toBeTruthy() assertion on stdout in --version test so the check
  fails if echo exits non-zero with empty output rather than silently passing
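
The separator injection, including the double `--` guard, might be sketched as a pure function over `argv`. `injectProxySeparator` is a hypothetical name, and the subcommand set is passed in rather than derived from `program.commands` so the example stays dependency-free.

```typescript
// Returns a copy of argv with '--' inserted before the wrapped command when in proxy mode.
function injectProxySeparator(
  argv: readonly string[],
  knownSubcommands: ReadonlySet<string>
): string[] {
  const first = argv[2];
  const isProxyMode =
    first !== undefined &&
    first !== '--' &&                 // user already passed the separator
    !first.startsWith('-') &&         // a leading flag belongs to the CLI itself
    !knownSubcommands.has(first);     // a real subcommand is parsed normally
  if (!isProxyMode) return [...argv];
  // Everything after '--' is treated as positional by the option parser.
  return [...argv.slice(0, 2), '--', ...argv.slice(2)];
}
```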

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — alias gap comment, res error-after-destroy guard, echo comment

- cli.ts: document alias gap (no aliases currently, but note how to extend)
- daemon.integration.test.ts: settled flag prevents res 'error' firing reject
  after Promise already resolved via req.destroy() timeout path
- mcp.integration.test.ts: fix comment — /bin/echo handles --version, not GNU echo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent daemon crash on unhandled rejection — node9 tail disconnect with 2 agents

Two concurrent Claude instances fire overlapping hook calls. Any unhandled
rejection in the async request handler crashes the daemon (Node 15+ default),
which closes all SSE connections and exits node9 tail with "Daemon disconnected".

- Add process.on('unhandledRejection') so a single bad request never kills the daemon
- Wrap GET /settings and GET /slack-status getGlobalSettings() calls in try/catch
  (were the only routes missing error guards in the async handler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — return in catch blocks, log errors, guard unhandledRejection registration

- GET /settings and /slack-status catch blocks now return after writeHead(500)
  to prevent fall-through to subsequent route handlers (write-after-end risk)
- Log the actual error to stderr in both catch blocks — silent swallow is
  dangerous in a security daemon
- Guard unhandledRejection registration with listenerCount === 0 to prevent
  double-registration if startDaemon() is called more than once (tests/restarts)
- Move handler registration before server.listen() for clearer startup ordering

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): require manual diff review before every commit

Automated checks (lint, typecheck, tests) don't catch logical correctness
issues like missing return after res.end(), silent catch blocks, or
double event-listener registration. Explicitly require git diff review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — 500 responses, module-level rejection flag, override cli.ts exit handler

- Separate res.writeHead(500) and res.end() calls (non-idiomatic chaining)
- Add Content-Type: application/json and JSON body to 500 responses
- Replace listenerCount guard with module-level boolean flag (race-safe)
- Call process.removeAllListeners('unhandledRejection') before registering
  daemon handler — cli.ts registers a handler that calls process.exit(1),
  which was the actual crash source; this overrides it for the daemon process
- Document that critical approval path (POST /check) has its own try/catch
  and is not relying on this safety net
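
The module-level flag could be sketched as below. `installKeepAliveRejectionHandler` is a hypothetical name and the log format is an assumption; the sketch deliberately leaves out `removeAllListeners`.

```typescript
// Module-level flag: registration happens at most once per module instance.
let rejectionHandlerInstalled = false;

// Returns true if the handler was registered by this call, false if it already existed.
function installKeepAliveRejectionHandler(): boolean {
  if (rejectionHandlerInstalled) return false;
  rejectionHandlerInstalled = true;
  process.on('unhandledRejection', (reason: unknown) => {
    // Log and keep running: one bad request should not take the daemon down.
    const detail = reason instanceof Error ? (reason.stack ?? reason.message) : String(reason);
    process.stderr.write(`[Node9 daemon] unhandled rejection: ${detail}\n`);
  });
  return true;
}
```

This is a last-resort net only; a request that triggers it may still leave its HTTP connection without a response.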

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove removeAllListeners — use isDaemon guard in cli.ts handler instead

removeAllListeners('unhandledRejection') was a blunt instrument that could
strip handlers registered by third-party deps. The correct fix:
- cli.ts handler now returns early (no-op) when process.argv[2] === 'daemon',
  leaving the rejection to the daemon's own keep-alive handler
- daemon/index.ts no longer needs removeAllListeners
- daemon handler now logs stack trace so systematic failures are visible

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify unhandledRejection listener interaction — both handlers fire independently

The previous comment implied listener-chain semantics (one handler deferring
to the next). Node.js fires all registered listeners independently. The
isDaemon no-op return in cli.ts is what prevents process.exit(1), not any
chain mechanism. Clarify this so future maintainers don't break it by
restructuring the handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate unhandledRejection ordering dependency — skip cli.ts handler for daemon mode

Instead of relying on listener registration order (fragile), skip registering
the cli.ts exit-on-rejection handler entirely when process.argv[2] === 'daemon'.
The daemon's own keep-alive handler in startDaemon() is then the only handler
in the process — no ordering dependency, no removeAllListeners needed.
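
The guard can be sketched as a pure predicate evaluated at module load time. This is a minimal illustration, assuming the subcommand sits at process.argv[2]; `shouldRegisterExitHandler` is a hypothetical name, not one taken from cli.ts.

```typescript
// Hypothetical helper: decide whether the CLI's exit-on-rejection handler
// should be registered at all for this process.
function shouldRegisterExitHandler(argv: readonly string[]): boolean {
  // argv[0] is the node binary, argv[1] the script, argv[2] the subcommand.
  return argv[2] !== "daemon";
}

// Intended call site in a CLI entry point (not executed here):
//   if (shouldRegisterExitHandler(process.argv)) {
//     process.on("unhandledRejection", () => process.exit(1));
//   }
```

Because the handler is never registered in daemon mode, there is nothing to remove or reorder later.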

Also update stale comment in daemon/index.ts that still described the old
"we must replace the cli.ts handler" approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: address review comments — argv load-time note, hung-connection limit, stack trace caveat

- cli.ts: note that process.argv[2] check fires at module load time intentionally
- daemon/index.ts: document hung-connection limitation of last-resort rejection handler
- daemon/index.ts: note stack trace may include user input fragments (acceptable
  for localhost-only stderr logging)
- daemon/index.ts: clarify jest.resetModules() behavior with the module-level flag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Safe by Default — advisory SQL rules block destructive ops without config

Adds review-drop-table-sql, review-truncate-sql, and review-drop-column-sql
to ADVISORY_SMART_RULES so DROP TABLE, TRUNCATE TABLE, and DROP COLUMN in
the `sql` field are gated by human approval out-of-the-box, with no shield
or config required. The postgres shield correctly upgrades these from review
→ block since shield rules are inserted before advisory rules in getConfig().

Includes 7 new tests: 4 verifying advisory review fires with no config, 3
verifying the postgres shield overrides to block.
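
The precedence described above follows from first-match evaluation over an ordered rule list. The sketch below is illustrative only: the `Rule` shape, the regex patterns, and the `evaluate` function are assumptions, and only the rule names come from this commit series.

```typescript
type Verdict = "allow" | "review" | "block";
interface Rule { name: string; pattern: RegExp; verdict: Verdict; }

// Hypothetical rule lists; the patterns are illustrative, not node9's own.
const advisoryRules: Rule[] = [
  { name: "review-drop-table-sql", pattern: /\bDROP\s+TABLE\b/i, verdict: "review" },
  { name: "review-truncate-sql", pattern: /\bTRUNCATE\s+TABLE\b/i, verdict: "review" },
];
const postgresShieldRules: Rule[] = [
  { name: "shield:postgres:block-drop-table", pattern: /\bDROP\s+TABLE\b/i, verdict: "block" },
];

// First matching rule wins, so rules placed earlier take precedence.
function evaluate(sql: string, rules: Rule[]): Verdict {
  const hit = rules.find((r) => r.pattern.test(sql));
  return hit ? hit.verdict : "allow";
}

const withoutShield = evaluate("drop table users", advisoryRules);
const withShield = evaluate("drop table users", [...postgresShieldRules, ...advisoryRules]);
```

With no shield the statement is routed to review; with shield rules placed first, the same statement is blocked.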

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: shield set/unset — per-rule verdict overrides + config show

- `node9 shield set <shield> <rule> <verdict>` — override any shield rule's
  verdict without touching config.json. Stored in shields.json under an
  `overrides` key, applied at runtime in getConfig(). Accepts full rule
  name, short name, or operation name (e.g. "drop-table" resolves to
  "shield:postgres:block-drop-table").

- `node9 shield unset <shield> <rule>` — remove an override, restoring
  the shield default.

- `node9 shield status` — now shows each rule's verdict individually,
  with override annotations ("← overridden (was: block)").

- `node9 config show` — new command: full effective runtime config
  including active shields with per-rule verdicts, built-in rules,
  advisory rules, and dangerous words.
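
The three accepted name forms could be resolved roughly as follows. This is a sketch under assumptions: the lookup order, the `shield:<shield>:` prefix convention, and the `resolveRuleName` helper are inferred from the "drop-table" example above, not copied from the implementation.

```typescript
// Hypothetical resolver: full name, then short name, then operation suffix.
function resolveRuleName(
  shield: string,
  input: string,
  ruleNames: readonly string[],
): string | undefined {
  const prefix = `shield:${shield}:`;
  // 1. Full rule name, e.g. "shield:postgres:block-drop-table".
  if (ruleNames.includes(input)) return input;
  // 2. Short name, e.g. "block-drop-table".
  const byShortName = ruleNames.find((n) => n === prefix + input);
  if (byShortName) return byShortName;
  // 3. Operation suffix, e.g. "drop-table"; the first match wins.
  return ruleNames.find((n) => n.startsWith(prefix) && n.endsWith(`-${input}`));
}

const catalog = ["shield:postgres:block-drop-table", "shield:postgres:block-truncate"];
const resolved = resolveRuleName("postgres", "drop-table", catalog);
```

Because step 3 returns the first match, two rules sharing an operation suffix would resolve to whichever appears first in the catalog.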

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — allow verdict guard, null assertion, test reliability

- shield set allow now requires --force to prevent silent rule silencing;
  exits 1 with a clear warning and the exact re-run command otherwise
- Remove getShield(name)! non-null assertion in error branch
- Fix mockReturnValue → mockReturnValueOnce to prevent test state leak
- Add missing tests: shield set allow guard (integration), unset no-op,
  mixed-case SQL matching (DROP table, drop TABLE, TRUNCATE table)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — shield override security hardening

- Add isShieldVerdict() type guard; replace manual triple-comparison in
  CLI set command and remove unsafe `verdict as ShieldVerdict` cast
- Add validateOverrides() to sanitize shields.json on read — tampered
  disk content with non-ShieldVerdict values is silently dropped before
  reaching the policy engine
- Fix clearShieldOverride() to be a true no-op (skip disk write) when
  the rule has no existing override
- Add comment to resolveShieldRule() documenting first-match behavior
  for operation-suffix lookup to warn against future naming conflicts
- Tests: fix no-op assertion (assert not written), add isShieldVerdict
  suite, add schema validation tests for tampered overrides, add
  authorizeHeadless test for shield-overridden allow verdict
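
A type guard plus a sanitizing reader might look like the sketch below. The verdict set (allow, review, block) and the flat rule-to-verdict shape are assumptions made for illustration; the real shields.json layout may be nested differently.

```typescript
// Assumed verdict set; these commits mention review, block, and allow.
const SHIELD_VERDICTS = ["allow", "review", "block"] as const;
type ShieldVerdict = (typeof SHIELD_VERDICTS)[number];

function isShieldVerdict(value: unknown): value is ShieldVerdict {
  return typeof value === "string" && (SHIELD_VERDICTS as readonly string[]).includes(value);
}

// Hypothetical shape: rule name -> verdict, as parsed from untrusted JSON.
function validateOverrides(raw: unknown): Record<string, ShieldVerdict> {
  const clean: Record<string, ShieldVerdict> = {};
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) return clean;
  for (const [rule, verdict] of Object.entries(raw)) {
    // Entries with anything other than a known verdict are dropped.
    if (isShieldVerdict(verdict)) clean[rule] = verdict;
  }
  return clean;
}

const sanitized = validateOverrides({ a: "block", b: "nuke", c: 42 });
```

The guard narrows `unknown` to `ShieldVerdict`, so no `as ShieldVerdict` cast is needed at the call site.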

Note: issue #5 (shield status stdout vs stderr) cannot be fixed here —
the pre-commit hook enforces no new console.log in cli.ts to keep stdout
clean for the JSON-RPC/MCP hook code paths in the same file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — audit trail, tamper warning, trust boundary

- Export appendConfigAudit() from core.ts; call it from CLI when an allow
  override is written with --force so silenced rules appear in audit.log
- validateOverrides() now emits a stderr warning (with shield/rule detail)
  when an invalid verdict is dropped, making tampering visible to the user
- Add JSDoc to writeShieldOverride() documenting the trust boundary: it is
  a raw storage primitive with no allow guard; callers outside the CLI must
  validate rule names via resolveShieldRule() first; daemon does not expose
  this endpoint
- Tests: add stderr-warning test for tampered verdicts; add cache-
  invalidation test verifying _resetConfigCache() causes allow overrides
  to be re-read from disk (mock) on the next evaluatePolicy() call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: close remaining review gaps — first-match, allow-no-guard, TOCTOU

- Issue 5: add test proving resolveShieldRule first-match-wins behavior
  when two rules share an operation suffix; uses a temporary SHIELDS
  mutation (restored in finally) to simulate the ambiguous catalog case
- Issue 6: add explicit test documenting that writeShieldOverride accepts
  allow verdict without any guard — storage primitive contract, CLI is
  the gatekeeper
- Issue 8: add TOCTOU characterization test showing that concurrent
  writeShieldOverride calls with a stale read lose the first write; makes
  the known file-lock limitation explicit and regression-testable
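
The lost-update pattern that the TOCTOU test characterizes can be reproduced with an in-memory stand-in for the file. All names here are hypothetical; the point is only the read, read, write, write interleaving.

```typescript
type Overrides = Record<string, string>;

// In-memory stand-in for shields.json; no locking, like a plain file.
let disk: Overrides = {};
const readOverrides = (): Overrides => ({ ...disk });
const writeOverrides = (next: Overrides): void => {
  disk = { ...next };
};

function simulateLostUpdate(): Overrides {
  disk = {};
  // Both writers read before either one writes, so both hold a stale copy.
  const seenByA = readOverrides();
  const seenByB = readOverrides();
  writeOverrides({ ...seenByA, "rule-a": "review" });
  writeOverrides({ ...seenByB, "rule-b": "block" }); // clobbers A's write
  return readOverrides();
}

const finalState = simulateLostUpdate();
```

Only the second writer's entry survives, which is the behaviour a file lock or atomic read-modify-write would prevent.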

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: spawn daemon via process.execPath to fix ENOENT on Windows (#41)

spawn('node9', ...) fails on Windows because npm installs a .cmd shim,
not a bare executable. Node.js child_process.spawn without { shell: true }
cannot resolve .cmd/.ps1 wrappers.

Replace all three bare spawn('node9', ['daemon'], ...) call sites in
cli.ts with spawn(process.execPath, [process.argv[1], 'daemon'], ...),
consistent with the pattern already used in src/tui/tail.ts:
  - autoStartDaemonAndWait()
  - daemon --openui handler
  - daemon --background handler
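
As a rough sketch of the replacement pattern, the command and arguments can be built from the running interpreter and script path. `daemonSpawnTarget` is a hypothetical helper, and the spawn options in the comment are illustrative rather than the ones cli.ts uses.

```typescript
// Hypothetical helper returning the command and args for child_process.spawn.
function daemonSpawnTarget(execPath: string, scriptPath: string): [string, string[]] {
  // Run the current Node binary against the current CLI script instead of
  // relying on a `node9` shim being resolvable as an executable.
  return [execPath, [scriptPath, "daemon"]];
}

// Intended call site (not executed here):
//   const [cmd, args] = daemonSpawnTarget(process.execPath, process.argv[1]);
//   spawn(cmd, args, { detached: true, stdio: "ignore" }).unref();
```

Since the command is an absolute path to a real executable, no shell or .cmd resolution is involved on any platform.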

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(ci): regression guard + Windows CI for spawn fix (#41)

- Add spawn-windows.test.ts: two static source-guard tests that read
  cli.ts and assert (a) no bare spawn('node9'...) pattern exists and
  (b) exactly 3 spawn(process.execPath, ...) daemon call sites exist.
  Prevents the ENOENT regression from silently reappearing.

- Add .github/workflows/ci.yml: runs typecheck, lint, and npm test on
  both ubuntu-latest and windows-latest on every push/PR to main and dev.
  The Windows runner will catch any spawn('node9'...) regression
  immediately since it would throw ENOENT in integration tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step before tests — integration tests require dist/cli.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): remove NODE_ENV=test prefix from npm scripts — Windows compat

'NODE_ENV=test cmd' syntax is Unix-only and fails on Windows with
'not recognized as an internal or external command'.

Vitest sets NODE_ENV=test automatically when it is not already set
(and also sets process.env.VITEST), making the prefix redundant. Remove it from
test, test:watch, and test:ui scripts so they work on all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): use cross-platform path assertions in undo.test.ts

Replace hardcoded Unix path separators with path.join() and regex
/[/\\]\.git[/\\]/ so assertions pass on Windows CI.
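
A quick check of the separator-agnostic regex, using made-up sample paths:

```typescript
// Matches a ".git" path segment with either separator style.
const GIT_SEGMENT = /[/\\]\.git[/\\]/;

const unixPath = "/repo/.git/index";
const windowsPath = "C:\\repo\\.git\\index";
const unrelated = "/repo/src/git.ts";

// Evaluates to [true, true, false].
const results = [unixPath, windowsPath, unrelated].map((p) => GIT_SEGMENT.test(p));
```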

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): cross-platform path and HOME fixes for Windows CI

setup.test.ts: replace hardcoded /mock/home/... constants with
path.join(os.homedir(), ...) so path comparisons match on Windows.
doctor.test.ts: set USERPROFILE=homeDir alongside HOME so
os.homedir() resolves the isolated test directory on Windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): Windows HOME/USERPROFILE and EBUSY fixes

mcp.integration.test.ts: add makeEnv() helper that sets both HOME
and USERPROFILE so spawned node9 processes resolve os.homedir() to
the isolated test directory on Windows. Add EBUSY guard in cleanupDir
for Windows temp file locking after spawnSync.

protect.test.ts: use path.join(os.homedir(), ...) for mock paths in
setPersistentDecision so existsSpy matches on Windows backslash paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): propagate HOME as USERPROFILE in check integration tests

runCheck/runCheckAsync now set USERPROFILE=HOME so spawned node9
processes resolve os.homedir() to the isolated test directory on
Windows. Apply the same fix to standalone spawnSync calls using
minimalEnv. Add EBUSY guard in cleanupHome for Windows temp locking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests,dlp): four Windows CI fixes

mcp.integration.test.ts: use list_directory instead of write_file for
the no-cwd backward-compat test — write_file triggers git add -A on
os.tmpdir() which can index thousands of files on Windows and ETIMEDOUT.

gemini_integration.test.ts: add path import; replace hardcoded
/mock/home/... paths with path.join(os.homedir(), ...) so existsSpy
matches on Windows backslash paths.

daemon.integration.test.ts: add USERPROFILE=tmpHome to daemon spawn
env so os.homedir() resolves to the isolated shields.json. Add EBUSY
guard in cleanupDir.

dlp.ts: broaden /etc/passwd|shadow|sudoers patterns to
^(?:[a-zA-Z]:)?\/etc\/... so they match Windows-normalized paths like
C:/etc/passwd in addition to Unix /etc/passwd.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): address code review findings

ci.yml: add format:check step and Node 22 to matrix (package.json
declares >=18 — both LTS versions should be covered).

check/mcp/daemon integration tests: add makeEnv() helpers for
consistent HOME+USERPROFILE isolation; add console.warn on EBUSY
so temp dir leaks are visible rather than silent.
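
A makeEnv() helper of this kind might be shaped as follows. The signature is an assumption: the base environment is passed in explicitly to keep the sketch pure, whereas a real test helper would likely start from process.env.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical signature; not copied from the test files.
function makeEnv(home: string, extra: Env = {}, base: Env = {}): Env {
  return {
    ...base,
    HOME: home,        // consulted by os.homedir() on POSIX
    USERPROFILE: home, // consulted by os.homedir() on Windows
    ...extra,
  };
}

const env = makeEnv("/tmp/test-home", { NODE9_TESTING: "1" }, { PATH: "/usr/bin", HOME: "/home/real" });
```

Setting both variables means a spawned process resolves the same isolated home directory regardless of platform.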

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): enforce LF line endings so Prettier passes on Windows

Add endOfLine: lf to .prettierrc so Prettier always checks/writes LF
regardless of OS. Add .gitattributes with eol=lf so Git does not
convert line endings on Windows checkout. Without these, format:check
fails on every file on Windows CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): align makeEnv signatures and add dist verification

check.integration.test.ts: makeEnv now spreads process.env (same as
mcp and daemon helpers) so PATH, NODE_ENV=test (set by Vitest), and
other inherited vars reach spawned child processes. Standalone
spawnSync calls simplified to makeEnv(tmpHome, {NODE9_TESTING:'1'}).
Remove unused minimalEnv from shield describe block.

ci.yml: add Verify dist artifacts step after build to fail fast with
a clear message if dist/cli.js or dist/index.js are missing. Add
comment explaining NODE_ENV=test / NODE9_TESTING guard coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: interactive terminal approval via /dev/tty (SSE + [A]/[D])

Replaces the broken @inquirer/prompts stdin racer with a /dev/tty-based
approval prompt that works as a Claude Code PreToolUse subprocess:

- New src/ui/terminal-approval.ts: opens /dev/tty for raw keypress I/O,
  acquires CSRF token from daemon SSE, renders ANSI approval card, reads
  [A]/[D], posts decision via POST /decision/{id}. Handles abort (another
  racer won) with cursor/card cleanup and SIGTERM/exit guard.

- Daemon entry shared between browser (GET /wait) and terminal (POST /decision)
  racers: extract registerDaemonEntry() + waitForDaemonDecision() from the
  old askDaemon() so both racers operate on the same pending entry ID.

- POST /decision idempotency: first write wins; second call returns 409
  with the existing decision. Prevents race between browser and terminal
  racers from corrupting state.

- CSRF token emitted on every SSE connection (re-emit existing token, never
  regenerate). Terminal racer acquires it by opening /events and reading
  the first csrf event.

- Add approvalTimeoutSeconds as a user-facing config alias (converted to ms);
  raise the default timeout from 30s to 120s. Daemon auto-deny timer and
  browser countdown now use the config value instead of a hardcoded constant.

- isTTYAvailable() probe: tries /dev/tty open(); disabled on Windows
  (native popup racer covers that path). NODE9_FORCE_TERMINAL_APPROVAL=1
  bypasses the probe for tmux/screen users.

- Integration tests: CSRF re-emit across two connections, POST /decision
  idempotency (both allow-first and deny-first cases).
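
The first-write-wins rule for POST /decision can be modelled as a small pure function. This is a sketch, not the daemon's handler: `recordDecision`, the entry shape, and the 200 success status are assumptions; only the 409-with-existing-decision behaviour comes from the description above.

```typescript
type Decision = "allow" | "deny";
interface PendingEntry { id: string; decision?: Decision; }
interface DecisionResult { status: 200 | 409; decision: Decision; }

// First write wins: a later call never overwrites the stored decision.
function recordDecision(entry: PendingEntry, incoming: Decision): DecisionResult {
  if (entry.decision !== undefined) {
    return { status: 409, decision: entry.decision };
  }
  entry.decision = incoming;
  return { status: 200, decision: incoming };
}

const entry: PendingEntry = { id: "req-1" };
const first = recordDecision(entry, "deny");   // e.g. terminal racer
const second = recordDecision(entry, "allow"); // e.g. browser racer, arriving late
```

The late caller learns the winning decision from the 409 body, so both racers converge on the same outcome.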

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Smart Router — node9 tail as interactive approval terminal

Implements a multi-phase Smart Router architecture so `node9 tail` can
serve as a full approval channel alongside the browser dashboard and
native popup.

Phase 1 — Daemon capability tracking (daemon/index.ts):
- SseClient interface tracks { res, capabilities[] } per SSE connection
- /events parses ?capabilities=input from URL; stored on each client
- broadcast() updated to use client.res.write()
- hasInteractiveClient() exported — true when any tail session is live
- broadcast('add') now fires when terminal approver is enabled and an
  interactive client is connected, not only when browser is enabled
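
Capability tracking along these lines could be sketched as below. The comma-separated format and both function names are assumptions; the commit only shows `?capabilities=input`, and `anyInteractive` is a simplified stand-in for hasInteractiveClient().

```typescript
// Hypothetical parser for the /events query string.
function parseCapabilities(requestUrl: string): string[] {
  // A base is needed because HTTP request URLs in Node are path-only.
  const url = new URL(requestUrl, "http://localhost");
  const raw = url.searchParams.get("capabilities") ?? "";
  return raw
    .split(",")
    .map((c) => c.trim())
    .filter((c) => c.length > 0);
}

interface SseClientInfo { capabilities: string[]; }

// True when any connected client declared the "input" capability.
function anyInteractive(clients: readonly SseClientInfo[]): boolean {
  return clients.some((c) => c.capabilities.includes("input"));
}

const tailClient = { capabilities: parseCapabilities("/events?capabilities=input") };
const browserClient = { capabilities: parseCapabilities("/events") };
```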

Phase 2 — Interactive approvals in tail (tui/tail.ts):
- Connects with ?capabilities=input so daemon identifies it as interactive
- Captures CSRF token from the 'csrf' SSE event
- Handles init.requests (approvals pending before tail connected)
- Handles add/remove SSE events; maintains an approval queue
- Shows one ANSI card at a time ([A] Allow / [D] Deny) using
  tty.ReadStream raw-mode keypress on fd 0
- POSTs decisions via /decision/{id} with source:'terminal'; 409 is non-error
- Cards clear themselves; next queued request shown automatically

Phase 3 — Racer 3 widened (core.ts):
- Racer 3 guard changed from approvers.browser to
  (approvers.browser || approvers.terminal) so tail participates in the
  race via the same waitForDaemonDecision mechanism as the browser
- Guidance printed to stderr when browser is off:
  "Run `node9 tail` in another terminal to approve."

Phase 4 — node9 watch command (cli.ts):
- New `watch <command> [args...]` starts daemon in NODE9_WATCH_MODE=1
  (no idle timeout), prints a tip about node9 tail, then runs the
  wrapped command via spawnSync

Decision source tracking (all layers):
- POST /decision now accepts optional source field ('browser'|'terminal')
- Daemon stores decisionSource on PendingEntry; GET /wait returns it
- waitForDaemonDecision returns { decision, source }
- Racer 3 label uses actual source instead of guessing from config:
  "User Decision (Terminal (node9 tail))" vs "User Decision (Browser Dashboard)"
- Browser UI sends source:'browser'; tail sends source:'terminal'

Tests:
- daemon.integration.test.ts: 3 new tests for source tracking round-trip
  (terminal, browser, and omitted source)
- spawn-windows.test.ts: updated count from 3 to 4 spawn call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: enable /dev/tty approval card in Claude terminal (hook path)

The check command was passing allowTerminalFallback=false to
authorizeHeadless, which disabled Racer 4 (/dev/tty) in the hook path.
This meant the approval card only appeared in the node9 tail terminal,
requiring the user to switch focus to respond.

Change both call sites (initial + daemon-retry) to true so Racer 4 runs
alongside Racer 3. The [A]/[D] card now appears in the Claude terminal
as well — the user can respond from either terminal, whichever has focus.
The 409 idempotency already handles the race correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent background authorizeHeadless from overwriting POST /decision

When POST /decision arrives before GET /wait connects, it sets
earlyDecision on the PendingEntry. The background authorizeHeadless
call (which runs concurrently) could then overwrite that decision in
its .then() handler — visible as the idempotency test getting
'allow' back instead of the posted 'deny'.

Guard: after noApprovalMechanism check, return early if earlyDecision
is already set. First write wins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route sendBlock terminal output to /dev/tty instead of stderr

Claude Code treats any stderr output from a PreToolUse hook as a hook
error and fails open — the tool proceeds even when the hook writes a
valid permissionDecision:deny JSON to stdout. This meant git push and
other blocked commands were silently allowed through.

Fix: replace all console.error calls in the block/deny path with
writes to /dev/tty, an out-of-band channel that bypasses Claude Code's
stderr pipe monitoring. /dev/tty failures are caught silently so CI
and non-interactive environments are unaffected.

Add a writeTty() helper in core.ts used for all status messages in
the hook execution path (cloud error, waiting-for-approval banners,
cloud result). Update two integration tests that previously asserted
block messages appeared on stderr — they now assert stderr is empty,
which is the regression guard for this bug.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: don't auto-resolve daemon entries in audit mode

In audit mode, background authorizeHeadless resolves immediately with
checkedBy:'audit'. The .then() handler was setting earlyDecision='allow'
before POST /decision could arrive from browser/tail, causing subsequent
POST /decision calls to get 409 and GET /wait to return 'allow' regardless
of what the user posted.

Audit mode means the hook auto-approves — it doesn't mean the daemon
dashboard should also auto-resolve. Leave the entry alive so browser/tail
can still interact with it (or the auto-deny timer fires).

Fixes source-tracking integration test failures on CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: close /dev/tty fd in finally block to prevent leak on write error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove Racer 4 (/dev/tty card in Claude terminal)

Racer 4 interrupted the AI's own terminal with an approval prompt,
which is wrong on multiple levels:
- The AI terminal belongs to the AI agent, not the human approver
- Different AI clients (Gemini CLI, Cursor, etc.) handle terminals
  differently — /dev/tty tricks are fragile across environments
- It created duplicate prompts when node9 tail was also running

Approval channels should all be out-of-band from the AI terminal:
  1. Cloud/SaaS (Slack, mission control)
  2. Native OS popup
  3. Browser dashboard
  4. node9 tail (dedicated approval terminal)

Remove: Racer 4 block in core.ts, allowTerminalFallback parameter
from authorizeHeadless/_authorizeHeadlessCore and all callers,
isTTYAvailable/askTerminalApproval imports, terminal-approval.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: make hook completely silent — remove all writeTty calls from core.ts

The hook must produce zero terminal output in the Claude terminal.
All writeTty status messages (shadow mode, cloud handshake failure,
waiting for approval, approved/denied via cloud) have been removed.
Also removed the now-unused chalk import and writeTty helper function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source allowlist, CSRF 403 tests, watch error handling

- daemon/index.ts: validate POST /decision source field against allowlist
  ('terminal' | 'browser' | 'native') — silently drop invalid values to
  prevent audit log injection
- daemon.integration.test.ts: add CSRF 403 test (missing token), CSRF 403
  test (wrong token), and invalid source value test — the three most
  important negative tests flagged by code review
- cli.ts: check result.error in node9 watch so ENOENT exits non-zero
  instead of silently exiting 0
- test helper: use fixed string 'echo register-label' instead of
  interpolated echo ${label} (shell injection hygiene in test code)
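
An allowlist check for the source field might look like this. The three values come from the commit text; `sanitizeSource` is a hypothetical name and the real handler may be structured differently.

```typescript
const VALID_SOURCES = ["terminal", "browser", "native"] as const;
type DecisionSource = (typeof VALID_SOURCES)[number];

// Returns the source only when it is an allowlisted string; anything else
// (null, numbers, objects, unknown strings) is dropped as undefined.
function sanitizeSource(value: unknown): DecisionSource | undefined {
  return typeof value === "string" && (VALID_SOURCES as readonly string[]).includes(value)
    ? (value as DecisionSource)
    : undefined;
}
```

Dropping the value instead of rejecting the request keeps the decision itself valid while keeping untrusted text out of the audit log.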

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove stderr write from askNativePopup; drop sendDesktopNotification

- native.ts: process.stderr.write in askNativePopup caused Claude Code to
  treat the hook as an error and fail open — removed entirely
- core.ts: sendDesktopNotification called notify-send which routes through
  Firefox on Linux (D-Bus handler), causing spurious browser popups —
  removed the audit-mode notification call and unused import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass cwd through hook→daemon so project config controls browser open

Root cause: daemon called getConfig() without cwd, reading the global
config. If ~/.node9/node9.config.json doesn't exist, approvers default
to true — so browser:false in a project config was silently ignored,
causing the daemon to open Firefox on every pending approval.

Fix:
- cli.ts: pass cwd from hook payload into authorizeHeadless options
- core.ts: propagate cwd through _authorizeHeadlessCore → registerDaemonEntry
  → POST /check body; use getConfig(options.cwd) so project config is read
- daemon/index.ts: extract cwd from POST /check, call getConfig(cwd)
  for browserEnabled/terminalEnabled checks
- native.ts: remove process.stderr.write from askNativePopup (fail-open bug)
- core.ts: remove sendDesktopNotification (notify-send routes through Firefox
  on Linux via D-Bus, causing spurious browser notifications)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always broadcast 'add' when terminalEnabled — restore tail visibility

After the cwd fix, browserEnabled correctly became false when browser:false
is set in project config. But the broadcast condition gated on
hasInteractiveClient(), which returns false if tail isn't connected at the
exact moment the check arrives — silently dropping entries from tail.

Fix: broadcast whenever browserEnabled OR terminalEnabled, regardless of
client connection state. Tail sees pending entries via the SSE stream's
initial state when it connects, so timing of connection doesn't matter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: hybrid security model — local UI always wins the race

- Remove isRemoteLocked: terminal/browser/native racers always participate
  even when approvers.cloud is enabled; cloud is now audit-only unless it
  responds first (headless VM fallback)
- Add decisionSource field to AuthResult so resolveNode9SaaS can report
  which channel decided (native/terminal/browser) as decidedBy in the PATCH
- Fix resolveNode9SaaS: log errors to hook-debug.log instead of silent catch
- Fix tail [A]/[D] keypresses: switch from raw 'data' buffer to readline
  emitKeypressEvents + 'keypress' events — fixes unresponsive cards
- Fix tail card clear: SAVE/RESTORE cursor instead of fragile MOVE_UP(n)
- Add cancelActiveCard so 'remove' SSE event properly dismisses active card
- Fix daemon duplicate browser tab: browserOpened flag + NODE9_BROWSER_OPENED
  env so auto-started daemon and node9 tail don't both open a tab
- Fix slackDelegated: skip background authorizeHeadless to prevent duplicate
  cloud request that never resolves in Mission Control
- Add interactive field to SSE 'add' event so browser-only configs don't
  render a terminal card
- Add dev:tail script that parses JSON PID file correctly
- Add req.on('close') cleanup for abandoned long-poll entries
- Add regression tests for all three bugs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear CI env var in test to unblock native racer on GitHub Actions

Also make the poll fetch mock respond to AbortSignal so the cloud poll
racer exits cleanly when native wins, preventing test timeout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source injection tests, dev:tail safety, CI env guard

- Add explicit source boundary tests: null/number/object are all rejected
  by the VALID_SOURCES allowlist (implementation was already correct)
- Replace kill $(...) shell expansion in dev:tail with process.kill() inside
  Node.js; this removes the $() substitution vulnerability if the pid file were crafted
- Add afterEach safety net in core.test.ts to restore VITEST/CI/NODE_ENV
  in case the test crashes before the try/finally block restores them
- Increase slackDelegated timing wait from 200ms to 500ms for slower CI
- Fix section numbering gap: 10 → 11 was left after removing a test (now 10)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use early return in it.each — Vitest does not pass context to it.each callbacks

Context injection via { skip } works in plain it() but not in it.each(),
where the third argument is undefined. Switch to early return, which is
equivalent since the entire describe block skips when portWasFree is false.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): correct YAML indentation in ci.yml — job properties were siblings not children

name/runs-on/strategy/steps were indented 2 spaces (sibling to `test:`)
instead of 4 spaces (properties of the `test:` job). GitHub Actions was
ignoring the custom name template, so checks were reported without the
Node version suffix and the required branch-protection check
"CI / Test (ubuntu-latest, Node 20)" was stuck as "Expected" forever.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add 15s timeout to daemon beforeAll hooks — prevents CI timeout

waitForDaemon(6s) + readSseStream(3s) = 9s minimum; the default Vitest
hookTimeout of 10s is too tight on slow CI runners (Ubuntu, Windows).
All three daemon describe-block beforeAll hooks now declare an explicit
15_000ms timeout to give CI sufficient headroom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add daemonProc.kill() fallback in afterAll cleanup

If `node9 daemon stop` fails or times out, the spawned daemon process
would leak. Added daemonProc?.kill() as a defensive fallback after
spawnSync in all three daemon describe-block afterAll hooks.

The CSRF 403 tests (missing/wrong token) already exist at lines 574-598
and were flagged as absent only because the bot's diff was truncated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): address code review — vi.stubEnv, runtime shape checks, abandon-timer comment

- core.test.ts: replace manual env save/delete/restore with vi.stubEnv +
  vi.unstubAllEnvs() in afterEach. Eliminates the fragile try/finally and
  the risk of coercing undefined to the string "undefined". Adds a KEEP IN
  SYNC comment so future isTestEnv additions are caught immediately.

- daemon.integration.test.ts: replace unchecked `as { ... }` casts in
  idempotency tests with `unknown` + toMatchObject — gives a clear failure
  message if the response shape is wrong instead of silently passing.

- daemon.integration.test.ts: add comment explaining why idempotency tests
  do not need a /wait consumer — the abandon timer only fires when an SSE
  connection closes with pending items; no SSE client connects during
  these tests so entries are safe from eviction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): guard daemonProc.kill() with exitCode check — avoid spurious SIGTERM

Calling daemonProc.kill() unconditionally after a successful `daemon stop`
sends SIGTERM to an already-dead process, which can produce a spurious error
log on some platforms. Only kill if exitCode === null (process still running).
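
The guard amounts to a few lines. In this sketch, `KillableProcess` is a minimal structural type standing in for Node's ChildProcess, and `killIfRunning` is a hypothetical helper name.

```typescript
// Minimal structural type covering the two ChildProcess members used here.
interface KillableProcess {
  exitCode: number | null;
  kill(): boolean;
}

// Sends the default signal only when no exit code has been recorded yet;
// returns whether a kill was attempted.
function killIfRunning(proc: KillableProcess | undefined): boolean {
  if (proc && proc.exitCode === null) {
    proc.kill();
    return true;
  }
  return false;
}
```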

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add @vitest/coverage-v8 — baseline coverage report (PR #0)

Installs @vitest/coverage-v8 and configures coverage in vitest.config.mts.
Adds `npm run test:coverage` script.

Baseline (instrumentable files only — cli.ts and daemon/index.ts are
subprocess-only and cannot be instrumented by v8):

  File     Stmts   Branches
  Overall  67.68%  58.74%
  core.ts  62.02%  54.13%   ← primary refactor target
  undo.ts  87.01%  80.00%
  shields  97.46%  94.64%
  dlp.ts   94.82%  92.85%
  setup    93.67%  80.92%

This baseline will be used to verify coverage improves (or holds) after
each incremental refactor PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr1): extract src/audit/ and src/config/ from core.ts

Move audit helpers (redactSecrets, appendToLog, appendHookDebug,
appendLocalAudit, appendConfigAudit) to src/audit/index.ts and
move all config types, constants, and loading logic (Config,
SmartRule, DANGEROUS_WORDS, DEFAULT_CONFIG, getConfig, getCredentials,
getGlobalSettings, hasSlack, listCredentialProfiles) to
src/config/index.ts.

core.ts kept as barrel re-exporting from the new modules so all
existing importers (cli.ts, daemon/index.ts, tests) are unchanged.
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: remove trailing blank lines in core.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr2): extract src/policy/ and src/utils/regex from core.ts

Move the entire policy engine to src/policy/index.ts:
  evaluatePolicy, explainPolicy, shouldSnapshot, evaluateSmartConditions,
  checkDangerousSql, isIgnoredTool, matchesPattern and all private
  helpers (tokenize, getNestedValue, extractShellCommand, analyzeShellCommand).

Move ReDoS-safe regex utilities to src/utils/regex.ts:
  validateRegex, getCompiledRegex — no deps on config or policy,
  consumed by both policy/ and cli.ts via core.ts barrel.

core.ts is now ~300 lines (auth + daemon I/O only).
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: restore dev branch to push trigger

CI should run on direct pushes to dev (merge commits, dependency
bumps, etc.), not just on PRs. Flagged by two independent code
review passes on the coverage PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add timeouts to execSync and spawnSync to prevent CI hangs

- doctor command: add timeout:3000 to execSync('which node9') and
  execSync('git --version') — on slow CI machines these can block
  indefinitely and cause the 5000ms vitest test timeout to fire
- runDoctor test helper: add timeout:15000 to spawnSync so the subprocess
  has enough headroom on slow CI without hitting the vitest timeout
- removefrom test loop: increase spawnSync timeout 5000→15000 and add
  result.error assertion for better failure diagnostics on CI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract src/auth/ from core.ts

Split the authorization race engine out of core.ts into 4 focused modules:
- auth/state.ts  — pause, trust sessions, persistent decisions
- auth/daemon.ts — daemon PID check, entry registration, long-polling
- auth/cloud.ts  — SaaS handshake, poller, resolver, local-allow audit
- auth/orchestrator.ts — multi-channel race engine (authorizeHeadless)

core.ts is now a 40-line backwards-compat barrel. 509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — coverage thresholds + undici vuln

- vitest.config.mts: add coverage thresholds at current baseline (68%
  stmts, 58% branches, 66% funcs, 70% lines) so CI blocks regressions.
  Add json-summary reporter for CI integration. Exclude core.ts (barrel,
  no executable code) and ui/native.ts (OS UI, untestable in CI).
- package.json: pin undici to ^7.24.0 via overrides to resolve 6 high
  severity vulnerabilities in dev deps (@semantic-release, @actions).
  Remaining 7 vulns are in npm-bundled packages (not fixable without
  upgrading npm itself) and dev-only tooling (eslint, handlebars).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: enforce coverage thresholds in CI pipeline

Add coverage step to CI workflow that runs vitest --coverage on
ubuntu/Node 22 only (avoids matrix cost duplication). Thresholds
configured in vitest.config.mts will fail the build if coverage drops
below baseline, closing the gap flagged in code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: fix double test run — merge coverage into single test step

Replace the two-step (npm test + npm run test:coverage) pattern with a
single conditional: ubuntu/Node 22 runs test:coverage (enforces
thresholds), all other matrix cells run npm test. Behaviour is unchanged, and the
primary matrix cell now takes half the execution time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract proxy, negotiation, and duration from cli.ts

- src/proxy/index.ts — runProxy() MCP/JSON-RPC stdio interception
- src/policy/negotiation.ts — buildNegotiationMessage() AI block messages
- src/utils/duration.ts — parseDuration() human duration string parser
- cli.ts: 2088 → 1870 lines, now imports from focused modules

509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract shield, check, log commands into focused modules

Moves registerShieldCommand, registerConfigShowCommand, registerCheckCommand,
and registerLogCommand into src/cli/commands/. Extracts autoStartDaemonAndWait
and openBrowserLocal into src/cli/daemon-starter.ts.

cli.ts drops from ~1870 to ~1120 lines. Unused imports removed. Spawn
Windows regression test updated to cover the moved autoStartDaemonAndWait
call site in daemon-starter.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): use string comparison for matrix.node and document coverage intent

GHA expression matrix values are strings; matrix.node == 22 (integer) silently
fails, so coverage never ran on any cell. Fixed to matrix.node == '22'.

Added comments to ci.yml explaining the intentional single-cell threshold
enforcement (branch protection must require the ubuntu/Node 22 job), and
to vitest.config.mts explaining the baseline date and target trajectory.

Also confirmed: npm ls undici shows 7.24.6 everywhere — no conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): replace fragile matrix ternary with dedicated coverage job

Removes the matrix.node == '22' ternary from the test matrix. Coverage now
runs in a standalone 'coverage' job (ubuntu/Node 22 only) that can be
required by name in branch protection — no risk of the job name drifting
or the selector silently failing.

Also adds a comment to tsup.config.ts documenting why devDependency coverage
tooling (@vitest/coverage-v8, @rolldown/*) cannot leak into the production
bundle (tree-shaking — nothing in src/ imports them).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step and NODE9_TESTING=1 to coverage job; bump to v1.2.0

Coverage job was missing npm run build, causing integration tests to fail
with "dist/cli.js not found". Also adds NODE9_TESTING=1 env var to prevent
native popup dialogs and daemon auto-start during coverage runs in CI.

Version bumped to 1.2.0 to reflect the completed modular refactor
(core.ts + cli.ts split into focused single-responsibility modules).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add npm audit --omit=dev --audit-level=high to test job

Audits production deps on every CI run. Scoped to --omit=dev because
known CVEs in flatted (eslint chain) and handlebars (semantic-release chain)
are devDep-only and never ship in the production bundle. Production tree
currently shows 0 vulnerabilities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract doctor, audit, status, daemon, watch, undo commands

Moves 6 remaining large commands into src/cli/commands/:
  doctor.ts    — health check (165 lines, owns pass/fail/warn helpers)
  audit.ts     — audit log viewer with formatRelativeTime
  status.ts    — current mode/policy/pause display
  daemon-cmd.ts — daemon start/stop/openui/background/watch
  watch.ts     — watch mode subprocess runner
  undo.ts      — snapshot diff + revert UI

cli.ts: 1,141 → 582 lines. Unused imports (execSync, spawnSync, undo funcs,
getCredentials, DAEMON_PORT/HOST) removed. spawn-windows regression test
updated to cover the new module locations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(daemon): split 1026-line index.ts into state, server, barrel

daemon/state.ts  (360 lines) — all shared mutable state, types, utility
  functions, SSE broadcast, Flight Recorder Unix socket, and the
  abandonPending / hadBrowserClient / abandonTimer accessors needed to
  avoid direct ES module let-export mutation across file boundaries.

daemon/server.ts (668 lines) — startDaemon() HTTP server and all route
  handlers (/check, /wait, /decision, /events, /settings, /shields, etc.).
  Imports everything it needs from state.ts; no circular dependencies.

daemon/index.ts  (58 lines) — thin barrel: re-exports public API
  (startDaemon, stopDaemon, daemonStatus, DAEMON_PORT, DAEMON_HOST,
  DAEMON_PID_FILE, DECISIONS_FILE, AUDIT_LOG_FILE, hasInteractiveClient).

Also changes two startup console.log calls to console.error (stdout must stay
clean for MCP/JSON-RPC per CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): handle ENOTEMPTY in cleanupDir on Windows CI

Windows creates system junctions (AppData\Local\Microsoft\Windows)
inside any directory set as USERPROFILE, making rmSync fail with
ENOTEMPTY even after recursive deletion. These junctions are harmless
to leak from a temp dir; treat them the same as EBUSY.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add NODE9_TESTING=1 to test job for consistency with coverage

Without it, spawned child processes in the test matrix could trigger
native popups or daemon auto-start. Matches the coverage job which
already set this env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): tighten npm audit threshold from high to moderate

Node9 sits on the critical path of every agent tool call — a
moderate-severity prod vuln (e.g. regex DoS in a request parser)
is still exploitable in this context. 0 vulns at moderate level
confirmed before raising the bar.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add missing coverage for auth/state, timeout racer, and daemon unknown-ID

- auth-state.test.ts (new): 18 tests covering checkPause (all branches
  including expired file auto-delete and indefinite expiry), pauseNode9,
  resumeNode9, getActiveTrustSession (wildcard, prune, malformed JSON),
  writeTrustSession (create, replace, prune expired entries)
- core.test.ts: timeout racer test — approvalTimeoutMs:50 fires before any
  other channel, returns approved:false with blockedBy:'timeout'
- daemon.integration.test.ts: POST /decision with unknown UUID → 404
- vitest.config.mts: raise thresholds to match new baseline
  (statements 68→70, branches 58→60, functions 66→70, lines 70→71)

auth/state.ts coverage: 30% → 96% statements, 28% → 89% branches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin @vitest/coverage-v8 to exact version 4.1.2

RC transitive deps (@rolldown/binding-* at 1.0.0-rc.12) are pulled in
via coverage-v8. Pinning prevents silent drift to a newer RC that could
change instrumentation behaviour or introduce new RC-stage transitive deps.

Also verified: obug@2.1.1 is a legitimate MIT-licensed debug utility
from the @vitest/sxzz ecosystem — not a typosquat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add needs: [test] to coverage job

Prevents coverage from producing a misleading green check when the test
matrix fails. Coverage now only runs after all test jobs pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): raise Vitest timeouts for slow CI tests

- cloud denies: approvalTimeoutMs:3000 means the check process runs ~3s
  before the mock cloud responds; default 5s Vitest limit was too tight.
  Raised to 15s.
- doctor 'All checks passed': spawns a subprocess that runs `ss` for
  port detection — slow on CI runners. Raised to 20s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin vitest to 4.1.2 to match @vitest/coverage-v8

Both packages must stay in sync — a peer version mismatch causes silent
instrumentation failures. Pinning both to the same exact version prevents
drift when ^ would otherwise allow vitest to bump independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MCP gateway — transparent stdio proxy for any MCP server

Adds `node9 mcp-gateway --upstream <cmd>` which wraps any MCP server
as a transparent stdio proxy. Every tools/call is intercepted and run
through the full authorization engine (DLP, smart rules, shields,
human approval) before being forwarded to the upstream server.

Key implementation details:
- Deferred exit: authPending flag prevents process.exit() while auth
  is in flight, so blocked-tool responses are always flushed first
- Deferred stdin end: mirrors the same pattern for child.stdin.end()
  so approved messages are written before stdin is closed
- Approved writes happen inside the try block, before finally runs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address code review feedback

- Explicit ignoredTools in allowed-tool test (no implicit default dep)
- Assert result.status === 0 in all success-case tests (null = timeout)
- Throw result.error in runGateway helper so timeout-killed process fails
- Catch ENOTEMPTY in cleanupDir alongside EBUSY (Windows junctions)
- Document parseCommandString is shell-split only, not shell execution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): id validation, resilience tests, review fixes

- Validate JSON-RPC id is string|number|null; return -32600 for object/array ids
- Add resilience tests: invalid upstream JSON forwarded as-is, upstream crash
- Fix runGateway() to accept optional upstreamScript param
- Add status assertions to all blocked-tool tests
- Document parseCommandString safety in mcp-gateway source
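
The id rule above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the names `isValidJsonRpcId` and `rejectInvalidId` are hypothetical, and only the accepted id types and the -32600 code come from the commit message.

```typescript
// Hypothetical helper names; only the id rule and the -32600 code come from the commit.
type JsonRpcId = string | number | null;

interface JsonRpcError {
  jsonrpc: "2.0";
  id: null;
  error: { code: number; message: string };
}

// True when the id is a string, a number, or null.
function isValidJsonRpcId(id: unknown): id is JsonRpcId {
  return id === null || typeof id === "string" || typeof id === "number";
}

// Returns an Invalid Request error for a bad id, or null when the id is acceptable.
function rejectInvalidId(message: { id?: unknown }): JsonRpcError | null {
  // A message without an id is a notification and gets no response at all.
  if (!("id" in message) || isValidJsonRpcId(message.id)) return null;
  return {
    jsonrpc: "2.0",
    id: null, // the offending id cannot be echoed back safely
    error: { code: -32600, message: "Invalid Request: id must be a string, number, or null" },
  };
}
```

Replying with `id: null` follows the JSON-RPC 2.0 convention for requests whose id cannot be determined.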

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): fix shell tokenizer to handle quoted paths with spaces

Replace execa's parseCommandString (which did not handle shell quoting)
with a custom tokenizer that strips double-quotes and respects backslash
escapes. Adds 4 review-driven test improvements: mock upstream silently
drops notifications, runGateway guards killed-by-signal status, shadowed
variable renamed, DLP test builds credential at runtime, upstream path
with spaces test.
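
A tokenizer of the kind described might look like the sketch below. It is an assumption-laden illustration, not the project's implementation: `tokenizeCommand` is a hypothetical name, and behaviours the commit does not mention (single quotes, the unterminated-quote error) are choices made here.

```typescript
// Hypothetical sketch: splits a command string on whitespace, treats a
// double-quoted run as part of one token, and lets a backslash escape the
// next character. Single quotes get no special treatment (assumption).
function tokenizeCommand(input: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inQuotes = false;
  let hasToken = false; // lets an empty quoted string ("") count as a token

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === "\\" && i + 1 < input.length) {
      i += 1;
      current += input.charAt(i); // take the escaped character literally
      hasToken = true;
    } else if (ch === '"') {
      inQuotes = !inQuotes; // quotes delimit the token but are stripped
      hasToken = true;
    } else if (!inQuotes && /\s/.test(ch)) {
      if (hasToken) tokens.push(current);
      current = "";
      hasToken = false;
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (inQuotes) throw new Error("Unterminated double quote in command string");
  if (hasToken) tokens.push(current);
  return tokens;
}
```

Because the function only splits a string into argv entries and never invokes a shell, metacharacters such as `;` or `|` stay inert inside tokens.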

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #4 — hermetic env, null-status guard, new tests

- Filter all NODE9_* env vars in runGateway so local developer config
  (NODE9_MODE, NODE9_API_KEY, NODE9_PAUSED) cannot break test isolation
- Fix status===null guard: no longer requires result.signal to also be set;
  a killed/timed-out process always throws, preventing silent false passes
- Extract parseResponses() helper to eliminate repeated JSON.parse cast pattern
- Reduce default runGateway timeout from 8s → 5s for fast-path tests
- Add test: malformed (non-JSON) input is forwarded to upstream unchanged
- Add test: tools/call notification (no id) is forwarded without gateway response
- Add comments: mock upstream uses unbuffered stdout.write; /tmp/test.txt path
  is safe because mock never reads disk; NODE9_TESTING only disables UI approvers
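
The NODE9_* filtering could be sketched like this. `hermeticEnv` is a hypothetical helper name, and the override mechanism is an assumption about how a test would re-add the variables it needs.

```typescript
// Hypothetical helper: copies env without any NODE9_* keys, then applies
// the explicit overrides the test asks for.
function hermeticEnv(
  env: Record<string, string | undefined>,
  overrides: Record<string, string> = {},
): Record<string, string | undefined> {
  const clean: Record<string, string | undefined> = {};
  for (const [key, value] of Object.entries(env)) {
    if (!key.startsWith("NODE9_")) clean[key] = value;
  }
  // Overrides are applied last so a test can opt back in to specific NODE9_* vars.
  return { ...clean, ...overrides };
}
```

A test helper would then pass something like `hermeticEnv(process.env, { NODE9_TESTING: "1" })` as the child's environment.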

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #5 — isolation, typing, timeouts, README note

- afterAll: log cleanup failures to stderr instead of silently swallowing them
- runGateway: document PATH is safe (all spawns use absolute paths); expand
  NODE9_TESTING comment to reference exact source location of what it suppresses
- Replace /tmp/test.txt with /nonexistent/node9-test-only so intent is unambiguous
- Tighten blocked-tool test timeout: 5000ms → 2000ms (approvalTimeoutMs=100ms,
  so a hung auth engine now surfaces as a clear failure rather than a late pass)
- GatewayResponse.result: add explicit tools/ok fields so Array.isArray assertion
  has accurate static type information
- README: add note clarifying --upstream takes a single command string (tokenizer
  splits it); explain double-quoted paths for paths with spaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #6 — diagnostics, type safety, error handling

- Timeout error now includes partial stdout/stderr so hung gateway failures
  are diagnosable instead of silently discarding the output buffer
- Mock upstream catch block writes to stderr instead of empty catch {} so
  JSON-RPC parse errors surface in test output rather than causing a hang
- parseResponses wraps JSON.parse in try/catch and rethrows with the
  offending line, replacing cryptic map-thrown errors with useful context
- GatewayResponse.result: replace redundant Record<string,unknown> & {..…

node9ai added a commit that referenced this pull request Mar 31, 2026

* fix: address code review — Slack regex bound, remove redundant parser, notMatchesGlob consistency, applyUndo empty-set guard

- dlp: cap Slack token regex at {1,100} to prevent unbounded scan on crafted input
- core: remove 40-line manual paren/bracket parser from validateRegex; it is redundant
  with the final new RegExp() compile check, which catches the same errors more cleanly
- core: fix notMatchesGlob — absent field returns true (vacuously not matching),
  consistent with notContains; missing cond.value still fails closed
- undo: guard applyUndo against ls-tree failure returning empty set, which would
  cause every file in the working tree to be deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — compile-before-saferegex, Slack lower bound, ls-tree guard logging, missing tests

- core: move new RegExp() compile check BEFORE safe-regex2 so structurally invalid
  patterns (unbalanced parens/brackets) are rejected before reaching NFA analysis
- dlp: tighten Slack token lower bound from {1,100} to {20,100} to reduce false
  positives on short, truncated tokens
- undo: add NODE9_DEBUG log before early return in applyUndo ls-tree guard for
  observability into silent failures
- test(core): add 'structurally malformed patterns still rejected' regression test
  confirming compile-check order after manual parser removal
- test(core): add notMatchesGlob absent-field test with security comment documenting
  the vacuous-true behaviour and how to guard against it
- test(undo): add applyUndo ls-tree non-zero exit test confirming no files deleted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): swap spawnResult args order — stdout first, status second

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix prettier formatting in undo.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: revert notMatchesGlob to fail-closed, warn on ls-tree failure, document empty-stdout gap

- core: revert notMatchesGlob absent-field to fail-closed (false) — an attacker
  omitting a field must not satisfy a notMatchesGlob allow rule; rule authors
  needing pass-when-absent must pair with an explicit 'notExists' condition
- undo: log ls-tree failure unconditionally to stderr (not just NODE9_DEBUG) since
  this is an unexpected git error, not normal flow; a silent false return is undebuggable
- dlp: add comment on Slack token bound rationale (real tokens ~50–80 chars)
- test(core): fix notMatchesGlob fail-closed test — use delete_file (dangerous word)
  so the allow rule actually matters; write was allowed by default regardless
- test(undo): add test documenting the known gap where ls-tree exits 0 with empty
  stdout still produces an empty snapshotFiles set
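
The fail-closed rule for notMatchesGlob can be illustrated with a small sketch. The signature and the toy glob matcher (where `*` matches any run of characters, including `/`) are assumptions for illustration; the project's real matcher is not shown in this log.

```typescript
// Toy glob: "*" matches any run of characters; everything else is literal.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Fail-closed: an absent field or a missing pattern never satisfies the condition,
// so omitting a field cannot be used to slip past an allow rule.
function notMatchesGlob(fieldValue: string | undefined, pattern: string | undefined): boolean {
  if (fieldValue === undefined || pattern === undefined) return false;
  return !globToRegExp(pattern).test(fieldValue);
}
```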

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(undo): guard against ls-tree status-0 empty-stdout mass-delete footgun

Add snapshotFiles.size === 0 check after the non-zero exit guard. When ls-tree
exits 0 but produces no output, snapshotFiles would be empty and every tracked
file in the working tree would be deleted. Abort and warn unconditionally instead.

Also convert the 'known gap' documentation test into a real regression test that
asserts false return and no unlinkSync calls.
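
The two guards can be sketched as a pure function. This is an illustration under assumptions: `filesToDelete`, `LsTreeResult`, and the `null` return are hypothetical (the commit says applyUndo itself returns false), and the warning text is invented.

```typescript
// Hypothetical shape of a spawnSync-like result for `git ls-tree`.
interface LsTreeResult {
  status: number | null;
  stdout: string;
}

// Returns the files that are safe to delete (present now, absent from the
// snapshot), or null when the snapshot listing cannot be trusted.
function filesToDelete(
  lsTree: LsTreeResult,
  workingTreeFiles: string[],
  warn: (msg: string) => void = (msg) => console.error(msg),
): string[] | null {
  if (lsTree.status !== 0) {
    warn(`[undo] git ls-tree failed with status ${lsTree.status}; aborting`);
    return null;
  }
  const snapshotFiles = new Set(
    lsTree.stdout.split("\n").map((line) => line.trim()).filter((line) => line !== ""),
  );
  if (snapshotFiles.size === 0) {
    // Without this guard every working-tree file would look new and be deleted.
    warn("[undo] git ls-tree returned no files; aborting");
    return null;
  }
  return workingTreeFiles.filter((file) => !snapshotFiles.has(file));
}
```

Keeping the decision separate from the actual unlink calls makes both failure paths easy to assert without touching the filesystem.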

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(undo): assert stderr warning in ls-tree failure tests

Add vi.spyOn(process.stderr, 'write') assertions to both new applyUndo tests
to verify the observability messages are actually emitted on failure paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: banner to stderr for MCP stdio compat; log command cwd handling and error visibility

Two bugs from issue #33:

1. runProxy banner went to stdout via console.log, corrupting the JSON-RPC stream
   for stdio-based MCP servers. Fixed: console.error so stdout stays clean.

2. 'node9 log' PostToolUse hook was silently swallowing all errors (catch {})
   and not changing to payload.cwd before getConfig() — unlike the 'check'
   command which does both. If getConfig() loaded the wrong project config,
   shouldSnapshot() could throw on a missing snapshot policy key, silently
   killing the audit.log write with no diagnostic output.
   Fixed: add cwd + _resetConfigCache() mirroring 'check'; surface errors to
   hook-debug.log when enableHookLogDebug is set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate process.chdir race condition in hook commands

Pass payload.cwd directly to getConfig(cwd?) instead of calling
process.chdir() which mutates process-global state and would race
with concurrent hook invocations.

- getConfig() gains optional cwd param: bypasses cache read/write
  when an explicit project dir is provided, so per-project config
  lookups don't pollute the ambient interactive-CLI cache
- check and log commands: remove process.chdir + _resetConfigCache
  blocks; pass payload.cwd directly to getConfig()
- log command catch block: remove getConfig() re-call (could re-throw
  if getConfig() was the original error source); use NODE9_DEBUG only
- Remove now-unused _resetConfigCache import from cli.ts
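
The cache-bypass behaviour described for getConfig(cwd?) might be sketched as below. The `Config` shape, the injected `load` function, and the `createGetConfig` factory are hypothetical stand-ins so the example runs without disk access.

```typescript
// Hypothetical config shape and loader; the real getConfig reads files on disk.
interface Config {
  projectDir: string;
}
type Loader = (dir: string) => Config;

function createGetConfig(load: Loader, ambientDir: string): (cwd?: string) => Config {
  let cached: Config | null = null;

  return function getConfig(cwd?: string): Config {
    // Explicit project dir: load fresh, and neither read nor write the cache.
    if (cwd) return load(cwd);
    if (cached === null) cached = load(ambientDir);
    return cached;
  };
}
```

Since nothing process-global is mutated, two hook invocations for different projects cannot observe each other's working directory.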

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always write LOG_ERROR to hook-debug.log; clarify ReDoS test intent

- log catch block: remove NODE9_DEBUG guard — this catch guards the
  audit trail so errors must always be written to hook-debug.log,
  not only when NODE9_DEBUG=1
- validateRegex test: rename and expand the safe-regex2 NFA test to
  explicitly assert that (a+)+ compiles successfully (passes the
  compile-first step) yet is still rejected by safe-regex2, confirming
  the reorder did not break ReDoS protection
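
The compile-first ordering can be sketched with the safety analyser injected as a parameter. `validateRegexSketch` and `toyIsSafe` are hypothetical; `toyIsSafe` is a deliberately crude stand-in for safe-regex2, whose real API is not used here.

```typescript
// Hypothetical signature: returns an error message, or null when the pattern is accepted.
// `isSafe` stands in for a ReDoS analyser such as safe-regex2.
function validateRegexSketch(pattern: string, isSafe: (p: string) => boolean): string | null {
  // Step 1: compile first so malformed patterns never reach the analyser.
  try {
    new RegExp(pattern);
  } catch (err) {
    return `Invalid regex: ${err instanceof Error ? err.message : String(err)}`;
  }
  // Step 2: a pattern can compile and still be unsafe, for example (a+)+.
  if (!isSafe(pattern)) return "Unsafe regex: possible catastrophic backtracking";
  return null;
}

// Toy stand-in that only flags a quantified group containing a quantifier.
const toyIsSafe = (p: string): boolean => !/\([^)]*[+*]\)[+*]/.test(p);
```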

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(mcp): integration tests for #33 regression coverage

Add mcp.integration.test.ts with 4 tests covering both bugs from #33:

1. Proxy stdout cleanliness (2 tests):
   - banner goes to stderr; stdout contains only child process output
   - stdout stays valid JSON when child writes JSON-RPC — banner does not corrupt stream

2. Log command cross-cwd audit write (2 tests):
   - writes to audit.log when payload.cwd differs from process.cwd() (the actual #33 bug)
   - writes to audit.log when no cwd in payload (backward compat)

These tests would have caught both regressions at PR time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — cwd guard, test assertions, exit-0 comment

- getConfig(payload.cwd || undefined): use || instead of ?? to also
  guard against empty string "" which path.join would silently treat
  as relative-to-cwd (same behaviour as the fallback, but explicit)
- log catch block: add comment documenting the intentional exit(0)-on-
  audit-failure tradeoff — non-zero would incorrectly signal tool failure
  to Claude/Gemini since the tool already executed
- mcp.integration.test.ts: assert result.error and result.status on
  every spawnSync call so spawn failures surface loudly instead of
  silently matching stdout === '' checks
- mcp.integration.test.ts: add expect(result.stdout.trim()).toBeTruthy()
  before JSON.parse for clearer diagnostic on stdout-empty failures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add CLAUDE.md rules and pre-commit enforcement hook

CLAUDE.md: documents PR checklist, test rules, and code rules that
Claude Code reads automatically at the start of every session:
- PR checklist (tests, typecheck, format, no console.log in hooks)
- Integration test requirements for subprocess/stdio/filesystem code
- Architecture notes (getConfig(cwd?), audit trail, DLP, fail-closed)

.git/hooks/pre-commit: enforces the checklist on every commit:
- Blocks console.log in src/cli, src/core, src/daemon
- Runs npm run typecheck
- Runs npm run format:check
- Runs npm test when src/ implementation files are changed
- Emergency bypass: git commit --no-verify

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — test isolation, stderr on audit gap, nonexistent cwd

- mcp.integration.test.ts: replace module-scoped tempDirs with per-describe
  beforeEach/afterEach and try/finally — eliminates shared-array interleave
  risk if tests ever run with parallelism
- mcp.integration.test.ts: add test for nonexistent payload.cwd — verifies
  getConfig falls back to global config gracefully instead of throwing
- cli.ts log catch: emit [Node9] audit log error to stderr so audit gaps
  surface in the tool output stream without requiring hook-debug.log checks
- core.ts getConfig: add comment documenting intentional nonexistent-cwd
  fallback behavior (tryLoadConfig returns null → global config used)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: audit write before config load; validate cwd; test corrupt-config gap

Two blocking issues from review:

1. getConfig() was called BEFORE appendFileSync — a config load failure
   (corrupt JSON, permissions error) would throw and skip the audit write,
   reintroducing the original silent audit gap. Fixed by moving the audit
   write unconditionally before the config load.

2. payload.cwd was passed to getConfig() unsanitized — a crafted hook
   payload with a relative or traversal path could influence which
   node9.config.json gets loaded. Fixed with path.isAbsolute() guard;
   non-absolute cwd falls back to ambient process.cwd().

Also:
- Add integration test proving audit.log is written even when global
  config.json is corrupt JSON (regression test for the ordering fix)
- Add comment on echo tests noting Linux/macOS assumption
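
The cwd guard from item 2 could look roughly like this. `sanitizeHookCwd` is a hypothetical name; only the use of path.isAbsolute() and the fallback to the ambient directory come from the commit message.

```typescript
import * as path from "node:path";

// Hypothetical helper: returns a cwd that may be passed to getConfig(), or
// undefined so the caller falls back to the ambient process.cwd().
function sanitizeHookCwd(payloadCwd: unknown): string | undefined {
  // Non-strings and the empty string are treated the same as "no cwd given".
  if (typeof payloadCwd !== "string" || payloadCwd === "") return undefined;
  // Relative paths such as "../other-project" are rejected outright.
  return path.isAbsolute(payloadCwd) ? payloadCwd : undefined;
}
```

Note that this check only rejects non-absolute input; an absolute path is passed through unchanged.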

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): add audit-write ordering and path validation rules

* test: skip echo proxy tests on Windows; clarify exit-0 contract

- itUnix = it.skipIf(process.platform === 'win32') applied to both proxy
  echo tests — Windows echo is a shell builtin and cannot be spawned
  directly, so these tests would fail with a spawn error instead of
  skipping cleanly
- corrupt-config test: add comment documenting that exit(0) is the
  correct exit code even on config error — the log command always exits 0
  so Claude/Gemini do not treat an already-completed tool call as failed;
  the audit write precedes getConfig() so it succeeds regardless

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: CODEOWNERS for CLAUDE.md; parseAuditLog helper; getConfig unit tests

- .github/CODEOWNERS: require @node9-ai/maintainers review on CLAUDE.md
  and security-critical source files — prevents untrusted PRs from
  silently weakening AI instruction rules or security invariants
- mcp.integration.test.ts: replace inline JSON.parse().map() with
  parseAuditLog() helper that throws a descriptive error when a log line
  is not valid JSON (e.g. a debug line or partial write), instead of an
  opaque SyntaxError with no context
- mcp.integration.test.ts: itUnix declaration moved after imports for
  correct ordering
- core.test.ts: add getConfig unit tests verifying that a nonexistent
  explicit cwd does not throw (tryLoadConfig fallback), and that
  getConfig(cwd) does not pollute the ambient no-arg cache
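
A helper in the spirit of parseAuditLog() might be written as below. The `AuditEntry` type and the error wording are assumptions, since the real audit log schema is not shown here.

```typescript
// Hypothetical entry type; the real audit log schema is not shown in this log.
type AuditEntry = Record<string, unknown>;

// Parses newline-delimited JSON and names the offending line on failure.
function parseAuditLog(contents: string): AuditEntry[] {
  return contents
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line, index) => {
      try {
        return JSON.parse(line) as AuditEntry;
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        throw new Error(`audit.log entry ${index + 1} is not valid JSON (${reason}): ${line}`);
      }
    });
}
```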

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): add npm run lint to PR checklist and pre-commit hook

Adds ESLint step to CLAUDE.md checklist and .git/hooks/pre-commit so
require()-style imports and other lint errors are caught before push.
Also fixes the require('path')/require('os') inline calls in core.test.ts
that triggered @typescript-eslint/no-require-imports in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: emit shields-status on SSE connect — dashboard no longer stuck on Loading

The shields-status event was only broadcast on toggle (POST /shields/toggle).
A freshly connected dashboard never received the current shields state and
displayed "Loading…" indefinitely.

Fix: send shields-status in the GET /events initial payload alongside init
and decisions, using the same payload shape as the toggle handler.

Regression test: daemon.integration.test.ts starts a real daemon with an
isolated HOME, connects to /events, and asserts shields-status is present
with the correct active state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — shared SSE snapshot, ctx.skip() for visible skips

- Capture SSE stream once in beforeAll and share across all three tests
  instead of opening 3 separate 1.5s connections (~4.5s → ~1.5s wall time)
- Replace early return with ctx.skip() so port-conflict skips are visible
  in the Vitest report rather than silently passing
- Add comment explaining why it.skipIf cannot be used here (condition
  depends on async beforeAll, evaluated after test collection)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: address review — bump SSE timeout, guard payload undefined, structural shield check

- Bump readSseStream timeout 1500ms → 3000ms for slow CI headroom
- Assert payload defined before accessing .shields — gives a clear failure
  message if shields-status is absent rather than a TypeError on .shields
- Replace hardcoded postgres check with structural loop over all shields
  so the test survives adding or renaming shields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: log last waitForDaemon error to stderr for CI diagnostics

Silent catch{} meant a crashed daemon (e.g. EACCES on port) produced only
"did not start within 6s" with no hint of the root cause. Now the last
caught error is written to stderr so CI logs show the actual failure reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass flags through to wrapped command — prevent Commander from consuming -y, --config etc.

Commander parsed flags like -y and --config as node9 options and errored
with "unknown option" before the proxy action handler ran. This broke all
MCP server configurations that pass flags to the wrapped binary (npx -y,
binaries with --nexus-url, etc.).

Fix: before program.parse(), detect proxy mode (first arg is not a known
node9 subcommand and doesn't start with '-') and inject '--' into process.argv.
This causes Commander to stop option-parsing and pass everything — including
flags — through to the variadic [command...] action handler intact.

The user-visible '--' workaround still works and is now redundant but harmless.

Regression tests: two new itUnix cases in mcp.integration.test.ts verify
that -n is not consumed as a node9 flag, and that --version reaches the
wrapped command rather than printing node9's own version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: derive proxy subcommand set from program.commands; harden test assertions

- Replace hand-maintained KNOWN_SUBCOMMANDS allowlist with a set derived
  from program.commands.map(c => c.name()) — stays in sync automatically
  when new subcommands are added, eliminating the latent sync bug
- Remove fragile echo stdout assertion in flag pass-through test — echo -n
  and echo --version behaviour varies across platforms (GNU vs macOS);
  the regression being tested is node9's parser, not echo's output
- Add try/finally in daemon.integration.test.ts beforeAll so tmpHome is
  always cleaned up even if daemon startup throws

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard against double '--' injection; strengthen --version test assertion

- Skip '--' injection if process.argv[2] is already '--' to avoid
  producing ['--', '--', ...] when user explicitly passes the separator
- Add toBeTruthy() assertion on stdout in --version test so the check
  fails if echo exits non-zero with empty output rather than silently passing
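
The proxy-mode detection, including the double '--' guard, can be sketched as a pure function over argv. `injectProxySeparator` is a hypothetical name; in the real CLI the subcommand set would come from program.commands.map(c => c.name()) as described in the earlier commit.

```typescript
// Hypothetical helper: argv has the usual [node, script, ...args] layout.
function injectProxySeparator(argv: string[], knownSubcommands: Set<string>): string[] {
  const first = argv[2];
  if (first === undefined) return argv; // no args: leave it to the CLI
  if (first === "--") return argv; // user already passed the separator
  if (first.startsWith("-")) return argv; // a node9 option such as --help
  if (knownSubcommands.has(first)) return argv; // a real node9 subcommand
  // Proxy mode: stop option parsing so the wrapped command keeps its flags.
  return [...argv.slice(0, 2), "--", ...argv.slice(2)];
}
```

Returning a new array rather than mutating process.argv in place keeps the rule easy to unit-test; the caller can assign the result before program.parse().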

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — alias gap comment, res error-after-destroy guard, echo comment

- cli.ts: document alias gap (no aliases currently, but note how to extend)
- daemon.integration.test.ts: settled flag prevents res 'error' firing reject
  after Promise already resolved via req.destroy() timeout path
- mcp.integration.test.ts: fix comment — /bin/echo handles --version, not GNU echo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent daemon crash on unhandled rejection — node9 tail disconnect with 2 agents

Two concurrent Claude instances fire overlapping hook calls. Any unhandled
rejection in the async request handler crashes the daemon (Node 15+ default),
which closes all SSE connections and exits node9 tail with "Daemon disconnected".

- Add process.on('unhandledRejection') so a single bad request never kills the daemon
- Wrap GET /settings and GET /slack-status getGlobalSettings() calls in try/catch
  (were the only routes missing error guards in the async handler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — return in catch blocks, log errors, guard unhandledRejection registration

- GET /settings and /slack-status catch blocks now return after writeHead(500)
  to prevent fall-through to subsequent route handlers (write-after-end risk)
- Log the actual error to stderr in both catch blocks — silent swallow is
  dangerous in a security daemon
- Guard unhandledRejection registration with listenerCount === 0 to prevent
  double-registration if startDaemon() is called more than once (tests/restarts)
- Move handler registration before server.listen() for clearer startup ordering

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(CLAUDE.md): require manual diff review before every commit

Automated checks (lint, typecheck, tests) don't catch logical correctness
issues like missing return after res.end(), silent catch blocks, or
double event-listener registration. Explicitly require git diff review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address review — 500 responses, module-level rejection flag, override cli.ts exit handler

- Separate res.writeHead(500) and res.end() calls (non-idiomatic chaining)
- Add Content-Type: application/json and JSON body to 500 responses
- Replace listenerCount guard with module-level boolean flag (race-safe)
- Call process.removeAllListeners('unhandledRejection') before registering
  daemon handler — cli.ts registers a handler that calls process.exit(1),
  which was the actual crash source; this overrides it for the daemon process
- Document that critical approval path (POST /check) has its own try/catch
  and is not relying on this safety net

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove removeAllListeners — use isDaemon guard in cli.ts handler instead

removeAllListeners('unhandledRejection') was a blunt instrument that could
strip handlers registered by third-party deps. The correct fix:
- cli.ts handler now returns early (no-op) when process.argv[2] === 'daemon',
  leaving the rejection to the daemon's own keep-alive handler
- daemon/index.ts no longer needs removeAllListeners
- daemon handler now logs stack trace so systematic failures are visible

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify unhandledRejection listener interaction — both handlers fire independently

The previous comment implied listener-chain semantics (one handler deferring
to the next). Node.js fires all registered listeners independently. The
isDaemon no-op return in cli.ts is what prevents process.exit(1), not any
chain mechanism. Clarify this so future maintainers don't break it by
restructuring the handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: eliminate unhandledRejection ordering dependency — skip cli.ts handler for daemon mode

Instead of relying on listener registration order (fragile), skip registering
the cli.ts exit-on-rejection handler entirely when process.argv[2] === 'daemon'.
The daemon's own keep-alive handler in startDaemon() is then the only handler
in the process — no ordering dependency, no removeAllListeners needed.
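
A minimal sketch of the load-time check described above. The function name `shouldRegisterExitHandler` is hypothetical; only the `process.argv[2] === 'daemon'` comparison comes from the commit.

```typescript
// Returns true when the CLI should install its exit-on-rejection handler.
// Assumption: argv follows Node's layout [execPath, scriptPath, subcommand, ...].
function shouldRegisterExitHandler(argv: readonly string[]): boolean {
  return argv[2] !== "daemon";
}

// Usage at module load time in a CLI entry point (sketch only):
//
//   if (shouldRegisterExitHandler(process.argv)) {
//     process.on("unhandledRejection", (reason) => {
//       console.error(reason);
//       process.exit(1);
//     });
//   }
//
// In daemon mode the handler is never registered, so the daemon's own
// keep-alive handler is the only listener and registration order is irrelevant.
```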

Also update stale comment in daemon/index.ts that still described the old
"we must replace the cli.ts handler" approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: address review comments — argv load-time note, hung-connection limit, stack trace caveat

- cli.ts: note that process.argv[2] check fires at module load time intentionally
- daemon/index.ts: document hung-connection limitation of last-resort rejection handler
- daemon/index.ts: note stack trace may include user input fragments (acceptable
  for localhost-only stderr logging)
- daemon/index.ts: clarify jest.resetModules() behavior with the module-level flag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Safe by Default — advisory SQL rules block destructive ops without config

Adds review-drop-table-sql, review-truncate-sql, and review-drop-column-sql
to ADVISORY_SMART_RULES so DROP TABLE, TRUNCATE TABLE, and DROP COLUMN in
the `sql` field are gated by human approval out-of-the-box, with no shield
or config required. The postgres shield correctly upgrades these from review
→ block since shield rules are inserted before advisory rules in getConfig().

Includes 7 new tests: 4 verifying advisory review fires with no config, 3
verifying the postgres shield overrides to block.
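
The precedence described above (shield rules placed before advisory rules, first match wins) can be sketched like this. The regexes, the `SqlRule` shape, the `evaluateSql` helper, and the default `allow` verdict are illustrative assumptions, not the project's real rule format.

```typescript
type Verdict = "allow" | "review" | "block";

interface SqlRule {
  name: string;
  pattern: RegExp;
  verdict: Verdict;
}

// Rule names come from the commit message; the patterns are assumptions.
const ADVISORY_RULES: SqlRule[] = [
  { name: "review-drop-table-sql", pattern: /\bDROP\s+TABLE\b/i, verdict: "review" },
  { name: "review-truncate-sql", pattern: /\bTRUNCATE\s+TABLE\b/i, verdict: "review" },
  { name: "review-drop-column-sql", pattern: /\bDROP\s+COLUMN\b/i, verdict: "review" },
];

// Hypothetical shield rule that upgrades DROP TABLE from review to block.
const POSTGRES_SHIELD: SqlRule[] = [
  { name: "shield:postgres:block-drop-table", pattern: /\bDROP\s+TABLE\b/i, verdict: "block" },
];

// First match wins, so rules earlier in the list take precedence.
function evaluateSql(sql: string, rules: readonly SqlRule[]): Verdict {
  for (const rule of rules) {
    if (rule.pattern.test(sql)) return rule.verdict;
  }
  return "allow"; // assumed default when nothing matches
}

// No shield configured: the advisory rule asks for human review.
const noShield = evaluateSql("drop TABLE users", ADVISORY_RULES);

// Shield rules are inserted first, so the same statement is blocked.
const withShield = evaluateSql("DROP TABLE users", [...POSTGRES_SHIELD, ...ADVISORY_RULES]);
```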

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: shield set/unset — per-rule verdict overrides + config show

- `node9 shield set <shield> <rule> <verdict>` — override any shield rule's
  verdict without touching config.json. Stored in shields.json under an
  `overrides` key, applied at runtime in getConfig(). Accepts full rule
  name, short name, or operation name (e.g. "drop-table" resolves to
  "shield:postgres:block-drop-table").

- `node9 shield unset <shield> <rule>` — remove an override, restoring
  the shield default.

- `node9 shield status` — now shows each rule's verdict individually,
  with override annotations ("← overridden (was: block)").

- `node9 config show` — new command: full effective runtime config
  including active shields with per-rule verdicts, built-in rules,
  advisory rules, and dangerous words.
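
One way the rule-name resolution for `shield set` could work is sketched below, assuming rule names follow the `shield:<shield>:<verb>-<operation>` shape seen in the example above. `resolveRuleName` and the second rule name are hypothetical; the real resolver may differ.

```typescript
// Resolves user input to a full rule name, trying in order:
// 1. the full rule name, 2. the short name (without the shield prefix),
// 3. the operation suffix (first rule that ends with "-<input>").
function resolveRuleName(
  shield: string,
  ruleNames: readonly string[],
  input: string,
): string | undefined {
  const prefix = `shield:${shield}:`;
  return (
    ruleNames.find((name) => name === input) ??
    ruleNames.find((name) => name === prefix + input) ??
    ruleNames.find((name) => name.startsWith(prefix) && name.endsWith(`-${input}`))
  );
}

// "shield:postgres:block-truncate" is a made-up name for illustration.
const postgresRules = ["shield:postgres:block-drop-table", "shield:postgres:block-truncate"];

const byOperation = resolveRuleName("postgres", postgresRules, "drop-table");
const byShortName = resolveRuleName("postgres", postgresRules, "block-drop-table");
```

Because the suffix lookup returns the first match, two rules sharing an operation suffix would resolve to whichever is listed first.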

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — allow verdict guard, null assertion, test reliability

- shield set allow now requires --force to prevent silent rule silencing;
  exits 1 with a clear warning and the exact re-run command otherwise
- Remove getShield(name)! non-null assertion in error branch
- Fix mockReturnValue → mockReturnValueOnce to prevent test state leak
- Add missing tests: shield set allow guard (integration), unset no-op,
  mixed-case SQL matching (DROP table, drop TABLE, TRUNCATE table)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — shield override security hardening

- Add isShieldVerdict() type guard; replace manual triple-comparison in
  CLI set command and remove unsafe `verdict as ShieldVerdict` cast
- Add validateOverrides() to sanitize shields.json on read — tampered
  disk content with non-ShieldVerdict values is silently dropped before
  reaching the policy engine
- Fix clearShieldOverride() to be a true no-op (skip disk write) when
  the rule has no existing override
- Add comment to resolveShieldRule() documenting first-match behavior
  for operation-suffix lookup to warn against future naming conflicts
- Tests: fix no-op assertion (assert not written), add isShieldVerdict
  suite, add schema validation tests for tampered overrides, add
  authorizeHeadless test for shield-overridden allow verdict

Note: issue #5 (shield status stdout vs stderr) cannot be fixed here —
the pre-commit hook enforces no new console.log in cli.ts to keep stdout
clean for the JSON-RPC/MCP hook code paths in the same file.
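
A sketch of the type guard and on-read sanitization described in the first two bullets. The verdict set (`allow`, `review`, `block`) and the nested `shield -> rule -> verdict` layout of the overrides are assumptions inferred from the commit text, and the function bodies are illustrative rather than the project's code.

```typescript
// Assumed verdict set; the commit only says there are three values.
const SHIELD_VERDICTS = ["allow", "review", "block"] as const;
type ShieldVerdict = (typeof SHIELD_VERDICTS)[number];

function isShieldVerdict(value: unknown): value is ShieldVerdict {
  return typeof value === "string" && (SHIELD_VERDICTS as readonly string[]).includes(value);
}

// Assumed layout: { [shield]: { [rule]: verdict } }.
type Overrides = Record<string, Record<string, ShieldVerdict>>;

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Keeps only well-formed entries; anything else is dropped before it can
// reach the policy engine.
function validateOverrides(raw: unknown): Overrides {
  // Null-prototype objects so keys like "__proto__" are treated as plain data.
  const clean: Overrides = Object.create(null);
  if (!isPlainObject(raw)) return clean;

  for (const [shield, rules] of Object.entries(raw)) {
    if (!isPlainObject(rules)) continue;
    for (const [rule, verdict] of Object.entries(rules)) {
      if (!isShieldVerdict(verdict)) continue;
      const bucket: Record<string, ShieldVerdict> = clean[shield] ?? Object.create(null);
      bucket[rule] = verdict;
      clean[shield] = bucket;
    }
  }
  return clean;
}
```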

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address second code review — audit trail, tamper warning, trust boundary

- Export appendConfigAudit() from core.ts; call it from CLI when an allow
  override is written with --force so silenced rules appear in audit.log
- validateOverrides() now emits a stderr warning (with shield/rule detail)
  when an invalid verdict is dropped, making tampering visible to the user
- Add JSDoc to writeShieldOverride() documenting the trust boundary: it is
  a raw storage primitive with no allow guard; callers outside the CLI must
  validate rule names via resolveShieldRule() first; daemon does not expose
  this endpoint
- Tests: add stderr-warning test for tampered verdicts; add cache-
  invalidation test verifying _resetConfigCache() causes allow overrides
  to be re-read from disk (mock) on the next evaluatePolicy() call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: close remaining review gaps — first-match, allow-no-guard, TOCTOU

- Issue 5: add test proving resolveShieldRule first-match-wins behavior
  when two rules share an operation suffix; uses a temporary SHIELDS
  mutation (restored in finally) to simulate the ambiguous catalog case
- Issue 6: add explicit test documenting that writeShieldOverride accepts
  allow verdict without any guard — storage primitive contract, CLI is
  the gatekeeper
- Issue 8: add TOCTOU characterization test showing that concurrent
  writeShieldOverride calls with a stale read lose the first write; makes
  the known file-lock limitation explicit and regression-testable
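
The lost-update scenario from Issue 8 can be reproduced in a few lines. This sketch uses an in-memory string as a stand-in for shields.json; `readOverrides` and `writeOverrides` are hypothetical helpers, not the real storage functions.

```typescript
type FlatOverrides = Record<string, string>;

// In-memory stand-in for the overrides file on disk.
let disk = "{}";
const readOverrides = (): FlatOverrides => JSON.parse(disk) as FlatOverrides;
const writeOverrides = (next: FlatOverrides): void => {
  disk = JSON.stringify(next);
};

// Two writers read the same snapshot before either one writes.
const snapshotA = readOverrides();
const snapshotB = readOverrides();

// Writer A persists its change.
writeOverrides({ ...snapshotA, "rule-a": "review" });

// Writer B builds its update from the stale snapshot and overwrites A.
writeOverrides({ ...snapshotB, "rule-b": "allow" });

// Only "rule-b" survives; the "rule-a" write was lost.
const finalState = readOverrides();
```

Avoiding this would need a file lock or an atomic compare-and-swap around the read-modify-write; the commit only characterizes the limitation.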

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: spawn daemon via process.execPath to fix ENOENT on Windows (#41)

spawn('node9', ...) fails on Windows because npm installs a .cmd shim,
not a bare executable. Node.js child_process.spawn without { shell: true }
cannot resolve .cmd/.ps1 wrappers.

Replace all three bare spawn('node9', ['daemon'], ...) call sites in
cli.ts with spawn(process.execPath, [process.argv[1], 'daemon'], ...),
consistent with the pattern already used in src/tui/tail.ts:
  - autoStartDaemonAndWait()
  - daemon --openui handler
  - daemon --background handler
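
The replacement pattern can be sketched as a small helper that builds the command and argument list. `daemonSpawnArgs` is a hypothetical name, and the `detached`/`stdio` options in the usage comment are assumptions, not options confirmed by the commit.

```typescript
// Builds [command, args] for launching the daemon through the current Node
// binary and the current CLI script, instead of relying on a `node9` shim
// being resolvable on PATH.
function daemonSpawnArgs(execPath: string, scriptPath: string): [string, string[]] {
  return [execPath, [scriptPath, "daemon"]];
}

// Usage in a Node.js CLI (sketch only):
//
//   import { spawn } from "node:child_process";
//   const [command, args] = daemonSpawnArgs(process.execPath, process.argv[1]);
//   const child = spawn(command, args, { detached: true, stdio: "ignore" });
//   child.unref();

const [command, args] = daemonSpawnArgs("/usr/bin/node", "/app/dist/cli.js");
```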

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(ci): regression guard + Windows CI for spawn fix (#41)

- Add spawn-windows.test.ts: two static source-guard tests that read
  cli.ts and assert (a) no bare spawn('node9'...) pattern exists and
  (b) exactly 3 spawn(process.execPath, ...) daemon call sites exist.
  Prevents the ENOENT regression from silently reappearing.

- Add .github/workflows/ci.yml: runs typecheck, lint, and npm test on
  both ubuntu-latest and windows-latest on every push/PR to main and dev.
  The Windows runner will catch any spawn('node9'...) regression
  immediately since it would throw ENOENT in integration tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step before tests — integration tests require dist/cli.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): remove NODE_ENV=test prefix from npm scripts — Windows compat

'NODE_ENV=test cmd' syntax is Unix-only and fails on Windows with
'not recognized as an internal or external command'.

Vitest already sets NODE_ENV=test when it is not otherwise set (and sets
process.env.VITEST), making the prefix redundant. Remove it from the
test, test:watch, and test:ui scripts so they work on all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): use cross-platform path assertions in undo.test.ts

Replace hardcoded Unix path separators with path.join() and regex
/[/\\]\.git[/\\]/ so assertions pass on Windows CI.
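
A quick illustration of why the separator-agnostic regex works on both platforms. The sample paths are made up for the example.

```typescript
// Matches a ".git" path segment delimited by either "/" or "\".
const GIT_DIR_SEGMENT = /[/\\]\.git[/\\]/;

const unixMatch = GIT_DIR_SEGMENT.test("/repo/.git/index");
const windowsMatch = GIT_DIR_SEGMENT.test("C:\\repo\\.git\\index");

// ".github" is not followed by a separator right after ".git", so no match.
const githubMatch = GIT_DIR_SEGMENT.test("/repo/.github/workflows/ci.yml");
```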

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): cross-platform path and HOME fixes for Windows CI

setup.test.ts: replace hardcoded /mock/home/... constants with
path.join(os.homedir(), ...) so path comparisons match on Windows.
doctor.test.ts: set USERPROFILE=homeDir alongside HOME so
os.homedir() resolves the isolated test directory on Windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): Windows HOME/USERPROFILE and EBUSY fixes

mcp.integration.test.ts: add makeEnv() helper that sets both HOME
and USERPROFILE so spawned node9 processes resolve os.homedir() to
the isolated test directory on Windows. Add EBUSY guard in cleanupDir
for Windows temp file locking after spawnSync.

protect.test.ts: use path.join(os.homedir(), ...) for mock paths in
setPersistentDecision so existsSpy matches on Windows backslash paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests): propagate HOME as USERPROFILE in check integration tests

runCheck/runCheckAsync now set USERPROFILE=HOME so spawned node9
processes resolve os.homedir() to the isolated test directory on
Windows. Apply the same fix to standalone spawnSync calls using
minimalEnv. Add EBUSY guard in cleanupHome for Windows temp locking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(tests,dlp): four Windows CI fixes

mcp.integration.test.ts: use list_directory instead of write_file for
the no-cwd backward-compat test — write_file triggers git add -A on
os.tmpdir() which can index thousands of files on Windows and ETIMEDOUT.

gemini_integration.test.ts: add path import; replace hardcoded
/mock/home/... paths with path.join(os.homedir(), ...) so existsSpy
matches on Windows backslash paths.

daemon.integration.test.ts: add USERPROFILE=tmpHome to daemon spawn
env so os.homedir() resolves to the isolated shields.json. Add EBUSY
guard in cleanupDir.

dlp.ts: broaden /etc/passwd|shadow|sudoers patterns to
^(?:[a-zA-Z]:)?\/etc\/... so they match Windows-normalized paths like
C:/etc/passwd in addition to Unix /etc/passwd.
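
A sketch of the broadened pattern. The commit elides the tail of the regex, so the single alternation over `passwd|shadow|sudoers`, the `$` anchor, and the `normalizeSlashes` helper are all assumptions for illustration; the real dlp.ts may use separate patterns.

```typescript
// Optional drive letter, then /etc/, then one of the sensitive file names.
// Assumption: the elided part of the original pattern is this alternation.
const SENSITIVE_ETC_FILE = /^(?:[a-zA-Z]:)?\/etc\/(?:passwd|shadow|sudoers)$/;

// Hypothetical normalizer: converts Windows backslashes to forward slashes.
function normalizeSlashes(p: string): string {
  return p.replace(/\\/g, "/");
}

function isSensitiveEtcPath(p: string): boolean {
  return SENSITIVE_ETC_FILE.test(normalizeSlashes(p));
}
```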

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): address code review findings

ci.yml: add format:check step and Node 22 to matrix (package.json
declares >=18 — both LTS versions should be covered).

check/mcp/daemon integration tests: add makeEnv() helpers for
consistent HOME+USERPROFILE isolation; add console.warn on EBUSY
so temp dir leaks are visible rather than silent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): enforce LF line endings so Prettier passes on Windows

Add endOfLine: lf to .prettierrc so Prettier always checks/writes LF
regardless of OS. Add .gitattributes with eol=lf so Git does not
convert line endings on Windows checkout. Without these, format:check
fails on every file on Windows CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci,tests): align makeEnv signatures and add dist verification

check.integration.test.ts: makeEnv now spreads process.env (same as
mcp and daemon helpers) so PATH, NODE_ENV=test (set by Vitest), and
other inherited vars reach spawned child processes. Standalone
spawnSync calls simplified to makeEnv(tmpHome, {NODE9_TESTING:'1'}).
Remove unused minimalEnv from shield describe block.
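
A minimal sketch of such a helper. Unlike the `makeEnv(tmpHome, {...})` call shown above, this version takes the base environment as an explicit first argument so the example stays pure; in a real test helper that argument would be `process.env`. Setting both variables matters because `os.homedir()` consults HOME on POSIX and USERPROFILE on Windows.

```typescript
type Env = Record<string, string | undefined>;

// Spreads the inherited environment first so PATH and friends reach the child,
// then pins both home variables to the isolated directory, then applies extras.
function makeEnv(baseEnv: Env, home: string, extra: Env = {}): Env {
  return { ...baseEnv, HOME: home, USERPROFILE: home, ...extra };
}

const childEnv = makeEnv(
  { PATH: "/usr/bin", HOME: "/home/real-user" },
  "/tmp/node9-test-home",
  { NODE9_TESTING: "1" },
);
```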

ci.yml: add Verify dist artifacts step after build to fail fast with
a clear message if dist/cli.js or dist/index.js are missing. Add
comment explaining NODE_ENV=test / NODE9_TESTING guard coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: interactive terminal approval via /dev/tty (SSE + [A]/[D])

Replaces the broken @inquirer/prompts stdin racer with a /dev/tty-based
approval prompt that works as a Claude Code PreToolUse subprocess:

- New src/ui/terminal-approval.ts: opens /dev/tty for raw keypress I/O,
  acquires CSRF token from daemon SSE, renders ANSI approval card, reads
  [A]/[D], posts decision via POST /decision/{id}. Handles abort (another
  racer won) with cursor/card cleanup and SIGTERM/exit guard.

- Daemon entry shared between browser (GET /wait) and terminal (POST /decision)
  racers: extract registerDaemonEntry() + waitForDaemonDecision() from the
  old askDaemon() so both racers operate on the same pending entry ID.

- POST /decision idempotency: first write wins; second call returns 409
  with the existing decision. Prevents race between browser and terminal
  racers from corrupting state.

- CSRF token emitted on every SSE connection (re-emit existing token, never
  regenerate). Terminal racer acquires it by opening /events and reading
  the first csrf event.

- approvalTimeoutSeconds user-facing config alias (converts to ms);
  raises default timeout from 30s to 120s. Daemon auto-deny timer and
  browser countdown now use the config value instead of a hardcoded constant.

- isTTYAvailable() probe: tries /dev/tty open(); disabled on Windows
  (native popup racer covers that path). NODE9_FORCE_TERMINAL_APPROVAL=1
  bypasses the probe for tmux/screen users.

- Integration tests: CSRF re-emit across two connections, POST /decision
  idempotency (both allow-first and deny-first cases).
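
The first-write-wins behavior of POST /decision can be sketched with an in-memory map. The `PendingEntry` shape, the `postDecision` helper, and the 404 for unknown IDs are assumptions for illustration; only the 409-with-existing-decision behavior comes from the commit.

```typescript
type Decision = "allow" | "deny";

interface PendingEntry {
  decision?: Decision;
}

interface DecisionResponse {
  status: 200 | 404 | 409;
  decision?: Decision;
}

const pending = new Map<string, PendingEntry>();

function postDecision(id: string, decision: Decision): DecisionResponse {
  const entry = pending.get(id);
  if (!entry) return { status: 404 }; // assumed behavior for unknown IDs

  // A decision already exists: report it and leave state untouched.
  if (entry.decision !== undefined) {
    return { status: 409, decision: entry.decision };
  }

  entry.decision = decision;
  return { status: 200, decision };
}

pending.set("req-1", {});
const first = postDecision("req-1", "deny"); // first racer wins
const second = postDecision("req-1", "allow"); // second racer gets 409 with "deny"
```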

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Smart Router — node9 tail as interactive approval terminal

Implements a multi-phase Smart Router architecture so `node9 tail` can
serve as a full approval channel alongside the browser dashboard and
native popup.

Phase 1 — Daemon capability tracking (daemon/index.ts):
- SseClient interface tracks { res, capabilities[] } per SSE connection
- /events parses ?capabilities=input from URL; stored on each client
- broadcast() updated to use client.res.write()
- hasInteractiveClient() exported — true when any tail session is live
- broadcast('add') now fires when terminal approver is enabled and an
  interactive client is connected, not only when browser is enabled

Phase 2 — Interactive approvals in tail (tui/tail.ts):
- Connects with ?capabilities=input so daemon identifies it as interactive
- Captures CSRF token from the 'csrf' SSE event
- Handles init.requests (approvals pending before tail connected)
- Handles add/remove SSE events; maintains an approval queue
- Shows one ANSI card at a time ([A] Allow / [D] Deny) using
  tty.ReadStream raw-mode keypress on fd 0
- POSTs decisions via /decision/{id} with source:'terminal'; 409 is non-error
- Cards clear themselves; next queued request shown automatically

Phase 3 — Racer 3 widened (core.ts):
- Racer 3 guard changed from approvers.browser to
  (approvers.browser || approvers.terminal) so tail participates in the
  race via the same waitForDaemonDecision mechanism as the browser
- Guidance printed to stderr when browser is off:
  "Run `node9 tail` in another terminal to approve."

Phase 4 — node9 watch command (cli.ts):
- New `watch <command> [args...]` starts the daemon with NODE9_WATCH_MODE=1
  (no idle timeout), prints a tip about node9 tail, then runs the
  wrapped command via spawnSync

Decision source tracking (all layers):
- POST /decision now accepts optional source field ('browser'|'terminal')
- Daemon stores decisionSource on PendingEntry; GET /wait returns it
- waitForDaemonDecision returns { decision, source }
- Racer 3 label uses actual source instead of guessing from config:
  "User Decision (Terminal (node9 tail))" vs "User Decision (Browser Dashboard)"
- Browser UI sends source:'browser'; tail sends source:'terminal'

Tests:
- daemon.integration.test.ts: 3 new tests for source tracking round-trip
  (terminal, browser, and omitted source)
- spawn-windows.test.ts: updated count from 3 to 4 spawn call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: enable /dev/tty approval card in Claude terminal (hook path)

The check command was passing allowTerminalFallback=false to
authorizeHeadless, which disabled Racer 4 (/dev/tty) in the hook path.
This meant the approval card only appeared in the node9 tail terminal,
requiring the user to switch focus to respond.

Change both call sites (initial + daemon-retry) to true so Racer 4 runs
alongside Racer 3. The [A]/[D] card now appears in the Claude terminal
as well — the user can respond from either terminal, whichever has focus.
The 409 idempotency already handles the race correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prevent background authorizeHeadless from overwriting POST /decision

When POST /decision arrives before GET /wait connects, it sets
earlyDecision on the PendingEntry. The background authorizeHeadless
call (which runs concurrently) could then overwrite that decision in
its .then() handler — visible as the idempotency test getting
'allow' back instead of the posted 'deny'.

Guard: after noApprovalMechanism check, return early if earlyDecision
is already set. First write wins.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: route sendBlock terminal output to /dev/tty instead of stderr

Claude Code treats any stderr output from a PreToolUse hook as a hook
error and fails open — the tool proceeds even when the hook writes a
valid permissionDecision:deny JSON to stdout. This meant git push and
other blocked commands were silently allowed through.

Fix: replace all console.error calls in the block/deny path with
writes to /dev/tty, an out-of-band channel that bypasses Claude Code's
stderr pipe monitoring. /dev/tty failures are caught silently so CI
and non-interactive environments are unaffected.

Add a writeTty() helper in core.ts used for all status messages in
the hook execution path (cloud error, waiting-for-approval banners,
cloud result). Update two integration tests that previously asserted
block messages appeared on stderr — they now assert stderr is empty,
which is the regression guard for this bug.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: don't auto-resolve daemon entries in audit mode

In audit mode, background authorizeHeadless resolves immediately with
checkedBy:'audit'. The .then() handler was setting earlyDecision='allow'
before POST /decision could arrive from browser/tail, causing subsequent
POST /decision calls to get 409 and GET /wait to return 'allow' regardless
of what the user posted.

Audit mode means the hook auto-approves — it doesn't mean the daemon
dashboard should also auto-resolve. Leave the entry alive so browser/tail
can still interact with it (or the auto-deny timer fires).

Fixes source-tracking integration test failures on CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: close /dev/tty fd in finally block to prevent leak on write error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove Racer 4 (/dev/tty card in Claude terminal)

Racer 4 interrupted the AI's own terminal with an approval prompt,
which is wrong on multiple levels:
- The AI terminal belongs to the AI agent, not the human approver
- Different AI clients (Gemini CLI, Cursor, etc.) handle terminals
  differently — /dev/tty tricks are fragile across environments
- It created duplicate prompts when node9 tail was also running

Approval channels should all be out-of-band from the AI terminal:
  1. Cloud/SaaS (Slack, mission control)
  2. Native OS popup
  3. Browser dashboard
  4. node9 tail (dedicated approval terminal)

Remove: Racer 4 block in core.ts, allowTerminalFallback parameter
from authorizeHeadless/_authorizeHeadlessCore and all callers,
isTTYAvailable/askTerminalApproval imports, terminal-approval.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: make hook completely silent — remove all writeTty calls from core.ts

The hook must produce zero terminal output in the Claude terminal.
All writeTty status messages (shadow mode, cloud handshake failure,
waiting for approval, approved/denied via cloud) have been removed.
Also removed the now-unused chalk import and writeTty helper function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source allowlist, CSRF 403 tests, watch error handling

- daemon/index.ts: validate POST /decision source field against allowlist
  ('terminal' | 'browser' | 'native') — silently drop invalid values to
  prevent audit log injection
- daemon.integration.test.ts: add CSRF 403 test (missing token), CSRF 403
  test (wrong token), and invalid source value test — the three most
  important negative tests flagged by code review
- cli.ts: check result.error in node9 watch so ENOENT exits non-zero
  instead of silently exiting 0
- test helper: use fixed string 'echo register-label' instead of
  interpolated echo ${label} (shell injection hygiene in test code)
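
The source allowlist from the first bullet can be sketched as a small sanitizer. `sanitizeSource` is a hypothetical name; the three allowed values are the ones listed above.

```typescript
const VALID_SOURCES = ["terminal", "browser", "native"] as const;
type DecisionSource = (typeof VALID_SOURCES)[number];

// Returns the source only if it is one of the known values; anything else
// (wrong type, unknown string, embedded newlines) is dropped as undefined.
function sanitizeSource(value: unknown): DecisionSource | undefined {
  return typeof value === "string" && (VALID_SOURCES as readonly string[]).includes(value)
    ? (value as DecisionSource)
    : undefined;
}

const accepted = sanitizeSource("terminal");
const dropped = sanitizeSource("browser\n2026-01-01 FAKE AUDIT LINE");
```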

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove stderr write from askNativePopup; drop sendDesktopNotification

- native.ts: process.stderr.write in askNativePopup caused Claude Code to
  treat the hook as an error and fail open — removed entirely
- core.ts: sendDesktopNotification called notify-send which routes through
  Firefox on Linux (D-Bus handler), causing spurious browser popups —
  removed the audit-mode notification call and unused import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass cwd through hook→daemon so project config controls browser open

Root cause: daemon called getConfig() without cwd, reading the global
config. If ~/.node9/node9.config.json doesn't exist, approvers default
to true — so browser:false in a project config was silently ignored,
causing the daemon to open Firefox on every pending approval.

Fix:
- cli.ts: pass cwd from hook payload into authorizeHeadless options
- core.ts: propagate cwd through _authorizeHeadlessCore → registerDaemonEntry
  → POST /check body; use getConfig(options.cwd) so project config is read
- daemon/index.ts: extract cwd from POST /check, call getConfig(cwd)
  for browserEnabled/terminalEnabled checks
- native.ts: remove process.stderr.write from askNativePopup (fail-open bug)
- core.ts: remove sendDesktopNotification (notify-send routes through Firefox
  on Linux via D-Bus, causing spurious browser notifications)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: always broadcast 'add' when terminalEnabled — restore tail visibility

After the cwd fix, browserEnabled correctly became false when browser:false
is set in project config. But the broadcast condition gated on
hasInteractiveClient(), which returns false if tail isn't connected at the
exact moment the check arrives — silently dropping entries from tail.

Fix: broadcast whenever browserEnabled OR terminalEnabled, regardless of
client connection state. Tail sees pending entries via the SSE stream's
initial state when it connects, so timing of connection doesn't matter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: hybrid security model — local UI always wins the race

- Remove isRemoteLocked: terminal/browser/native racers always participate
  even when approvers.cloud is enabled; cloud is now audit-only unless it
  responds first (headless VM fallback)
- Add decisionSource field to AuthResult so resolveNode9SaaS can report
  which channel decided (native/terminal/browser) as decidedBy in the PATCH
- Fix resolveNode9SaaS: log errors to hook-debug.log instead of silent catch
- Fix tail [A]/[D] keypresses: switch from raw 'data' buffer to readline
  emitKeypressEvents + 'keypress' events — fixes unresponsive cards
- Fix tail card clear: SAVE/RESTORE cursor instead of fragile MOVE_UP(n)
- Add cancelActiveCard so 'remove' SSE event properly dismisses active card
- Fix daemon duplicate browser tab: browserOpened flag + NODE9_BROWSER_OPENED
  env so auto-started daemon and node9 tail don't both open a tab
- Fix slackDelegated: skip background authorizeHeadless to prevent duplicate
  cloud request that never resolves in Mission Control
- Add interactive field to SSE 'add' event so browser-only configs don't
  render a terminal card
- Add dev:tail script that parses JSON PID file correctly
- Add req.on('close') cleanup for abandoned long-poll entries
- Add regression tests for all three bugs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear CI env var in test to unblock native racer on GitHub Actions

Also make the poll fetch mock respond to AbortSignal so the cloud poll
racer exits cleanly when native wins, preventing test timeout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — source injection tests, dev:tail safety, CI env guard

- Add explicit source boundary tests: null/number/object are all rejected
  by the VALID_SOURCES allowlist (implementation was already correct)
- Replace kill \$(...) shell expansion in dev:tail with process.kill() inside
  Node.js — removes \$() substitution vulnerability if pid file were crafted
- Add afterEach safety net in core.test.ts to restore VITEST/CI/NODE_ENV
  in case the test crashes before the try/finally block restores them
- Increase slackDelegated timing wait from 200ms to 500ms for slower CI
- Fix section numbering gap: 10 → 11 was left after removing a test (now 10)
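
For the dev:tail change, reading the PID inside Node.js could look like the sketch below. The `{ "pid": <number> }` file shape and the `parsePidFile` name are assumptions; the commit only states that the PID file is JSON and that `process.kill()` replaces the shell expansion.

```typescript
// Extracts a positive integer PID from JSON text, or undefined if the
// contents are malformed. Nothing from the file is ever passed to a shell.
function parsePidFile(contents: string): number | undefined {
  try {
    const parsed: unknown = JSON.parse(contents);
    if (typeof parsed !== "object" || parsed === null) return undefined;
    const pid = (parsed as { pid?: unknown }).pid;
    return typeof pid === "number" && Number.isInteger(pid) && pid > 0 ? pid : undefined;
  } catch {
    return undefined;
  }
}

// Usage in a Node.js script (sketch only):
//
//   const pid = parsePidFile(fs.readFileSync(pidFilePath, "utf8"));
//   if (pid !== undefined) process.kill(pid);
```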

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use early return in it.each — Vitest does not pass context to it.each callbacks

Context injection via { skip } works in plain it() but not in it.each(),
where the third argument is undefined. Switch to early return, which is
equivalent since the entire describe block skips when portWasFree is false.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): correct YAML indentation in ci.yml — job properties were siblings not children

name/runs-on/strategy/steps were indented 2 spaces (sibling to `test:`)
instead of 4 spaces (properties of the `test:` job). GitHub Actions was
ignoring the custom name template, so checks were reported without the
Node version suffix and the required branch-protection check
"CI / Test (ubuntu-latest, Node 20)" was stuck as "Expected" forever.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add 15s timeout to daemon beforeAll hooks — prevents CI timeout

waitForDaemon(6s) + readSseStream(3s) can take up to 9s; the default Vitest
hookTimeout of 10s is too tight on slow CI runners (Ubuntu, Windows).
All three daemon describe-block beforeAll hooks now declare an explicit
15_000ms timeout to give CI sufficient headroom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add daemonProc.kill() fallback in afterAll cleanup

If `node9 daemon stop` fails or times out, the spawned daemon process
would leak. Added daemonProc?.kill() as a defensive fallback after
spawnSync in all three daemon describe-block afterAll hooks.

The CSRF 403 tests (missing/wrong token) already exist at lines 574-598
and were flagged as absent only because the bot's diff was truncated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): address code review — vi.stubEnv, runtime shape checks, abandon-timer comment

- core.test.ts: replace manual env save/delete/restore with vi.stubEnv +
  vi.unstubAllEnvs() in afterEach. Eliminates the fragile try/finally and
  the risk of coercing undefined to the string "undefined". Adds a KEEP IN
  SYNC comment so future isTestEnv additions are caught immediately.

- daemon.integration.test.ts: replace unchecked `as { ... }` casts in
  idempotency tests with `unknown` + toMatchObject — gives a clear failure
  message if the response shape is wrong instead of silently passing.

- daemon.integration.test.ts: add comment explaining why idempotency tests
  do not need a /wait consumer — the abandon timer only fires when an SSE
  connection closes with pending items; no SSE client connects during
  these tests so entries are safe from eviction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): guard daemonProc.kill() with exitCode check — avoid spurious SIGTERM

Calling daemonProc.kill() unconditionally after a successful `daemon stop`
sends SIGTERM to an already-dead process, which can produce a spurious error
log on some platforms. Only kill if exitCode === null (process still running).
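
The guard can be sketched against a minimal process-like interface. `ProcLike` and `killIfRunning` are hypothetical names; the `exitCode === null` check mirrors Node's `ChildProcess.exitCode`, which is `null` while the child is still running.

```typescript
// Minimal subset of a child process handle needed for the guard.
interface ProcLike {
  exitCode: number | null;
  kill(): boolean;
}

// Sends the kill signal only when the process has not exited yet.
// Returns true if a signal was actually sent.
function killIfRunning(proc: ProcLike | undefined): boolean {
  if (proc && proc.exitCode === null) {
    return proc.kill();
  }
  return false;
}
```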

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add @vitest/coverage-v8 — baseline coverage report (PR #0)

Installs @vitest/coverage-v8 and configures coverage in vitest.config.mts.
Adds `npm run test:coverage` script.

Baseline (instrumentable files only — cli.ts and daemon/index.ts are
subprocess-only and cannot be instrumented by v8):

  Overall  67.68% stmts  58.74% branches
  core.ts  62.02% stmts  54.13% branches  ← primary refactor target
  undo.ts  87.01% stmts  80.00% branches
  shields  97.46% stmts  94.64% branches
  dlp.ts   94.82% stmts  92.85% branches
  setup    93.67% stmts  80.92% branches

This baseline will be used to verify coverage improves (or holds) after
each incremental refactor PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr1): extract src/audit/ and src/config/ from core.ts

Move audit helpers (redactSecrets, appendToLog, appendHookDebug,
appendLocalAudit, appendConfigAudit) to src/audit/index.ts and
move all config types, constants, and loading logic (Config,
SmartRule, DANGEROUS_WORDS, DEFAULT_CONFIG, getConfig, getCredentials,
getGlobalSettings, hasSlack, listCredentialProfiles) to
src/config/index.ts.

core.ts kept as barrel re-exporting from the new modules so all
existing importers (cli.ts, daemon/index.ts, tests) are unchanged.
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: remove trailing blank lines in core.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(pr2): extract src/policy/ and src/utils/regex from core.ts

Move the entire policy engine to src/policy/index.ts:
  evaluatePolicy, explainPolicy, shouldSnapshot, evaluateSmartConditions,
  checkDangerousSql, isIgnoredTool, matchesPattern and all private
  helpers (tokenize, getNestedValue, extractShellCommand, analyzeShellCommand).

Move ReDoS-safe regex utilities to src/utils/regex.ts:
  validateRegex, getCompiledRegex — no deps on config or policy,
  consumed by both policy/ and cli.ts via core.ts barrel.

core.ts is now ~300 lines (auth + daemon I/O only).
All 509 tests pass, typecheck and lint are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: restore dev branch to push trigger

CI should run on direct pushes to dev (merge commits, dependency
bumps, etc.), not just on PRs. Flagged by two independent code
review passes on the coverage PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add timeouts to execSync and spawnSync to prevent CI hangs

- doctor command: add timeout:3000 to execSync('which node9') and
  execSync('git --version') — on slow CI machines these can block
  indefinitely and cause the 5000ms vitest test timeout to fire
- runDoctor test helper: add timeout:15000 to spawnSync so the subprocess
  has enough headroom on slow CI without hitting the vitest timeout
- removefrom test loop: increase spawnSync timeout 5000→15000 and add
  result.error assertion for better failure diagnostics on CI
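
The timeout pattern described above can be sketched as follows. The helper name runWithTimeout and the RunResult shape are hypothetical; the `timeout` option and the `result.error` field are standard Node.js child_process behaviour.

```typescript
import { spawnSync } from "node:child_process";

interface RunResult {
  status: number | null;
  stdout: string;
  timedOut: boolean;
}

/** Runs a command synchronously with a hard deadline instead of blocking indefinitely. */
function runWithTimeout(cmd: string, args: string[], timeoutMs: number): RunResult {
  const result = spawnSync(cmd, args, { encoding: "utf8", timeout: timeoutMs });
  // When the deadline passes, Node kills the child and sets result.error
  // (code ETIMEDOUT); status is null because the child never exited normally.
  const code = (result.error as NodeJS.ErrnoException | undefined)?.code;
  return {
    status: result.status,
    stdout: result.stdout ?? "",
    timedOut: code === "ETIMEDOUT",
  };
}
```

Checking `result.error` explicitly in a test makes a killed subprocess show up as a clear failure rather than as an unexplained null status.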

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract src/auth/ from core.ts

Split the authorization race engine out of core.ts into 4 focused modules:
- auth/state.ts  — pause, trust sessions, persistent decisions
- auth/daemon.ts — daemon PID check, entry registration, long-polling
- auth/cloud.ts  — SaaS handshake, poller, resolver, local-allow audit
- auth/orchestrator.ts — multi-channel race engine (authorizeHeadless)

core.ts is now a 40-line backwards-compat barrel. 509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address code review — coverage thresholds + undici vuln

- vitest.config.mts: add coverage thresholds at current baseline (68%
  stmts, 58% branches, 66% funcs, 70% lines) so CI blocks regressions.
  Add json-summary reporter for CI integration. Exclude core.ts (barrel,
  no executable code) and ui/native.ts (OS UI, untestable in CI).
- package.json: pin undici to ^7.24.0 via overrides to resolve 6 high
  severity vulnerabilities in dev deps (@semantic-release, @actions).
  Remaining 7 vulns are in npm-bundled packages (not fixable without
  upgrading npm itself) and dev-only tooling (eslint, handlebars).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: enforce coverage thresholds in CI pipeline

Add coverage step to CI workflow that runs vitest --coverage on
ubuntu/Node 22 only (avoids matrix cost duplication). Thresholds
configured in vitest.config.mts will fail the build if coverage drops
below baseline, closing the gap flagged in code review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: fix double test run — merge coverage into single test step

Replace the two-step (npm test + npm run test:coverage) pattern with a
single conditional: ubuntu/Node 22 runs test:coverage (enforces
thresholds), all other matrix cells run npm test. No behaviour change,
half the execution time on the primary matrix cell.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: extract proxy, negotiation, and duration from cli.ts

- src/proxy/index.ts — runProxy() MCP/JSON-RPC stdio interception
- src/policy/negotiation.ts — buildNegotiationMessage() AI block messages
- src/utils/duration.ts — parseDuration() human duration string parser
- cli.ts: 2088 → 1870 lines, now imports from focused modules
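
For readers unfamiliar with this kind of helper, a human duration parser could look roughly like the sketch below. The accepted format (an integer followed by s, m, h, or d) and the null return for bad input are assumptions; the commit does not describe the real parseDuration() contract.

```typescript
// Assumed input format: <integer><unit>, with unit in s | m | h | d (e.g. "15m").
const UNIT_MS: Record<string, number> = {
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

/** Returns the duration in milliseconds, or null if the string is not recognised. */
function parseDuration(input: string): number | null {
  const match = /^(\d+)\s*([smhd])$/i.exec(input.trim());
  if (!match) return null;
  return Number(match[1]) * UNIT_MS[match[2].toLowerCase()];
}
```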

509 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract shield, check, log commands into focused modules

Moves registerShieldCommand, registerConfigShowCommand, registerCheckCommand,
and registerLogCommand into src/cli/commands/. Extracts autoStartDaemonAndWait
and openBrowserLocal into src/cli/daemon-starter.ts.

cli.ts drops from ~1870 to ~1120 lines. Unused imports removed. Spawn
Windows regression test updated to cover the moved autoStartDaemonAndWait
call site in daemon-starter.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): use string comparison for matrix.node and document coverage intent

GHA expression matrix values are strings; matrix.node == 22 (integer) silently
fails, so coverage never ran on any cell. Fixed to matrix.node == '22'.

Added comments to ci.yml explaining the intentional single-cell threshold
enforcement (branch protection must require the ubuntu/Node 22 job), and
to vitest.config.mts explaining the baseline date and target trajectory.

Also confirmed: npm ls undici shows 7.24.6 everywhere — no conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): replace fragile matrix ternary with dedicated coverage job

Removes the matrix.node == '22' ternary from the test matrix. Coverage now
runs in a standalone 'coverage' job (ubuntu/Node 22 only) that can be
required by name in branch protection — no risk of the job name drifting
or the selector silently failing.

Also adds a comment to tsup.config.ts documenting why devDependency coverage
tooling (@vitest/coverage-v8, @rolldown/*) cannot leak into the production
bundle (tree-shaking — nothing in src/ imports them).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add build step and NODE9_TESTING=1 to coverage job; bump to v1.2.0

Coverage job was missing npm run build, causing integration tests to fail
with "dist/cli.js not found". Also adds NODE9_TESTING=1 env var to prevent
native popup dialogs and daemon auto-start during coverage runs in CI.

Version bumped to 1.2.0 to reflect the completed modular refactor
(core.ts + cli.ts split into focused single-responsibility modules).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add npm audit --omit=dev --audit-level=high to test job

Audits production deps on every CI run. Scoped to --omit=dev because
known CVEs in flatted (eslint chain) and handlebars (semantic-release chain)
are devDep-only and never ship in the production bundle. Production tree
currently shows 0 vulnerabilities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(cli): extract doctor, audit, status, daemon, watch, undo commands

Moves 6 remaining large commands into src/cli/commands/:
  doctor.ts    — health check (165 lines, owns pass/fail/warn helpers)
  audit.ts     — audit log viewer with formatRelativeTime
  status.ts    — current mode/policy/pause display
  daemon-cmd.ts — daemon start/stop/openui/background/watch
  watch.ts     — watch mode subprocess runner
  undo.ts      — snapshot diff + revert UI

cli.ts: 1,141 → 582 lines. Unused imports (execSync, spawnSync, undo funcs,
getCredentials, DAEMON_PORT/HOST) removed. spawn-windows regression test
updated to cover the new module locations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(daemon): split 1026-line index.ts into state, server, barrel

daemon/state.ts  (360 lines) — all shared mutable state, types, utility
  functions, SSE broadcast, Flight Recorder Unix socket, and the
  abandonPending / hadBrowserClient / abandonTimer accessors needed to
  avoid direct ES module let-export mutation across file boundaries.

daemon/server.ts (668 lines) — startDaemon() HTTP server and all route
  handlers (/check, /wait, /decision, /events, /settings, /shields, etc.).
  Imports everything it needs from state.ts; no circular dependencies.

daemon/index.ts  (58 lines) — thin barrel: re-exports public API
  (startDaemon, stopDaemon, daemonStatus, DAEMON_PORT, DAEMON_HOST,
  DAEMON_PID_FILE, DECISIONS_FILE, AUDIT_LOG_FILE, hasInteractiveClient).
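
The accessor approach in daemon/state.ts follows from how ES modules work: an imported `let` binding is read-only in the importing module, so another file cannot assign to it directly. A minimal sketch of the pattern is shown below; the variable names hadBrowserClient and abandonTimer come from the commit, while the function names are hypothetical.

```typescript
// State module (sketch): the variables stay module-private and other files
// read or change them only through these functions.
let hadBrowserClient = false;
let abandonTimer: ReturnType<typeof setTimeout> | null = null;

function getHadBrowserClient(): boolean {
  return hadBrowserClient;
}

function setHadBrowserClient(value: boolean): void {
  hadBrowserClient = value;
}

function hasAbandonTimer(): boolean {
  return abandonTimer !== null;
}

/** Replaces any running timer so at most one abandon timer exists. */
function setAbandonTimer(timer: ReturnType<typeof setTimeout>): void {
  if (abandonTimer !== null) clearTimeout(abandonTimer);
  abandonTimer = timer;
}

function clearAbandonTimer(): void {
  if (abandonTimer !== null) clearTimeout(abandonTimer);
  abandonTimer = null;
}
```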

Also changes two startup console.log calls to console.error (stdout must
stay clean for MCP/JSON-RPC per CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): handle ENOTEMPTY in cleanupDir on Windows CI

Windows creates system junctions (AppData\Local\Microsoft\Windows)
inside any directory set as USERPROFILE, making rmSync fail with
ENOTEMPTY even after recursive deletion. These junctions are harmless
to leak from a temp dir; treat them the same as EBUSY.
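
A cleanup helper that tolerates these two error codes could be sketched like this. The function name cleanupDir comes from the commit; the body and the temp directory prefix are assumptions about how it might be written.

```typescript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Error codes treated as harmless leaks from a temp directory.
const IGNORABLE_CODES = new Set(["EBUSY", "ENOTEMPTY"]);

/** Removes a directory tree, ignoring EBUSY and ENOTEMPTY but rethrowing anything else. */
function cleanupDir(dir: string): void {
  try {
    rmSync(dir, { recursive: true, force: true });
  } catch (err) {
    const code = (err as NodeJS.ErrnoException).code;
    if (code === undefined || !IGNORABLE_CODES.has(code)) throw err;
  }
}

// Usage: create a scratch directory, put a file in it, then clean up.
const scratch = mkdtempSync(join(tmpdir(), "node9-test-"));
writeFileSync(join(scratch, "file.txt"), "x");
cleanupDir(scratch);
```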

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add NODE9_TESTING=1 to test job for consistency with coverage

Without it, spawned child processes in the test matrix could trigger
native popups or daemon auto-start. Matches the coverage job which
already set this env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): tighten npm audit level from high to moderate

Node9 sits on the critical path of every agent tool call — a
moderate-severity prod vuln (e.g. regex DoS in a request parser)
is still exploitable in this context. 0 vulns at moderate level
confirmed before raising the bar.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add missing coverage for auth/state, timeout racer, and daemon unknown-ID

- auth-state.test.ts (new): 18 tests covering checkPause (all branches
  including expired file auto-delete and indefinite expiry), pauseNode9,
  resumeNode9, getActiveTrustSession (wildcard, prune, malformed JSON),
  writeTrustSession (create, replace, prune expired entries)
- core.test.ts: timeout racer test — approvalTimeoutMs:50 fires before any
  other channel, returns approved:false with blockedBy:'timeout'
- daemon.integration.test.ts: POST /decision with unknown UUID → 404
- vitest.config.mts: raise thresholds to match new baseline
  (statements 68→70, branches 58→60, functions 66→70, lines 70→71)
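
The timeout racer behaviour exercised by the new core.test.ts case can be illustrated with a small Promise.race sketch. The result fields approved and blockedBy and the option name approvalTimeoutMs appear in the commit; the function name raceWithTimeout and the overall shape are assumptions, not the orchestrator's real code.

```typescript
interface AuthResult {
  approved: boolean;
  blockedBy?: string;
}

/** Races approval channels against a timer; whichever settles first decides. */
async function raceWithTimeout(
  channels: Promise<AuthResult>[],
  approvalTimeoutMs: number,
): Promise<AuthResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<AuthResult>((resolve) => {
    timer = setTimeout(
      () => resolve({ approved: false, blockedBy: "timeout" }),
      approvalTimeoutMs,
    );
  });
  try {
    return await Promise.race([...channels, timeout]);
  } finally {
    clearTimeout(timer); // do not keep the process alive after a decision
  }
}
```

With a short timeout and channels that never settle, the timer wins and the request is denied, which is the case the new test covers.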

auth/state.ts coverage: 30% → 96% statements, 28% → 89% branches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin @vitest/coverage-v8 to exact version 4.1.2

RC transitive deps (@rolldown/binding-* at 1.0.0-rc.12) are pulled in
via coverage-v8. Pinning prevents silent drift to a newer RC that could
change instrumentation behaviour or introduce new RC-stage transitive deps.

Also verified: obug@2.1.1 is a legitimate MIT-licensed debug utility
from the @vitest/sxzz ecosystem — not a typosquat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add needs: [test] to coverage job

Prevents coverage from producing a misleading green check when the test
matrix fails. Coverage now only runs after all test jobs pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): raise Vitest timeouts for slow CI tests

- cloud denies: approvalTimeoutMs:3000 means the check process runs ~3s
  before the mock cloud responds; default 5s Vitest limit was too tight.
  Raised to 15s.
- doctor 'All checks passed': spawns a subprocess that runs `ss` for
  port detection — slow on CI runners. Raised to 20s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(deps): pin vitest to 4.1.2 to match @vitest/coverage-v8

Both packages must stay in sync — a peer version mismatch causes silent
instrumentation failures. Pinning both to the same exact version prevents
drift when ^ would otherwise allow vitest to bump independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MCP gateway — transparent stdio proxy for any MCP server

Adds `node9 mcp-gateway --upstream <cmd>` which wraps any MCP server
as a transparent stdio proxy. Every tools/call is intercepted and run
through the full authorization engine (DLP, smart rules, shields,
human approval) before being forwarded to the upstream server.

Key implementation details:
- Deferred exit: authPending flag prevents process.exit() while auth
  is in flight, so blocked-tool responses are always flushed first
- Deferred stdin end: mirrors the same pattern for child.stdin.end()
  so approved messages are written before stdin is closed
- Approved writes happen inside the try block, before finally runs
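
The deferred-exit idea above can be sketched in isolation. This version uses a counter rather than a single authPending flag so that overlapping requests are handled, and the class and method names are hypothetical; it is an illustration of the pattern, not the gateway's actual code.

```typescript
/** Holds an exit request until no authorization is in flight. */
class DeferredExit {
  private authPending = 0;
  private exitCode: number | null = null;
  private readonly exit: (code: number) => void;

  constructor(exit: (code: number) => void) {
    this.exit = exit;
  }

  authStarted(): void {
    this.authPending += 1;
  }

  /** Call after the response for the request has been written. */
  authFinished(): void {
    this.authPending -= 1;
    this.flush();
  }

  /** Called when the upstream closes; exits now or once auth completes. */
  requestExit(code: number): void {
    this.exitCode = code;
    this.flush();
  }

  private flush(): void {
    if (this.authPending === 0 && this.exitCode !== null) this.exit(this.exitCode);
  }
}
```

In a real process the callback would be something like `(code) => process.exit(code)`; passing it in keeps the logic testable without terminating the test runner.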

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address code review feedback

- Explicit ignoredTools in allowed-tool test (no implicit default dep)
- Assert result.status === 0 in all success-case tests (null = timeout)
- Throw result.error in runGateway helper so timeout-killed process fails
- Catch ENOTEMPTY in cleanupDir alongside EBUSY (Windows junctions)
- Document parseCommandString is shell-split only, not shell execution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): id validation, resilience tests, review fixes

- Validate JSON-RPC id is string|number|null; return -32600 for object/array ids
- Add resilience tests: invalid upstream JSON forwarded as-is, upstream crash
- Fix runGateway() to accept optional upstreamScript param
- Add status assertions to all blocked-tool tests
- Document parseCommandString safety in mcp-gateway source
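
The id check in the first item can be sketched as below. The error code -32600 is the JSON-RPC 2.0 "Invalid Request" code named in the commit; the function names and the error message text are hypothetical.

```typescript
type JsonRpcId = string | number | null;

const INVALID_REQUEST = -32600; // JSON-RPC 2.0 "Invalid Request"

function isValidId(id: unknown): id is JsonRpcId {
  return id === null || typeof id === "string" || typeof id === "number";
}

/** Returns an error response for a bad id, or null when the message may proceed. */
function rejectInvalidId(message: { id?: unknown }) {
  // A missing id means a notification, which is not an error.
  if (message.id === undefined || isValidId(message.id)) return null;
  return {
    jsonrpc: "2.0" as const,
    id: null, // the offending id cannot be echoed back safely
    error: {
      code: INVALID_REQUEST,
      message: "Invalid Request: id must be a string, number, or null",
    },
  };
}
```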

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): fix shell tokenizer to handle quoted paths with spaces

Replace execa's parseCommandString (which did not handle shell quoting)
with a custom tokenizer that strips double-quotes and respects backslash
escapes. Adds 4 review-driven test improvements: mock upstream silently
drops notifications, runGateway guards killed-by-signal status, shadowed
variable renamed, DLP test builds credential at runtime, upstream path
with spaces test.
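
A tokenizer with the two properties described (double quotes are stripped and group words, a backslash escapes the next character) might look like the following sketch. It is an assumption-based illustration, not the tokenizer shipped in the gateway, and it splits a string only; nothing is passed to a shell.

```typescript
/** Splits a command string on whitespace; double quotes group, backslash escapes. */
function tokenize(command: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inQuotes = false;
  let hasToken = false; // distinguishes an empty quoted token from no token

  for (let i = 0; i < command.length; i++) {
    const ch = command[i];
    if (ch === "\\" && i + 1 < command.length) {
      current += command[++i]; // take the next character literally
      hasToken = true;
    } else if (ch === '"') {
      inQuotes = !inQuotes; // quotes are stripped, not copied
      hasToken = true;
    } else if (/\s/.test(ch) && !inQuotes) {
      if (hasToken) tokens.push(current);
      current = "";
      hasToken = false;
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (hasToken) tokens.push(current);
  return tokens;
}
```

For example, `tokenize('node "/opt/my server/index.js"')` returns `["node", "/opt/my server/index.js"]`. Because this sketch treats every backslash as an escape, a Windows path would need its backslashes doubled.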

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #4 — hermetic env, null-status guard, new tests

- Filter all NODE9_* env vars in runGateway so local developer config
  (NODE9_MODE, NODE9_API_KEY, NODE9_PAUSED) cannot pollute test isolation
- Fix status===null guard: no longer requires result.signal to also be set;
  a killed/timed-out process always throws, preventing silent false passes
- Extract parseResponses() helper to eliminate repeated JSON.parse cast pattern
- Reduce default runGateway timeout from 8s → 5s for fast-path tests
- Add test: malformed (non-JSON) input is forwarded to upstream unchanged
- Add test: tools/call notification (no id) is forwarded without gateway response
- Add comments: mock upstream uses unbuffered stdout.write; /tmp/test.txt path
  is safe because mock never reads disk; NODE9_TESTING only disables UI approvers
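
The environment filtering in the first item can be sketched as a small pure function. The NODE9_ prefix and the NODE9_TESTING variable come from the commit; the function name withoutNode9Vars is hypothetical.

```typescript
type Env = Record<string, string | undefined>;

/** Copies an environment, dropping every NODE9_* variable. */
function withoutNode9Vars(env: Env): Env {
  const clean: Env = {};
  for (const [key, value] of Object.entries(env)) {
    if (!key.startsWith("NODE9_")) clean[key] = value;
  }
  return clean;
}

// A test helper would then opt back in to only what the test needs.
const childEnv: Env = { ...withoutNode9Vars(process.env), NODE9_TESTING: "1" };
```

Filtering by prefix rather than by a fixed list means a NODE9_ variable added later cannot leak into tests by accident.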

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #5 — isolation, typing, timeouts, README note

- afterAll: log cleanup failures to stderr instead of silently swallowing them
- runGateway: document PATH is safe (all spawns use absolute paths); expand
  NODE9_TESTING comment to reference exact source location of what it suppresses
- Replace /tmp/test.txt with /nonexistent/node9-test-only so intent is unambiguous
- Tighten blocked-tool test timeout: 5000ms → 2000ms (approvalTimeoutMs=100ms,
  so a hung auth engine now surfaces as a clear failure rather than a late pass)
- GatewayResponse.result: add explicit tools/ok fields so Array.isArray assertion
  has accurate static type information
- README: add note clarifying --upstream takes a single command string (tokenizer
  splits it); explain double-quoted paths for paths with spaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mcp-gateway): address review #6 — diagnostics, type safety, error handling

- Timeout error now includes partial stdout/stderr so hung gateway failures
  are diagnosable instead of silently discarding the output buffer
- Mock upstream catch block writes to stderr instead of empty catch {} so
  JSON-RPC parse errors surface in test output rather than causing a hang
- parseResponses wraps JSON.parse in try/catch and rethrows with the
  offending line, replacing cryptic map-thrown errors with useful context
- GatewayResponse.result: replace redundant Record<string,unknown> & {..…