Skip to content

feat(security): add tamper-evident audit chain logger#892

Open
gemini2026 wants to merge 294 commits intoNVIDIA:mainfrom
gemini2026:feat/audit-chain
Open

feat(security): add tamper-evident audit chain logger#892
gemini2026 wants to merge 294 commits intoNVIDIA:mainfrom
gemini2026:feat/audit-chain

Conversation

@gemini2026
Copy link
Copy Markdown

@gemini2026 gemini2026 commented Mar 25, 2026

Closes #891

What this does

Adds a tamper-evident audit logger under nemoclaw/src/security/. It writes JSONL entries where each entry includes a SHA-256 hash of the previous one — modifying or deleting any entry breaks the chain downstream.

Comes with verifyChain() to validate an entire log file, plus exportEntries() and tailEntries() for querying.

Three new files, no changes to existing NemoClaw code.

How the chain works

Each entry has: seq (monotonic), chain_id (random hex per instance), prev_hash, entry_hash, type, time, data. The logger resumes from an existing file by reading the last entry for continuation state.

verifyChain catches: modified data, broken hash links, sequence gaps (block deletion), and chain_id mismatches (splice attacks).

Test plan

  • 38 tests passing (npx vitest run src/security/audit-chain.test.ts)
  • tsc --noEmit clean
  • All pre-commit hooks pass
  • Tamper scenarios: modified data, broken prev_hash, deleted entries, chain splicing
  • Edge cases: malformed JSONL lines, empty files, single-entry chains, 100-entry stress test

Signed-off-by: Charan Jagwani cjagwani@nvidia.com

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a tamper‑evident append‑only JSONL audit chain: new AuditLogger that resumes state, SHA‑256 chained entries, verification and query helpers (verifyChain, exportEntries, tailEntries), plus tests and reference documentation.

Changes

Cohort / File(s) Summary
Implementation
nemoclaw/src/security/audit-chain.ts
New module implementing append‑only JSONL audit chain, SHA‑256 entry hashing (entry_hash), chain_id generation, resume semantics, AuditLogger, verifyChain, exportEntries, and tailEntries.
Tests
nemoclaw/src/security/audit-chain.test.ts
Comprehensive Vitest suite covering initialization, chaining, resume behavior, tamper scenarios (modified data, broken prev_hash, sequence gaps, malformed lines, splices), API validation errors, export/tail semantics, and high‑volume writes.
Documentation
docs/reference/audit-chain.md
New reference doc describing entry format, chain rules, API surface, TypeScript interfaces, examples, and operational notes.
sequenceDiagram
    autonumber
    participant Client as Client
    participant Logger as AuditLogger
    participant FS as FileSystem
    participant Verifier as verifyChain
    Client->>Logger: log(type, data)
    Logger->>FS: append JSONL entry (seq, chain_id, prev_hash, entry_hash, ...)
    FS-->>Logger: write ack
    Client->>Verifier: verifyChain(path)
    Verifier->>FS: read JSONL lines
    FS-->>Verifier: return lines
    Verifier->>Verifier: parse lines, validate seq, chain_id, prev_hash, recompute entry_hash
    Verifier-->>Client: VerifyResult (valid / error, entries)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop through lines of JSON bright,
I bind each hop with SHA‑256 light,
A tiny chain that never lies,
I guard the tale with watchful eyes,
Append, resume—truth held tight.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(security): add tamper-evident audit chain logger' directly and clearly describes the primary change—a new security module for audit logging with tamper detection.
Linked Issues check ✅ Passed The pull request fully implements all coding requirements from issue #891: hash-chained JSONL audit logger with AuditLogger class, verifyChain/exportEntries/tailEntries functions, proper entry structure, append-only persistence, resume on startup, comprehensive Vitest tests, and documentation.
Out of Scope Changes check ✅ Passed All changes are scoped to the three new files specified in issue #891 (audit-chain.ts, audit-chain.test.ts, audit-chain.md); no modifications to existing NemoClaw code or unrelated areas.
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (7)
docs/reference/audit-chain.md (3)

39-40: One sentence per line.

Lines 39–40 contain two sentences on the same line.
Split them for cleaner diffs.

The `chain_id` is a random 24-character hex string generated when the logger creates a new file.
It remains constant across all entries in a single chain and persists when the logger resumes from an existing file.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/audit-chain.md` around lines 39 - 40, The line describing
`chain_id` currently contains two sentences on the same line—split it into two
separate lines so each sentence is its own line: one line stating "The
`chain_id` is a random 24-character hex string generated when the logger creates
a new file." and the next line stating "It remains constant across all entries
in a single chain and persists when the logger resumes from an existing file."
so diffs are clean.

85-86: One sentence per line.

Multiple sentences appear on a single line here.
Split each sentence onto its own line for readable diffs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/audit-chain.md` around lines 85 - 86, The documentation line
describing the constructor currently contains multiple sentences on one line;
split each sentence into its own line so diffs are readable—e.g., separate "The
constructor creates parent directories if they do not exist." and "If the file
already contains entries, the logger resumes the chain from the last entry."
into two lines under the constructor documentation.

113-114: One sentence per line.

Lines 113–114 and 119–120 each have two sentences on the same line.
Split them for cleaner diffs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/audit-chain.md` around lines 113 - 114, Split the two
combined-sentence lines so each sentence is on its own line: change the line
that currently contains "When `limit` is omitted or zero, all matching entries
are returned. Malformed lines are silently skipped." into two lines ("When
`limit` is omitted or zero, all matching entries are returned." and "Malformed
lines are silently skipped.") and do the same for the other occurrence around
lines 119–120; ensure each sentence is a separate line to keep diffs clean.
nemoclaw/src/security/audit-chain.test.ts (1)

558-578: Test name "concurrent writes" is misleading.

This test performs rapid sequential writes from a single thread, not concurrent writes from multiple threads.
Since appendFileSync is synchronous, the writes are serialized.
Consider renaming to "rapid sequential writes" or similar to accurately reflect what's being tested.

True concurrency testing would require multiple worker threads writing simultaneously.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/security/audit-chain.test.ts` around lines 558 - 578, The test
title "concurrent writes" is misleading because it performs serialized writes
using AuditLogger (which uses appendFileSync); rename the describe or it block
to accurately reflect the behavior (e.g., change describe("concurrent writes")
or it("handles rapid sequential writes without data loss") to "rapid sequential
writes" or similar) so the test name matches the implementation, referencing the
AuditLogger, appendFileSync, readEntries, and verifyChain symbols to locate the
test to update.
nemoclaw/src/security/audit-chain.ts (2)

346-350: Consider a structured logging approach instead of console.error.

Using console.error during readTailState can produce unexpected output in production or library consumers.
Options:

  1. Return a diagnostic in the TailState result for the caller to handle.
  2. Accept an optional logger callback.
  3. Silently skip (current behavior minus the log).

Given this is a security module where malformed lines could indicate tampering, silent skipping may be appropriate since verifyChain handles detection separately.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/security/audit-chain.ts` around lines 346 - 350, The
console.error in readTailState should be removed and replaced with a structured
diagnostic approach: update the readTailState signature to either accept an
optional logger callback (e.g., logger?: (level: string, msg: string, meta?:
any) => void) or add a diagnostic field on the returned TailState (e.g.,
diagnostics: Array<{line: string, error: string}>), then use that mechanism
instead of console.error when skipping a malformed line; ensure callers (or
verifyChain) can inspect diagnostics if needed and that default behavior remains
silent if no logger is provided.

110-113: No file locking for multi-process writes.

appendFileSync provides atomicity at the OS level for small writes, but concurrent appends from multiple processes can interleave or corrupt entries on some platforms.

If multi-process logging is a future requirement, consider:

  • Advisory file locking (flock equivalent)
  • A separate log aggregator process
  • A database-backed alternative

For single-process use as currently tested, this is fine.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/security/audit-chain.ts` around lines 110 - 113, The current
append in AuditChain.write (appendFileSync(this.path, ...)) can interleave or
corrupt when multiple processes write; replace the direct append with a
multi-process-safe approach such as acquiring an advisory lock before writing
(e.g., flock via a native binding or fs-ext's flockSync), or implement a
dedicated log-writer/aggregator process or DB-backed store; ensure the code
acquires the lock on this.path (or on a separate .lock file), performs the
append of JSON.stringify(entry)+"\n", updates this.seq and this.prevHash only
after a successful write, and always releases the lock even on error.
docs/reference/injection-scanner.md (1)

23-25: One sentence per line.

Lines 23-24 contain multiple sentences on single lines. Per coding guidelines, each sentence should be on its own line to make diffs readable.

 The injection scanner detects prompt injection patterns in agent tool inputs and outputs.
-It applies text normalization and pattern matching to identify attempts to override system prompts, inject instructions, manipulate tools, or exfiltrate data.
+It applies text normalization and pattern matching to identify attempts to override system prompts, inject instructions, manipulate tools, or exfiltrate data.

(The second sentence is already on its own line, but verify the source has one sentence per line throughout.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/injection-scanner.md` around lines 23 - 25, The two sentences
in the README paragraph should be split so each sentence is on its own line:
replace the single line containing "The injection scanner detects prompt
injection patterns in agent tool inputs and outputs. It applies text
normalization and pattern matching to identify attempts to override system
prompts, inject instructions, manipulate tools, or exfiltrate data." with two
lines — one containing "The injection scanner detects prompt injection patterns
in agent tool inputs and outputs." and a second containing "It applies text
normalization and pattern matching to identify attempts to override system
prompts, inject instructions, manipulate tools, or exfiltrate data."; also scan
the rest of the document to ensure every sentence follows the
one-sentence-per-line guideline.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/reference/injection-scanner.md`:
- Around line 113-115: The documentation for maxSeverity incorrectly states its
return type as `Severity | ""`; update the doc to match the implementation and
tests by changing the return type to `Severity | null` and the description to
say it returns null for an empty findings array; reference the maxSeverity
function and the existing test that asserts null to ensure consistency.

---

Nitpick comments:
In `@docs/reference/audit-chain.md`:
- Around line 39-40: The line describing `chain_id` currently contains two
sentences on the same line—split it into two separate lines so each sentence is
its own line: one line stating "The `chain_id` is a random 24-character hex
string generated when the logger creates a new file." and the next line stating
"It remains constant across all entries in a single chain and persists when the
logger resumes from an existing file." so diffs are clean.
- Around line 85-86: The documentation line describing the constructor currently
contains multiple sentences on one line; split each sentence into its own line
so diffs are readable—e.g., separate "The constructor creates parent directories
if they do not exist." and "If the file already contains entries, the logger
resumes the chain from the last entry." into two lines under the constructor
documentation.
- Around line 113-114: Split the two combined-sentence lines so each sentence is
on its own line: change the line that currently contains "When `limit` is
omitted or zero, all matching entries are returned. Malformed lines are silently
skipped." into two lines ("When `limit` is omitted or zero, all matching entries
are returned." and "Malformed lines are silently skipped.") and do the same for
the other occurrence around lines 119–120; ensure each sentence is a separate
line to keep diffs clean.

In `@docs/reference/injection-scanner.md`:
- Around line 23-25: The two sentences in the README paragraph should be split
so each sentence is on its own line: replace the single line containing "The
injection scanner detects prompt injection patterns in agent tool inputs and
outputs. It applies text normalization and pattern matching to identify attempts
to override system prompts, inject instructions, manipulate tools, or exfiltrate
data." with two lines — one containing "The injection scanner detects prompt
injection patterns in agent tool inputs and outputs." and a second containing
"It applies text normalization and pattern matching to identify attempts to
override system prompts, inject instructions, manipulate tools, or exfiltrate
data."; also scan the rest of the document to ensure every sentence follows the
one-sentence-per-line guideline.

In `@nemoclaw/src/security/audit-chain.test.ts`:
- Around line 558-578: The test title "concurrent writes" is misleading because
it performs serialized writes using AuditLogger (which uses appendFileSync);
rename the describe or it block to accurately reflect the behavior (e.g., change
describe("concurrent writes") or it("handles rapid sequential writes without
data loss") to "rapid sequential writes" or similar) so the test name matches
the implementation, referencing the AuditLogger, appendFileSync, readEntries,
and verifyChain symbols to locate the test to update.

In `@nemoclaw/src/security/audit-chain.ts`:
- Around line 346-350: The console.error in readTailState should be removed and
replaced with a structured diagnostic approach: update the readTailState
signature to either accept an optional logger callback (e.g., logger?: (level:
string, msg: string, meta?: any) => void) or add a diagnostic field on the
returned TailState (e.g., diagnostics: Array<{line: string, error: string}>),
then use that mechanism instead of console.error when skipping a malformed line;
ensure callers (or verifyChain) can inspect diagnostics if needed and that
default behavior remains silent if no logger is provided.
- Around line 110-113: The current append in AuditChain.write
(appendFileSync(this.path, ...)) can interleave or corrupt when multiple
processes write; replace the direct append with a multi-process-safe approach
such as acquiring an advisory lock before writing (e.g., flock via a native
binding or fs-ext's flockSync), or implement a dedicated log-writer/aggregator
process or DB-backed store; ensure the code acquires the lock on this.path (or
on a separate .lock file), performs the append of JSON.stringify(entry)+"\n",
updates this.seq and this.prevHash only after a successful write, and always
releases the lock even on error.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a1657469-f7fc-48b6-b329-fb8a4cbfad82

📥 Commits

Reviewing files that changed from the base of the PR and between b2164e7 and d20cb17.

⛔ Files ignored due to path filters (2)
  • nemoclaw/package-lock.json is excluded by !**/package-lock.json
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (10)
  • .github/PULL_REQUEST_TEMPLATE.md
  • .github/workflows/pr.yaml
  • Dockerfile
  • docs/reference/audit-chain.md
  • docs/reference/injection-scanner.md
  • install.sh
  • nemoclaw/src/security/audit-chain.test.ts
  • nemoclaw/src/security/audit-chain.ts
  • nemoclaw/src/security/injection-scanner.test.ts
  • nemoclaw/src/security/injection-scanner.ts
💤 Files with no reviewable changes (1)
  • .github/workflows/pr.yaml

Comment on lines +113 to +115
### `maxSeverity(findings: Finding[]): Severity | ""`

The highest severity level present in the findings array, or an empty string if the array is empty.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Documentation-code mismatch: maxSeverity returns null, not "".

The documented return type Severity | "" does not match the implementation, which returns Severity | null. The test at injection-scanner.test.ts:623-625 confirms maxSeverity([]) returns null via .toBeNull(). Callers following this documentation would incorrectly check for an empty string instead of null.

📝 Proposed fix
-### `maxSeverity(findings: Finding[]): Severity | ""`
+### `maxSeverity(findings: Finding[]): Severity | null`

-The highest severity level present in the findings array, or an empty string if the array is empty.
+The highest severity level present in the findings array, or `null` if the array is empty.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
### `maxSeverity(findings: Finding[]): Severity | ""`
The highest severity level present in the findings array, or an empty string if the array is empty.
### `maxSeverity(findings: Finding[]): Severity | null`
The highest severity level present in the findings array, or `null` if the array is empty.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/injection-scanner.md` around lines 113 - 115, The
documentation for maxSeverity incorrectly states its return type as `Severity |
""`; update the doc to match the implementation and tests by changing the return
type to `Severity | null` and the description to say it returns null for an
empty findings array; reference the maxSeverity function and the existing test
that asserts null to ensure consistency.

@wscurran wscurran added security Something isn't secure priority: high Important issue that should be resolved in the next release enhancement: feature Use this label to identify requests for new capabilities in NemoClaw. labels Mar 31, 2026
@wscurran
Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this PR with a detailed summary, it proposes a new security feature to improve audit logging and prevent tampering.

@gemini2026
Copy link
Copy Markdown
Author

gemini2026 commented Mar 31, 2026

This pairs with the injection scanner (#870) and session tracker (#965) — together they add application-layer security on top of the container isolation.

Also fixed the H1 to match the NemoClaw naming convention.

gemini2026 added a commit to gemini2026/NemoClaw that referenced this pull request Apr 2, 2026
- Remove dead code: `classifyRisk()` count > 2 branch can never be
  reached because trifecta is checked first and there are exactly
  three capability classes — simplify to a ternary
- Add `onTrifecta` callback to `SessionStore` constructor so callers
  can log a warning, emit a metric, or terminate the session when
  risk escalates; callback fires once per session on first trifecta
- Add `clear()` method to release all tracked state; prevents
  unbounded memory growth in long-running processes
- Consolidate duplicate event-cap / boundary-condition test suites
  into a single test covering all boundary behaviors
- Add tests for `onTrifecta` (fires once, per session, not partial)
  and `clear()` (resets state, allows new sessions)
- Fix passive voice in docs: "are dropped" → "The tracker drops",
  "are silently ignored" → "The method silently ignores",
  "are not deduplicated" → "The tracker does not deduplicate"
- Replace colon with em dash in exfiltration chain sentence
- Add justification for 100-event cap (~10 KB per session)
- Document in-memory-only limitation explicitly
- Annotate Next Steps cross-refs with pending PR numbers (NVIDIA#870/NVIDIA#892)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gemini2026 added a commit to gemini2026/NemoClaw that referenced this pull request Apr 2, 2026
- Wrap onTrifecta callback in try/catch so a broken handler never
  disrupts recording — the store state is already mutated by the time
  the callback fires, so propagating would leave callers in an
  inconsistent state
- Replace {doc} cross-references with plain-text file paths to avoid
  Sphinx build errors while injection-scanner (NVIDIA#870) and audit-chain
  (NVIDIA#892) pages are still pending
- Add "NemoClaw" prefix to title.page and H1 to match the naming
  convention used by other reference pages (NemoClaw Architecture,
  NemoClaw Network Policies, etc.)
- Add test: getExposure returns null after clear() — catches partial-
  reset bugs that wipe capabilities but not the event log
- Add test: onTrifecta fires again after clear + re-record — verifies
  the callback is not permanently suppressed for a session ID
- Add test: throwing onTrifecta callback is swallowed without error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cjagwani cjagwani self-assigned this Apr 10, 2026
@miyoungc
Copy link
Copy Markdown
Contributor

Docs review

The doc content is accurate — every API surface, interface, default, and behavioral claim matches the implementation in audit-chain.ts. Nice work on the thoroughness.

A few style guide items to fix before merge:

1. Frontmatter description needs main/agent structure

The current frontmatter uses a flat description: string, but all existing reference pages and the contributing guide template use the nested format with main and agent fields. The agent field with "Use when..." triggers is what the docs-to-skills pipeline relies on for routing.

Current:

description: "Reference for the tamper-evident audit chain module..."

Should be:

description:
  main: "Reference for the tamper-evident audit chain module that writes SHA-256 hash-chained JSONL entries and provides verify, export, and tail utilities."
  agent: "References for the tamper-evident audit chain module that writes SHA-256 hash-chained JSONL entries and provides verify, export, and tail utilities. Use when looking up audit chain entry format, hash chaining behavior, or the verify/export/tail API."

2. Em dash in title

The style guide says "Do not use em dashes." The title currently has one:

NemoClaw Audit Chain — Tamper-Evident Hash-Chained Audit Log

The existing reference pages use colons (e.g., NemoClaw Architecture: Plugin, Blueprint, and Sandbox Structure). To stay consistent with those:

NemoClaw Audit Chain: Tamper-Evident Hash-Chained Audit Log

Update both title.page and the H1 heading to match.

3. Passive voice (2 occurrences)

The style guide requires active voice. Two lines under exportEntries and tailEntries say:

Malformed lines are silently skipped.

Change to:

The function silently skips malformed lines.

or simply:

Skips malformed lines.

4. MyST cross-reference in generated skill file

.agents/skills/nemoclaw-user-reference/references/audit-chain.md line 159 has:

- See {doc}`/reference/architecture` for how security modules integrate into the NemoClaw plugin.

The {doc} directive is Sphinx syntax that does not render in plain markdown (which is how agents read skill files). The docs-to-skills script should rewrite these as skill-to-skill pointers. If this file was generated by the pre-commit hook and the cross-ref was not converted, that is a script issue. If the file was created manually, it should not have been — the contributing guide says to never edit generated files under .agents/skills/nemoclaw-user-*/.

Either way, the Next Steps line in docs/reference/audit-chain.md is the source. After fixing it there, re-run the skill generation to pick up the corrected output:

python scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user

@gemini2026
Copy link
Copy Markdown
Author

All four fixed in 0db9334:

  1. Frontmatter — split into main/agent fields
  2. Em dash → colon in title.page and H1
  3. Passive voice → active ("Skips malformed lines")
  4. Cross-ref — {doc} → relative link in docs source; re-ran docs-to-skills.py which converted it to a skill pointer in the generated file

Let me know if anything else needs a pass.

@gemini2026
Copy link
Copy Markdown
Author

All currently raised feedback on this branch should be addressed now.

Latest maintainer review is approved and checks are green on the current head.

@brandonpelfrey when you have a moment, could you please take a final pass?

@wscurran
Copy link
Copy Markdown
Contributor

Thanks for the audit chain logger work. The codebase has changed significantly since March 25 — including a full TypeScript migration — so this will need a rebase on origin/main before we can review it. You also have a related PR in #870 (prompt injection scanner) — coordinating the rebase across both would be helpful. Note that #916 covers similar audit logging ground from a different contributor, so we'll need to reconcile the two approaches once both are rebased. Please rebase and resolve any conflicts, and we'll take a look.

Really appreciate the dedication and support for NemoClaw!

@wscurran wscurran added the status: rebase PR needs to be rebased against main before review can continue label Apr 14, 2026
ericksoa and others added 4 commits April 14, 2026 20:03
…VIDIA#1112)

* fix(install): ensure p-retry is available for npm install -g from source

`npm install -g .` from a fresh clone creates a symlink (equivalent to
`npm link`) and does NOT install dependencies. This causes `require('p-retry')`
to fail at runtime since node_modules doesn't exist locally.

- Add dep bootstrap to `prepare` script: installs production deps if missing,
  which runs during `npm link` / `npm install -g .` but is a no-op during
  normal `npm install` (deps already present)
- Add `bundleDependencies` for tarball-based install paths (npm pack/publish)

Fixes regression from NVIDIA#1051.

* fix(install): always install production deps in prepare script

Address review feedback: remove conditional node_modules/p-retry check
and unconditionally run npm install --omit=dev --ignore-scripts. This is
a no-op when deps are already present and ensures they resolve in the
global install case.

---------

Co-authored-by: Carlos Villela <cvillela@nvidia.com>
…ing sandbox build (NVIDIA#419)

* fix: auto-create swap on low-memory VMs to prevent OOM during sandbox build

On DigitalOcean Ubuntu 24.04 droplets (and similar VMs) with 8 GB RAM
and no swap, the sandbox image push during onboarding gets OOM-killed
(exit 137).

Changes:
- Add getMemoryInfo() and ensureSwap() to bin/lib/preflight.js
- Integrate memory/swap check into onboard preflight step (onboard.js)
- Add equivalent swap check to scripts/setup.sh
- Add unit tests for getMemoryInfo and ensureSwap
- Document OOM troubleshooting in docs/reference/troubleshooting.md

The preflight now detects low memory (<6 GB RAM+swap) and auto-creates
a 4 GB swap file via sudo. If swap creation fails, it warns the user
with manual steps instead of hard-failing.

* fix: increase memory threshold for swap creation to 12000MB

* Update bin/lib/onboard.js

prevent misleading "Memory OK" logs when RAM is low but an existing swapfile is safely providing coverage

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Update scripts/setup.sh

align the MIN_TOTAL_MB constant with the 12000MB threshold enforced in the JS preflight checks

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* fix: harden swap creation with size checks, interactive prompt, dd, and proper uninstaller cleanup

* fix: address CodeRabbit feedback for swap cleanup and inactive checks

* fix(preflight): address swap management feedback

* fix(preflight): fix swap cleanup ordering and clean up comments

* fix(preflight): move require('path') and require('os') to top-level imports

* fix: address CodeRabbit feedback on preflight tests and uninstaller verification

* fix(preflight): harden swap check to handle reactivation of orphaned swap files

* fix: add swapfileExists false to the test options

ensures test is deterministic regardless of CI host state after proposed fix from review

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
…n Linux (NVIDIA#821)

* fix(install): configure npm for user-local installs to avoid permission issues on Linux

* fix(install): correct formatting in path addition to user shell configuration files

---------

Co-authored-by: Carlos Villela <cvillela@nvidia.com>
cv and others added 27 commits April 14, 2026 20:03
…IDIA#1707)

## Summary

The gate checker and triage scripts treated "all present checks green"
as passing, even when only 2 of ~9 checks existed. This caused premature
approvals on fork PRs where workflows hadn't been triggered yet.

### Root cause

Fork PRs from first-time contributors need a maintainer to click
"Approve and run" before `pull_request` workflows execute. Until then,
only `pull_request_target` checks (`check-pr-limit`) and external bots
(`CodeRabbit`) appear in `statusCheckRollup`. The scripts saw 2/2 green
and reported CI as passing.

A secondary bug: GitHub's `statusCheckRollup` returns two shapes —
`CheckRun` (`name`/`status`/`conclusion`) and `StatusContext`
(`context`/`state`). The scripts only read CheckRun fields, so
CodeRabbit (a StatusContext) was always treated as "pending" even when
`state` was `SUCCESS`.

### Changes

- **`check-gates.ts`**: Add `REQUIRED_CHECK_NAMES` (`checks`,
`commit-lint`, `dco-check`) validation. Add `StatusCheck` union type to
correctly handle both `CheckRun` and `StatusContext` shapes. CI gate now
fails with `"required check(s) not found — workflows may need approval"`
when expected checks are absent.
- **`triage.ts`**: Add same required-check validation so triage does not
score unapproved-workflow PRs as `review-ready`.
- **`MERGE-GATE.md`**: Add "Missing required checks" as first bullet in
Step 2 interpretation guidance.

### Before / After

| PR scenario | Before | After |
|---|---|---|
| Fork PR, workflows not approved (2 checks) | "All 2 checks green" ✅ |
"3 required check(s) not found — workflows may need approval" ❌ |
| Fork PR, workflows running, dco-check failing | "1 pending"
(CodeRabbit misread) | "3 failing check(s): dco-check: FAILURE, ..." ❌ |
| Internal PR, all 12 checks green | "1 pending" (CodeRabbit misread) |
"All 12 checks green" ✅ |

### Test plan

- [x] Verified against PR NVIDIA#1660 (fork, workflows not approved) —
correctly reports missing checks
- [x] Verified against PR NVIDIA#1663 (fork, workflows approved, dco-check
failing) — correctly reports failures
- [x] Verified against PR NVIDIA#1683 (internal, all green) — correctly
reports all 12 green
- [x] Triage script correctly classifies NVIDIA#1660 as `salvage-now` with
`failing-checks` reason instead of `review-ready`

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Enhanced merge gate to require specific CI checks be present and
completed before a PR can be approved; missing required checks will
block approval until workflows finish and validation is re-run.
* Improved CI evaluation to better distinguish pending vs failed states
across different check types.
  * PRs missing required check contexts are now classified as not-green.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
Fixes incorrect per-sandbox model/provider reporting in `nemoclaw list`
and `nemoclaw status`.

This change persists sandbox inference metadata at registration time and
stops inventory/status output from substituting the gateway's live
inference route for the default sandbox. The result is that sandbox
listings now consistently reflect each sandbox's stored configuration.

## Related Issue
Fixes NVIDIA#1689

## Changes
- Pass `model` and `provider` into `registerSandbox()` during onboarding
so sandbox metadata is persisted immediately.
- Update `listSandboxesCommand()` to always display the stored
per-sandbox `model` and `provider`.
- Update `showStatusCommand()` to always display the stored per-sandbox
`model`.
- Add and update regression tests covering:
  - registration-time persistence of `model`/`provider`
- list/status behavior when live gateway inference differs from stored
sandbox config
  - CLI output behavior for stored sandbox inference metadata

## Type of Change
- [x] Code change for a new feature, bug fix, or refactor.
- [ ] Code change with doc updates.
- [ ] Doc only. Prose changes without code sample modifications.
- [ ] Doc only. Includes code sample changes.

## Testing
- [ ] `npx prek run --all-files` passes (or equivalently `make check`).
- [ ] `npm test` passes.
- [ ] `make docs` builds without warnings. (for doc-only changes)

Targeted verification completed:
- Reproduced the issue from the reported behavior and confirmed two
contributing problems:
- sandbox registration could leave `model`/`provider` unset in the
registry
- inventory/status preferred live gateway inference for the default
sandbox
- Added regression tests first and demonstrated they fail against the
pre-fix behavior:
- `npx vitest run src/lib/inventory-commands.test.ts
test/registry.test.js`
- Applied the fix and confirmed the updated regression tests pass:
- `npx vitest run src/lib/inventory-commands.test.ts
test/registry.test.js test/cli.test.js`

Known unrelated local blockers while running full hooks in this
checkout:
- ESLint CLI hook fails because `@typescript-eslint/parser` is missing
locally.
- Existing `test/skills-frontmatter.test.js` failures are unrelated to
this change.

## Checklist

### General

- [x] I have read and followed the [contributing
guide](https://github.com/NVIDIA/NemoClaw/blob/main/CONTRIBUTING.md).
- [ ] I have read and followed the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md).
(for doc-only changes)

### Code Changes
- [x] Formatters applied — `npx prek run --all-files` auto-fixes
formatting (or `make format` for targeted runs).
- [x] Tests added or updated for new or changed behavior.
- [x] No secrets, API keys, or credentials committed.
- [ ] Doc pages updated for any user-facing behavior changes (new
commands, changed defaults, new features, bug fixes that contradict
existing docs).

### Doc Changes
- [ ] Follows the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md).
Try running the `nemoclaw-contributor-update-docs` agent skill to draft
changes while complying with the style guide. For example, prompt your
agent with "`/nemoclaw-contributor-update-docs` catch up the docs for
the new changes I made in this PR."
- [ ] New pages include SPDX license header and frontmatter, if creating
a new page.
- [ ] Cross-references and links verified.

---
Signed-off-by: Brandon Pelfrey <bpelfrey@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Bug Fixes
* Sandbox model and provider settings now consistently display their
configured values instead of being overridden by live gateway values,
ensuring more predictable and reliable inference configuration display
across list and status commands.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
…DIA#1583) (NVIDIA#1660)

## Summary
- Move the hardcoded `github` network_policies entry out of the base
sandbox policy and into a new `presets/github.yaml` so GitHub access
(github.com, api.github.com, `git`, `gh`) is only granted when the user
explicitly opts in during `nemoclaw onboard`.
- Add regression tests asserting (a) the base policy never re-declares a
`github` entry or references github.com / api.github.com, and (b) the
new preset file exists and parses.

Fixes NVIDIA#1583.

## Why
The base policy hardcoded `network_policies.github` with `access: full`
plus the `git` and `gh` binaries. Because the entry lived in the base
policy, every sandbox received unscoped GitHub access regardless of
which policy presets the user picked. The `Available policy presets`
list during onboard never showed a `github` option, so the permission
was neither user-discoverable nor user-controllable — agents could `git
clone`, `git push`, and call the GitHub API on behalf of any sandbox
without consent. This violates least-privilege and is exactly the kind
of silent capability creep the preset system exists to prevent.

## What changed
- `nemoclaw-blueprint/policies/openclaw-sandbox.yaml` — remove the
`github` entry, leave a comment pointing at the new preset and NVIDIA#1583.
- `nemoclaw-blueprint/policies/presets/github.yaml` — new file. Same
shape as `presets/brew.yaml`. Preserves the original `access: full`
semantics so users who opt in keep the existing capability set.
- `test/validate-blueprint.test.ts` — two new regression tests under
`NVIDIA#1583`.

## Test plan
- [x] `npx vitest run test/validate-blueprint.test.ts` → 29/29 passing.
- [x] Negative-case verification: stash the base-policy edit and re-run
— the new regression test fails as expected, then passes again after
pop.
- [ ] Maintainer should sanity-check that no internal flow inside the
sandbox depends on `git`/`gh` being available out of the box. If
something does (e.g., a skill that clones from GitHub), it should either
declare the github preset as a dependency or be reworked to use the
gateway.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added an optional GitHub preset to enable GitHub.com and GitHub API
access (including git/gh) when selected.

* **Refactor**
* Removed GitHub network/binary access from default sandbox; access now
enabled only via the optional preset.

* **Tests**
* Added tests ensuring GitHub access is absent by default and the GitHub
preset is correctly configured and listed.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Colin McDonough <cmcdonough@50words.com>

---------

Co-authored-by: Carlos Villela <cvillela@nvidia.com>
## Summary

- Adds `--dangerously-skip-permissions` flag to `nemoclaw onboard` and
`nemoclaw <name> connect` that applies a maximally permissive sandbox
policy
- Creates `openclaw-sandbox-permissive.yaml` with `include_workdir:
true` and all known endpoints using `access: full` (CONNECT tunnel, no
L7 filtering)
- When used on onboard: uses the permissive base policy and skips preset
selection entirely
- When used on connect: applies the permissive policy to a running
sandbox on the fly
- Also accepts `NEMOCLAW_DANGEROUSLY_SKIP_PERMISSIONS=1` env var for
CI/CD

## Test plan

- [ ] `nemoclaw onboard --dangerously-skip-permissions
--non-interactive` — verify warning printed, permissive policy applied,
presets skipped
- [ ] `openshell policy get --full <sandbox>` — confirm permissive YAML
active
- [ ] `nemoclaw <sandbox> connect --dangerously-skip-permissions` —
verify policy updated on running sandbox
- [ ] `nemoclaw <sandbox> status` — verify "dangerously-skip-permissions
(open)" shown
- [ ] `make check` and `npm test` pass (no new failures beyond
pre-existing main issues)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a permissive sandbox mode and a new sandbox policy file enabling
broader filesystem and network access.
* Added a CLI flag to enable the permissive mode for onboarding and
connecting; prints a prominent security warning when used.
* Registry now records when a sandbox runs in permissive mode so status
output reflects this.

* **Tests**
  * Updated onboarding tests to account for the permissive-mode option.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…IDIA#1709)

## Summary
- Add `onboarding_step.json` to the Brev CLI setup in the E2E workflow
- Without it, `brev ls` triggers an interactive onboarding tutorial that
hangs on the CI runner waiting for stdin

## Root Cause
Follow-up to NVIDIA#1639. The credentials file fix restored auth, but `brev
ls` still hangs because the Brev CLI's first-run onboarding wizard
blocks on stdin. This file was also removed in 374a847 (NVIDIA#1470).

## Evidence
- Stuck run:
[24213469934](https://github.com/NVIDIA/NemoClaw/actions/runs/24213469934/job/70688110743)
— "Install Brev CLI" step hangs indefinitely
- Local test with clean HOME + both files: `brev ls` returns immediately

## Test plan
- [ ] Merge to main, trigger `e2e-brev` with `TEST_SUITE=full` — confirm
`brev ls` passes and tests actually run

Fixes NVIDIA#1638

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Chores**
* Improved CI/workflow configuration to prevent onboarding prompts from
interfering with automated authentication checks.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
<!-- markdownlint-disable MD041 -->
## Summary
<!-- 1-3 sentences: what this PR does and why. -->
Argumentizing the OpenClaw version in the base image build so it can be
overridden at build time via `--build-arg` or workflow dispatch, while
keeping the default version. This is the first step toward an OpenClaw
version integration pipeline that validates how different OpenClaw
releases adapt to NemoClaw.

## Related Issue
<!-- Link to the issue: Fixes #NNN or Closes #NNN. Remove this section
if none. -->
Closes NVIDIA#1525

## Changes
<!-- Bullet list of key changes. -->
- **`Dockerfile.base`**: Replace the hardcoded `openclaw@2026.3.11`
install with an `ARG OPENCLAW_VERSION` (default `2026.3.11`). At build
time, the provided version is validated against:
1. The `min_openclaw_version` declared in
`nemoclaw-blueprint/blueprint.yaml`.
2. The npm registry, to confirm the version actually exists before
attempting install.
- **`.github/workflows/base-image.yaml`**: Add an `openclaw_version`
input to `workflow_dispatch` and forward it as a `build-args` entry to
the Docker build step, falling back to `2026.3.11` when left blank.

## Type of Change
<!-- Check the one that applies. -->
- [x] Code change for a new feature, bug fix, or refactor.
- [ ] Code change with doc updates.
- [ ] Doc only. Prose changes without code sample modifications.
- [ ] Doc only. Includes code sample changes.

## Testing
<!-- What testing was done? -->
- [ ] `npx prek run --all-files` passes (or equivalently `make check`).
- [ ] `npm test` passes.
- [ ] `make check` passes.

## Checklist

### General

- [ ] I have read and followed the [contributing
guide](https://github.com/NVIDIA/NemoClaw/blob/main/CONTRIBUTING.md).
- [ ] I have read and followed the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md).
(for doc-only changes)

### Code Changes
<!-- Skip if this is a doc-only PR. -->
- [ ] Formatters applied — `npx prek run --all-files` auto-fixes
formatting (or `make format` for targeted runs).
- [ ] Tests added or updated for new or changed behavior.
- [ ] No secrets, API keys, or credentials committed.
- [ ] Doc pages updated for any user-facing behavior changes (new
commands, changed defaults, new features, bug fixes that contradict
existing docs).

---
<!-- DCO sign-off (required by CI). Replace with your real name and
email. -->
Signed-off-by: Hung Le <hple@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Allow manual override of the OpenClaw version when running the
workflow.

* **Improvements**
* Validate provided OpenClaw versions for correct format, ensure they
meet blueprint minimums, and confirm availability before use.
* Only apply an explicit OpenClaw version to builds when one is
supplied.

* **Tests**
* Updated tests to derive and verify fallback OpenClaw version
dynamically.

* **Chores**
  * Improved version detection infrastructure.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

- Allow model and provider changes without rebuilding the sandbox image
- The entrypoint patches `openclaw.json` at startup when
`NEMOCLAW_MODEL_OVERRIDE` is set, then recomputes the config hash so
integrity checks still pass
- Same pattern as `NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` (PR NVIDIA#1620)

### New env vars

| Env var | Purpose | When needed |
|---------|---------|-------------|
| `NEMOCLAW_MODEL_OVERRIDE` | Override `agents.defaults.model.primary`
and provider model name | Any model switch |
| `NEMOCLAW_INFERENCE_API_OVERRIDE` | Override inference API type
(`openai-completions` or `anthropic-messages`) | Cross-provider switches
only |

### Usage example (NVIDIA → Anthropic)

```bash
# On host: configure gateway route
openshell inference set --provider anthropic-prod --model claude-sonnet-4-6 --no-verify

# Set env vars for the sandbox (via openshell or Docker)
export NEMOCLAW_MODEL_OVERRIDE="anthropic/claude-sonnet-4-6"
export NEMOCLAW_INFERENCE_API_OVERRIDE="anthropic-messages"

# Restart sandbox — no image rebuild needed
```

### Security

- Env vars come from the host (Docker/OpenShell), not from inside the
sandbox
- Config integrity is verified first (detects build-time tampering),
then override is applied
- Config hash is recomputed after patching
- Landlock locks the file after this function runs
- Agent cannot set these env vars

## Related Issue

Closes NVIDIA#759

## Test plan

- [ ] `npm test` passes (39 tests in nemoclaw-start.test.js)
- [ ] Set `NEMOCLAW_MODEL_OVERRIDE` → sandbox starts with overridden
model
- [ ] Unset env var → sandbox starts with original baked model (no
regression)
- [ ] Set invalid model → sandbox starts but inference fails (expected)
- [ ] Config hash passes integrity check on restart after override

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Env vars to override the active model and (optionally) the inference
API at container startup; when used in privileged startup, the runtime
config is updated and its integrity hash recomputed so startup
verification aligns.

* **Runtime safeguards**
* Input validation, API allowlist, symlink protections, applies only in
privileged mode, and no-op behavior when unset.

* **Tests**
* New unit and end-to-end tests covering override behavior, timing, hash
recomputation, validation, and noop cases.

* **Documentation**
* Guidance added for cross-provider switching and the runtime override
workflow.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
…DIA#1631)

## Summary
- Skip the `beforeAll` onboard when `TEST_SUITE=full` — the full test
immediately destroys the sandbox and rebuilds from scratch via
`install.sh --non-interactive`
- Other suites (`credential-sanitization`, `telegram-injection`, `all`)
are unaffected — they still get the `beforeAll` sandbox
- Expected: `TEST_SUITE=full` drops from ~19 min to ~13-14 min (~5 min
saved)

## Evidence

From [run
23957694836](https://github.com/NVIDIA/NemoClaw/actions/runs/23957694836/job/69879936380):

| Phase | Duration | Notes |
|-------|----------|-------|
| beforeAll onboard | 358s (~6 min) | Builds sandbox image, exports 1.1
GB, uploads to gateway |
| test-full-e2e.sh Phase 0 | ~14s | **Immediately destroys that
sandbox** |
| test-full-e2e.sh Phase 2 | ~291s | Runs `install.sh --non-interactive`
which creates its own |

The beforeAll onboard is redundant for the `full` suite.

## Test plan
- [ ] Run `TEST_SUITE=full` E2E — confirm all 17 checks still PASS
- [ ] Run `TEST_SUITE=credential-sanitization` E2E — confirm beforeAll
onboard still runs
- [ ] Compare total duration to baseline (~19 min → ~13-14 min expected)

Fixes NVIDIA#1629

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Added a Vitest end-to-end suite for provisioning Brev VMs, running
remote test scripts, and validating sandbox/workflow behaviors across
multiple scenarios.
* Suite gates execution on required environment variables and local CLI
authentication, handles automated setup/onboarding, readiness checks,
and conditional cleanup (optional keep-alive).
  * Validates test outputs for PASS/FAIL expectations.
  * Updated regression guard to reference the new test file format.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
…$PATH leakage (NVIDIA#1706)

The "node missing" / "npm missing" runtime preflight tests need a PATH
where the host's real `node` and `npm` are NOT visible, so the error
branches are actually exercised. The previous `TEST_SYSTEM_PATH =
"/usr/bin:/bin"` literal leaks `/usr/bin/node` on any Linux distribution
that installs Node via `apt install nodejs` (i.e. most of them). On
those hosts the affected tests assert the wrong code path — they expect
"node missing" but the preflight finds the system `/usr/bin/node` and
reports a version mismatch instead.

This replaces the literal with `buildIsolatedSystemPath()`, a small
helper that creates a tmpdir under `os.tmpdir()` at module load and
symlinks every entry from `/usr/bin` and `/bin` into it — except `node`,
`npm`, and `npx`, which are deliberately excluded.

Rebased onto the renamed .ts file from the TS migration stack.

Closes NVIDIA#1621 (sub-item 2 of three).

<!-- markdownlint-disable MD041 -->
## Summary
<!-- 1-3 sentences: what this PR does and why. -->

## Related Issue
<!-- Link to the issue: Fixes #NNN or Closes #NNN. Remove this section
if none. -->

## Changes
<!-- Bullet list of key changes. -->

## Type of Change
<!-- Check the one that applies. -->
- [ ] Code change for a new feature, bug fix, or refactor.
- [ ] Code change with doc updates.
- [ ] Doc only. Prose changes without code sample modifications.
- [ ] Doc only. Includes code sample changes.

## Testing
<!-- What testing was done? -->
- [ ] `npx prek run --all-files` passes (or equivalently `make check`).
- [ ] `npm test` passes.
- [ ] `make docs` builds without warnings. (for doc-only changes)

## Checklist

### General

- [ ] I have read and followed the [contributing
guide](https://github.com/NVIDIA/NemoClaw/blob/main/CONTRIBUTING.md).
- [ ] I have read and followed the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md).
(for doc-only changes)

### Code Changes
<!-- Skip if this is a doc-only PR. -->
- [ ] Formatters applied — `npx prek run --all-files` auto-fixes
formatting (or `make format` for targeted runs).
- [ ] Tests added or updated for new or changed behavior.
- [ ] No secrets, API keys, or credentials committed.
- [ ] Doc pages updated for any user-facing behavior changes (new
commands, changed defaults, new features, bug fixes that contradict
existing docs).

### Doc Changes
<!-- Skip if this PR has no doc changes. -->
- [ ] Follows the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md).
Try running the `nemoclaw-contributor-update-docs` agent skill to draft
changes while complying with the style guide. For example, prompt your
agent with "`/nemoclaw-contributor-update-docs` catch up the docs for
the new changes I made in this PR."
- [ ] New pages include SPDX license header and frontmatter, if creating
a new page.
- [ ] Cross-references and links verified.




<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Improved test isolation for preflight/runtime checks by using an
isolated system PATH to avoid interference from host-provided tools.
* Made test setup more robust when preparing the isolated PATH by
handling filesystem errors during population.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: TSavo <TSavo@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Brandon Pelfrey <bpelfrey@nvidia.com>
…VIDIA#1649)

<!-- markdownlint-disable MD041 -->
## Summary

The sandbox base image (`ghcr.io/nvidia/nemoclaw/sandbox-base`) is
missing the `gnupg` package — `gpg --list-keys` (and any other gpg
invocation) fails with `bash: gpg: command not found` inside the
sandbox. This adds a single pinned `gnupg=2.2.40-1.1+deb12u2` line to
the existing `apt-get install` block in `Dockerfile.base`, restoring the
binary that the rest of the codebase already assumes is present.

## Related Issue

Closes NVIDIA#1640.

## Changes

`Dockerfile.base`: add `gnupg=2.2.40-1.1+deb12u2` to the existing
`apt-get install` block, slotted right after `git`. Same
`--no-install-recommends`, same cleanup tail, same `=<version>` pinning
style as every other package in the block.

```diff
        curl=7.88.1-10+deb12u14 \
        git=1:2.39.5-0+deb12u3 \
+       gnupg=2.2.40-1.1+deb12u2 \
        ca-certificates=20230311+deb12u1 \
```

The pinned version is the bookworm-stable `2.2.40-1.1+deb12u2`, verified
by `apt-cache madison gnupg` against the exact base image SHA
`node:22-slim@sha256:4f77a690...`. The package brings in `dirmngr`,
`gpg-wks-server`, and `gpg-wks-client` as dependencies. Total layer cost
~3 MB compressed.

Diff: **+1 / 0** in 1 file.

### Why this is the right fix (and not "lower the env var" or "remove
the test")

The fix isn't obvious unless you trace where `GNUPGHOME` came from.
Walking that chain:

1. **PR NVIDIA#1121** (`fix(sandbox): restrict /sandbox to read-only via
Landlock (NVIDIA#804)`, authored by @prekshivyas, merged 2026-04-08) made the
`/sandbox` home directory Landlock-read-only to prevent agents from
modifying their own runtime environment.
2. To keep tools that normally write under `~/...` working (gpg, git
config, python history, npm prefix, etc.), that PR redirected each
tool's homedir to a writable `/tmp/...` path via env vars in
`scripts/nemoclaw-start.sh`. The relevant line is at
`scripts/nemoclaw-start.sh:53`:
   ```sh
   'GNUPGHOME=/tmp/.gnupg'
   ```
alongside `HISTFILE=/tmp/.bash_history`,
`GIT_CONFIG_GLOBAL=/tmp/.gitconfig`, `PYTHONUSERBASE=/tmp/.local`, etc.
3. PR NVIDIA#1121 also added three matching assertions in
`test/service-env.test.js` (lines 177, 191, 347) verifying that the
redirect is set:
   ```js
   expect(src).toContain("GNUPGHOME=/tmp/.gnupg");
   ```
4. **What PR NVIDIA#1121 didn't do**: add `gnupg` to the `apt-get install`
list in `Dockerfile.base`. The env var setup landed and the test
assertions landed, but the install line was missed.
5. CI never noticed because `service-env.test.js` only asserts that the
env var is *set* in the source — it never spawns a subprocess that
actually runs `gpg`. So a working test suite + a missing binary coexist
silently. The QA report (this issue, NVIDIA#1640) catches it as a runtime
failure on DGX Spark aarch64 because their test step does invoke `gpg
--list-keys`.

The clear intent of NVIDIA#1121 was to **enable** gpg under a redirected
`GNUPGHOME` — you wouldn't redirect the homedir if you wanted gpg
blocked. This PR is the matching install line that NVIDIA#1121 should have
included, closing a one-line oversight rather than adding new capability
or rolling anything back.

### Why not just remove the GNUPGHOME redirect

The env var redirect from NVIDIA#1121 is doing real work — without it, any
future `apt-get install gnupg` would still leave gpg unable to write to
its homedir under Landlock-read-only `/sandbox`. The redirect is the
"right" half of the pair; the install is the missing left half.

### Why this isn't a security regression

The sandbox runs LLM-driven agents and gpg is a credential-handling
tool, so it's worth justifying explicitly:

- The redirected `GNUPGHOME=/tmp/.gnupg` is **fresh and empty** per
session — no preloaded keys.
- Without keys, gpg can hash/check signatures of public material but
cannot decrypt or sign anything.
- An agent would have to first import a key (which requires the user to
provide it — keys are not pulled from anywhere automatically) before gpg
becomes capable of any sensitive operation.
- This is the same threat model as `git` and `curl`, which are already
in the image and could equally be used to fetch arbitrary content. gpg
adds no new capability that the existing toolchain doesn't already have.

If the project explicitly *did* want gpg unavailable to agents, the
right fix would be to remove the GNUPGHOME redirect from NVIDIA#1121 *and* the
matching test assertions, not to keep the env wiring while leaving the
binary missing — that's just confusing.

## Type of Change

- [x] Code change for a new feature, bug fix, or refactor.
- [ ] Code change with doc updates.
- [ ] Doc only. Prose changes without code sample modifications.
- [ ] Doc only. Includes code sample changes.

## Testing

Smoke-tested locally by building `Dockerfile.base` with the fix and
running the exact failing command from the bug report:

```sh
$ docker build -f Dockerfile.base -t nemoclaw-base-test:gnupg .
[...]
=> exporting to image  46.7s done

$ docker run --rm nemoclaw-base-test:gnupg gpg --version
gpg (GnuPG) 2.2.40
libgcrypt 1.10.1

$ docker run --rm nemoclaw-base-test:gnupg gpg --list-keys
gpg: directory '/root/.gnupg' created
gpg: keybox '/root/.gnupg/pubring.kbx' created
gpg: /root/.gnupg/trustdb.gpg: trustdb created
(exit 0)

# And with the runtime-redirected GNUPGHOME from nemoclaw-start.sh:
$ docker run --rm -e GNUPGHOME=/tmp/.gnupg nemoclaw-base-test:gnupg \
    sh -c 'mkdir -p /tmp/.gnupg && chmod 700 /tmp/.gnupg && gpg --list-keys'
gpg: keybox '/tmp/.gnupg/pubring.kbx' created
(exit 0)
```

Both the default `~/.gnupg` and the runtime-redirected `/tmp/.gnupg`
(matching what `nemoclaw-start.sh` exports) work as expected. The exact
`gpg --list-keys` failure from the bug report no longer reproduces.

- [x] `hadolint Dockerfile.base` — clean (no warnings)
- [x] `docker build -f Dockerfile.base` — succeeds, exports to image
cleanly
- [x] `gpg --version` in built image — works (`gpg (GnuPG) 2.2.40`)
- [x] `gpg --list-keys` in built image — works (was `bash: gpg: command
not found` before this PR)
- [x] `gpg --list-keys` with `GNUPGHOME=/tmp/.gnupg` — works (matches
the runtime env from `nemoclaw-start.sh`)
- [ ] `npx prek run --all-files` — partial: ran the affected hooks
(commitlint, gitleaks, hadolint) which all pass; did NOT run `test-cli`
against the full local suite because two pre-existing baseline failures
on stock `main` get in the way on a WSL2 dev host (the
`shouldPatchCoredns` issue addressed by PR NVIDIA#1626 (merged) and the
install-preflight PATH leakage addressed by PR NVIDIA#1628 (open)). Upstream
CI runs on Linux GHA runners and doesn't hit either of those, so it'll
exercise the full suite normally.
- [ ] `npm test` — same caveat as above, ran the relevant projects in
isolation
- [ ] `make docs` builds without warnings. (for doc-only changes — N/A)

## Checklist

### General

- [x] I have read and followed the [contributing
guide](https://github.com/NVIDIA/NemoClaw/blob/main/CONTRIBUTING.md).
- [ ] I have read and followed the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md).
(for doc-only changes — N/A)

### Code Changes

- [x] Formatters applied — `hadolint Dockerfile.base` clean. No
JS/TS/Python files touched.
- [x] Tests added or updated for new or changed behavior — N/A. The
existing `service-env.test.js` already asserts the `GNUPGHOME` redirect
introduced in NVIDIA#1121; this PR makes the corresponding binary available so
those assertions reflect a runtime that actually works. A new test that
spawns `gpg` directly inside a container would arguably be worth a
follow-up (it would have caught this gap originally), but it's a
separate concern from this one-line install fix.
- [x] No secrets, API keys, or credentials committed.
- [ ] Doc pages updated for any user-facing behavior changes — N/A. The
bug report describes the expected behavior; this PR just makes runtime
match it. No docs claim gpg is unavailable.

### Doc Changes

- N/A (no doc changes)

---
Signed-off-by: T Savo <evilgenius@nefariousplan.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
  * Base system image now includes GnuPG as a pinned OS package.

* **Bug Fixes / Security**
* GnuPG runtime directory is now created in a separate step with
stricter permissions and sandbox ownership when applicable, reducing
exposure.

* **Tests**
* Test suite updated to verify the new directory creation and
permission/ownership behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: T Savo <evilgenius@nefariousplan.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: Prekshi Vyas <34834085+prekshivyas@users.noreply.github.com>
…1438) (NVIDIA#1655)

## Summary

Closes NVIDIA#1438.

`nemoclaw-blueprint/blueprint.yaml` referenced the sandbox image by the
mutable `:latest` tag:

```yaml
image: "ghcr.io/nvidia/openshell-community/sandboxes/openclaw:latest"
```

A registry compromise or accidental force-push to `:latest` would
silently swap the sandbox image without any blueprint-side change. The
top-level `digest:` field in the same file is documented as *"Computed
at release time"* but is empty in main, so there's no integrity backstop
today.

## Fix

Pin the image by digest:

```yaml
image: "ghcr.io/nvidia/openshell-community/sandboxes/openclaw@sha256:b3d832b596ab6b7184a9dcb4ae93337ca32851a4f93b00765cc12de26baa3a9a"
```

The digest above was the resolved sha256 of `:latest` at the time of
this commit, fetched directly from
`ghcr.io/v2/nvidia/openshell-community/sandboxes/openclaw/manifests/latest`
via the standard OCI registry API.

A new regression test in `test/validate-blueprint.test.ts` walks the
parsed blueprint, finds `components.sandbox.image`, and asserts:
- the image string contains `@sha256:` (digest pin present)
- the image string does NOT match a `:latest` suffix
- the digest is a 64-hex-char sha256

## Test plan

- [x] `vitest run test/validate-blueprint.test.ts` — 28/28 pass
- [x] **Negative case verified**: stashed the YAML fix, re-ran the test
against the unfixed `:latest` blueprint. The new `regression NVIDIA#1438` test
correctly **fails** with a `digest pin missing` assertion error, proving
the test catches the regression.
- [x] **Positive case verified**: re-applied the YAML fix, re-ran. All
28 tests pass.

## Honest scope and follow-up needed

This PR is a "stop the bleeding now" fix. It closes the immediate
supply-chain hole but creates a small maintenance burden: the static
digest needs to be bumped whenever a new sandbox image is published.

**Recommended follow-up (out of scope here):** add release tooling (CI
workflow or `scripts/update-blueprint-digest.sh`) that resolves the
current `:latest` digest from `ghcr.io` and rewrites both the `image:`
line and the top-level `digest:` field in `blueprint.yaml` on every
release. The infrastructure for this doesn't exist today (the top-level
`digest:` field's "Computed at release time" comment describes intent
that was never implemented).

Until that tooling lands, contributors bumping the sandbox image will
need to update this digest by hand. The new regression test makes that
hand-update obvious — anyone who lets it lapse back to `:latest` or any
other mutable tag fails CI.

## Why digest pinning over version tags

The `ghcr.io/nvidia/openshell-community/sandboxes/openclaw` registry
currently exposes only commit-SHA tags (e.g. `21aa171`, `436b8c9`) and
`latest`. There is no semver tag like `v0.0.10` to pin to. A digest is
the only integrity-preserving reference available without changing how
sandbox images are published.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Switched the sandbox container reference to an immutable image digest
and aligned the blueprint's top-level digest for reproducible
deployments; added inline notes to keep them synchronized.

* **Tests**
* Added regression tests to enforce immutable sandbox image references
and to ensure the blueprint-level digest matches the sandbox image
digest.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: ColinM-sys <cmcdonough@50words.com>

---------

Signed-off-by: ColinM-sys <cmcdonough@50words.com>
## Summary

- Introduces an `agents/` directory structure for agent-specific
artifacts (Dockerfiles, startup scripts, network policies, plugins)
- Adds agent selection step to the onboard wizard (`--agent hermes` or
interactive prompt)
- Includes a working Hermes Agent integration as the first non-OpenClaw
agent

## What this enables

NemoClaw can now orchestrate different AI agents inside OpenShell
sandboxes. The onboard flow lets users choose which agent to run, and
routes to the correct container image, network policy, and agent setup
based on that choice.

## Test plan

- [ ] `nemoclaw onboard --agent hermes` completes end-to-end
- [ ] `nemoclaw onboard` (no flag) shows agent selection prompt,
defaults to OpenClaw
- [ ] Hermes health endpoint responds inside sandbox
- [ ] OpenClaw path is unaffected (regression)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added support for Hermes Agent (Nous Research) as an alternative
sandbox option alongside OpenClaw.
* Introduced `--agent <name>` flag to `nemoclaw onboard` for selecting
the desired agent during setup.
* Agent selection is now persisted in user sessions and sandbox
configurations.

* **Updates**
  * Onboarding flow expanded to 8 steps to accommodate agent selection.
* Gateway health monitoring and recovery now adapt based on the selected
agent.
  * Updated messaging in warnings and logs to be agent-agnostic.

* **Tests**
  * Added comprehensive end-to-end tests for Hermes Agent integration.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
## Summary

- \`ensure_nemoclaw_shim()\` creates a self-referential shell script
when \`npm_config_prefix=\$HOME/.local\` (set in the brev E2E
environment), because \`cli_path\` and \`shim_path\` resolve to the same
file (\`\$HOME/.local/bin/nemoclaw\`)
- The shim overwrites the real npm-linked binary with a wrapper that
\`exec\`s itself → infinite loop on invocation
- \`is_real_nemoclaw_cli()\` (added in NVIDIA#1606) correctly detected the
broken binary but misidentified it as the npm placeholder package →
\`npm uninstall -g nemoclaw\` → binary gone → install fails

**Fix:** Add a \`-ef\` guard before writing the shim. If \`cli_path\`
and \`shim_path\` refer to the same file, the binary is already in place
— return early.

**Why NVIDIA#1606 didn't catch this:** All local/unit tests use the default
npm prefix, which differs from \`NEMOCLAW_SHIM_DIR\`. The self-reference
only occurs when \`npm_config_prefix=\$HOME/.local\` is explicitly
exported — exclusive to the brev E2E runner. Before NVIDIA#1606 the broken
shim was silently accepted (no verification step existed).

Fixes regression from NVIDIA#1606, caught by E2E brev run
[24216985741](https://github.com/NVIDIA/NemoClaw/actions/runs/24216985741/job/70699774556).

## Test plan

- [x] Local test suite passes (\`npm test\`) — 12 pre-existing failures
in \`runtime-shell.test.js\` unchanged
- [ ] E2E brev CI passes Phase 2 (install.sh) on this PR

Signed-off-by: Dongni Yang <dongniy@nvidia.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…NVIDIA#1684)

## Summary

Add JSON Schema validation for all in-repo NemoClaw configuration files,
catching structural errors (missing fields, wrong types, unknown keys)
before review. Closes NVIDIA#553.

## Related Issue

Closes NVIDIA#553

## Changes

- `schemas/` — four JSON Schema files for `blueprint.yaml`,
`openclaw-sandbox.yaml`, policy presets (`presets/*.yaml`), and
`openclaw.plugin.json`
- `scripts/validate-configs.ts` — validation script with `OK:`/`FAIL:`
per-file output and field-level error messages (property path +
offending key)
- `test/validate-config-schemas.test.ts` — 22 Vitest tests (positive
validation + negative cases per schema)
- `basic-checks` CI step — runs on every PR and push to main via the
existing composite action
- `.pre-commit-config.yaml` — `validate-config-schemas` prek hook at
priority 10, triggers on changes to blueprint, policies, or schemas
- `package.json` — `ajv` devDependency + `npm run validate:configs`
convenience script

## Type of Change

- [x] Code change for a new feature, bug fix, or refactor.

## Testing

- [x] `npx prek run --all-files` passes (or equivalently `make check`).
- [x] `npm test` passes — 25 schema validation tests.
- [x] `npm run validate:configs` passes all 13 config files on current
branch.
- [x] Regression verified: restoring the pre-fix `blueprint.yaml` state
(`protocol: rest` without `rules`) triggers a field-level FAIL with exit
code 1 as expected.

## Checklist

### Code Changes

- [x] Formatters applied — `npx prek run --all-files` auto-fixes
formatting (or `make format` for targeted runs).
- [x] Tests added or updated for new or changed behavior.
- [x] No secrets, API keys, or credentials committed.
- [x] Doc pages updated for any user-facing behavior changes (new
commands, changed defaults, new features, bug fixes that contradict
existing docs).

---
Signed-off-by: Dongni Yang <dongniy@nvidia.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
…files (NVIDIA#188)

## Rationale
Auth profiles containing sensitive credentials were not consistently
protected with secure permissions across all environments.

## Changes
Implemented a recursive `find` command to enforce 600 permissions on all
`auth-profiles.json` files within the `.openclaw` directory.

## Verification Results
- [x] **Automated Tests**: Passed all 52 core tests via `npm test`.
- [x] **Manual Audit**: Verified 600 mode on multiple profile files.
- [x] **Security Review**: Verified correct permission enforcement.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Enforced restrictive file permissions for auth profiles and config
files across CLI scripts and migration flows, with resilient handling to
avoid script failures.
* **Documentation**
  * Updated the CLI reference link in the README’s Key Commands section.
* **Tests**
  * Expanded filesystem test mocks to cover permission-change behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Carlos Villela <cvillela@nvidia.com>
## Summary
- Installer `print_done()` hardcoded "Your OpenClaw Sandbox is live" and
`openclaw tui` regardless of which agent was onboarded. Now reads the
agent from the onboard session and adjusts the completion message
accordingly — non-OpenClaw agents get their own name and no `openclaw
tui` hint.
- Adds missing `agent_setup` step to `defaultSteps()` in
`onboard-session.ts`. Without it, `markStepStarted("agent_setup")`
silently no-ops because the step key doesn't exist in the session,
making the Hermes setup step invisible to resume logic and session
tracking.

## Test plan
- [ ] `NEMOCLAW_INSTALL_REF=main NEMOCLAW_AGENT=hermes curl -fsSL
.../install.sh | bash` — verify completion message says "Your Hermes
Sandbox is live" and does not show `openclaw tui`
- [ ] Same flow with default OpenClaw agent — verify message still says
"Your OpenClaw Sandbox is live" with `openclaw tui`
- [ ] `npm test` passes (1277 tests, 0 failures)

Closes NVIDIA#1618 (partial — installer integration for multi-agent onboard)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Onboarding completion messages now show the configured agent name
(defaults to OpenClaw).
  * Onboarding session now tracks an "agent setup" step.
* Next-step instructions are tailored per agent and omitted when not
applicable.

* **Bug Fixes / Improvements**
* Installer skips pre-extraction for non-OpenClaw agents and updates
progress text to reflect preparing agent dependencies.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…VIDIA#1719)

## Summary
- Remove `github` and `github_rest_api` from baseline network policy
docs (now opt-in preset, NVIDIA#1660)
- Add `github` preset to presets tables in customize-network-policy.md
and best-practices.md
- Add `brave` and `brew` presets to best-practices.md
- Update `huggingface` description (download-only + inference router,
NVIDIA#1663)
- Update `npm` and `pypi` descriptions (GET-only, publishing blocked,
NVIDIA#1672)
- Fix binary-scoping example to reference `github` as a preset, not
baseline
- Add `docs/.docs-skip` exclusion file for suppressing docs on
experimental/unreleased features
- Update `nemoclaw-contributor-update-docs` skill with skip-features,
skip-terms, and agent matrix filtering rules
- Regenerate `nemoclaw-user-*` skills from updated docs

## Test plan
- [x] `make docs` builds without warnings
- [x] All pre-commit and pre-push hooks pass
- [x] Skip-term scan passes (zero violations in branch diff)
- [ ] Verify rendered pages in docs site preview

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Introduced skip-list configuration for controlling documentation
generation based on commit patterns and excluded terminology.

* **Documentation**
* Updated network policy presets with new `github`, `brave`, and `brew`
options.
* Refined preset descriptions for `huggingface`, `npm`, and `pypi` to
clarify access scopes.
* Clarified that GitHub access is no longer baseline and requires
applying the `github` preset during onboarding.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Problem

PR NVIDIA#746 (merged) made `prek` optional in the prepare script, but
introduced a regression: when `node_modules/@j178/prek` directory exists
but the `prek` binary is not in PATH, the script errors with `exit 1`.

This happens in practice when running `npm install --omit=dev` (as the
NemoClaw install script does) — the package directory may be present
from a previous full install, but the binary won't be linked to PATH.

Reported by @DanTup on a clean Ubuntu VM:
NVIDIA#746 (comment)

## Root Cause

The prepare script tried to distinguish between "prek intentionally
absent" and "prek should be here but isn't" by checking for the
`node_modules/@j178/prek` directory. However, this distinction is
unreliable — the directory can exist without the binary being available
(e.g., after `--omit=dev` with a stale `node_modules`).

## Fix

Remove the `elif [ -d node_modules/@j178/prek ]` branch entirely. If
`prek` is not in PATH, simply skip hook setup regardless of whether the
package directory exists.

**Before:**
```bash
if [ -d .git ]; then
  if command -v prek >/dev/null 2>&1; then prek install;
  elif [ -d node_modules/@j178/prek ]; then echo "ERROR: ..." && exit 1;  # ← BUG
  else echo "Skipping git hook setup (prek not installed)";
  fi
fi
```

**After:**
```bash
if [ -d .git ]; then
  if command -v prek >/dev/null 2>&1; then prek install;
  else echo "Skipping git hook setup (prek not installed)";
  fi
fi
```

## Test Scenarios

| Scenario | Expected | Result |
|----------|----------|--------|
| `.git` exists + `prek` in PATH | Runs `prek install` | ✅ |
| `.git` exists + `prek` not in PATH + `node_modules/@j178/prek` exists
| Skip (no error) | ✅ (was ❌ `exit 1`) |
| `.git` exists + `prek` not in PATH + no `node_modules` | Skip | ✅ |
| No `.git` directory | Skip | ✅ |

Fixes regression from NVIDIA#746.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Simplified the installation preparation script to streamline git hook
setup. When the required tooling isn't available, the script now
gracefully skips hook configuration (and logs a skip) instead of
failing, reducing setup friction for users.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
…IA#1722)

## Summary
- Restore config overrides, SOUL.md, NemoClaw plugin, and dashboard
message that were incorrectly removed in 53a440b ("strip invented
features")
- Symlink SOUL.md from immutable .hermes/ dir to writable
.hermes-data/memories/ so Hermes's ensure_hermes_home() doesn't crash on
PermissionError
- Export HERMES_HOME to .bashrc/.profile via the proxy snippet so
`hermes chat` uses the writable data directory

## Test plan
- [x] `hermes chat` works interactively inside the sandbox
- [x] NemoClaw plugin banner shows on session start
- [x] `nemoclaw_status` tool available in chat
- [x] Config includes terminal, agent, memory, skills, display sections
- [ ] Nightly E2E hermes-e2e job passes

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added sandbox management tools for querying system status and
configuration information.
* Integrated startup banner with system metadata injection during
session initialization.

* **Improvements**
* Enhanced gateway messaging with connection instructions for compatible
clients.
* Applied explicit default configuration values for core services
(terminal, agent, memory, skills, display).

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
## Summary
Remove the legacy `bin/lib` compatibility wrappers that no longer carry
behavior, and point the CLI/tests at the compiled `dist` output or local
TS modules instead. This trims the remaining root-level JavaScript
surface while keeping the launcher and the two wrappers that still
provide real compatibility behavior.

## Changes
- delete 18 obsolete `bin/lib/*.js` shims
- keep `bin/nemoclaw.js`, `bin/lib/usage-notice.js`, and
`bin/lib/credentials.js`
- retarget CLI source from `bin/lib/*` to local `src/lib/*` modules
where safe
- retarget scripts/tests/E2E helpers from removed shims to `dist/lib/*`
- include `dist/` in the published package
- update contributor/agent docs to reflect the new CLI layout

## Type of Change
- [x] Code change for a new feature, bug fix, or refactor.
- [ ] Code change with doc updates.
- [ ] Doc only. Prose changes without code sample modifications.
- [ ] Doc only. Includes code sample changes.

## Testing
- [x] `npx prek run --all-files` passes (or equivalently `make check`).
- [x] `npm test` passes.
- [ ] `make docs` builds without warnings. (for doc-only changes)

## Checklist

### General

- [x] I have read and followed the [contributing
guide](https://github.com/NVIDIA/NemoClaw/blob/main/CONTRIBUTING.md).
- [ ] I have read and followed the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md).
(for doc-only changes)

### Code Changes
- [x] Formatters applied — `npx prek run --all-files` auto-fixes
formatting (or `make format` for targeted runs).
- [x] Tests added or updated for new or changed behavior.
- [x] No secrets, API keys, or credentials committed.
- [x] Doc pages updated for any user-facing behavior changes (new
commands, changed defaults, new features, bug fixes that contradict
existing docs).

---
Signed-off-by: Carlos Villela <cvillela@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* CLI core migrated to TypeScript and internal launch paths
consolidated; legacy compatibility shims removed.
* **Chores**
  * Published package now includes compiled build artifacts.
* **Tests**
* Tests and helper scripts updated to run against compiled distribution.
* **Documentation**
* Docs and contribution guidance updated; CI guard added to prevent
reintroducing removed shims.
* **Notes**
  * No public APIs or user-facing behavior changed.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: Aaron Erickson 🦞 <aerickson@nvidia.com>
…chat restriction bypass (NVIDIA#897)

Telegram chat restriction bypass.

<!-- markdownlint-disable MD041 -->
## Summary
The current implementation fails to propagate the `ALLOWED_CHAT_IDS`
environment variable to the Telegram bridge child process. This results
in a security bypass where any user can interact with the bot regardless
of the whitelist settings.

Steps to reproduce:
```
$ env | grep ALLOWED
ALLOWED_CHAT_IDS=<redacted>
$ nemoclaw start
[services] telegram-bridge started (PID 42541)
...
$ ps -wwp 42541 -E | grep ALLOWED | echo "missed"
missed
```

## Related Issue
[Bug: ALLOWED_CHAT_IDS doesn't
work](NVIDIA#896)

## Changes
* `ALLOWED_CHAT_IDS` env var propagation

## Type of Change
- [x] Code change for a new feature, bug fix, or refactor.
- [ ] Code change with doc updates.
- [ ] Doc only. Prose changes without code sample modifications.
- [ ] Doc only. Includes code sample changes.

## Testing
- [x] `npx prek run --all-files` passes (or equivalently `make check`).
- [x] `npm test` passes.
- [x] `make docs` builds without warnings. (for doc-only changes)

## Checklist

### General

- [x] I have read and followed the [contributing
guide](https://github.com/NVIDIA/NemoClaw/blob/main/CONTRIBUTING.md).

### Code Changes
- [x] Formatters applied — `npx prek run --all-files` auto-fixes
formatting (or `make format` for targeted runs).
- [ ] Tests added or updated for new or changed behavior.
- [x] No secrets, API keys, or credentials committed.
- [ ] Doc pages updated for any user-facing behavior changes (new
commands, changed defaults, new features, bug fixes that contradict
existing docs).


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added support for configuring allowed Telegram chat IDs in deployment
credentials. This credential is automatically included in the deployment
environment only when a Telegram bot token is configured.

* **Tests**
  * Added test coverage for Telegram credential handling logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
…IA#1653)

## Summary

This draft PR advances the Discord/native messaging remediation work in
NemoClaw without claiming that native Discord is fully fixed end to end
yet.

It also folds in the analogous Telegram doctor/config follow-up so
rebuilt sandboxes match the current account-based channel schema and
avoid a misleading empty group allowlist warning.

It now also addresses the repo-owned portion of NVIDIA#1692 by removing
deprecated `tls: terminate` directives from the messaging policy
presets.

It does five repo-owned things:

1. narrows and supersedes the communication preset fix from NVIDIA#1084 by
adding only the minimal additional Node binary coverage
2. fixes onboarding so messaging policy suggestions follow the current
messaging selection rather than stale stored credentials
3. fixes the baked Discord account schema at image build time, adds
build-time Discord guild workspace config, and improves
troubleshooting/E2E validation so baked-config failures are separated
from native Discord gateway failures
4. aligns baked Telegram config with the same doctor-expected account
layout and makes Telegram group handling explicit so default group
delivery does not silently drop messages
5. removes deprecated TLS-termination syntax from the Discord, Slack,
and Telegram presets so NemoClaw does not keep shipping policy warnings
on current OpenShell builds

## Why this PR exists

The current evidence suggests the remaining Discord break is not just a
missing onboarding step:

- the sandbox has the Discord policy applied
- the Discord provider is attached
- command registration `PUT` requests are allowed at runtime
- the bot can authenticate against `discord.com`
- the bot can read channel history and send messages back into Discord
- current NemoClaw onboarding only baked token/account config, not the
Discord guild config that OpenClaw documents for server-channel autonomy

What still appears broken is the autonomous inbound side of native
Discord handling. In current testing, OpenClaw can read and write
Discord successfully, but incoming `@mention` messages do not appear to
wake the agent in real time. After baking the documented guild config
into the sandbox, the failure still reproduces. Direct diagnosis inside
the sandbox shows that:

- the baked `channels.discord.guilds.<serverId>` config is present and
correct
- Node DNS lookup for `gateway.discord.gg` succeeds
- a plain Node `WebSocket('wss://gateway.discord.gg/...')` succeeds
inside the same sandbox
- OpenClaw's Discord gateway client still loops with `AggregateError`
and close code `1006`

That makes the remaining issue look narrower than generic connectivity:
likely the live Discord gateway client path in the pinned OpenClaw
runtime (`2026.3.11`) under the OpenShell proxy environment.

This PR tightens NemoClaw’s side of the system so:

- the sandbox policy and onboarding behavior are correct
- the baked Discord config matches current OpenClaw schema expectations
- Discord server-workspace config can now be baked at image build time
instead of being missing from NemoClaw onboarding entirely
- onboarding can ask whether the bot should reply only to `@mentions` or
to all messages in the configured server
- the Discord user allowlist is optional, so leaving it blank allows any
member of the configured server to message the bot
- Telegram single-account config is already baked in the
`accounts.default` layout that `openclaw doctor` now expects
- Telegram group chats default to `groupPolicy: open`, so an empty DM
allowlist does not produce silent group-message drops
- messaging policy presets no longer pin deprecated `tls: terminate`
directives on current OpenShell builds
- users are not sent toward dead-end config repair steps
- tests explicitly probe the Discord gateway path instead of only REST

## Changes

- `bin/lib/onboard.js`
  - add `getSuggestedPolicyPresets()`
- make messaging-related policy suggestions respect explicit
`enabledChannels`
  - persist messaging channel selection through onboarding/resume flow
- `nemoclaw-blueprint/policies/presets/{discord,telegram,slack}.yaml`
  - add `/usr/bin/node` alongside `/usr/local/bin/node`
- remove deprecated `tls: terminate` directives and rely on current
OpenShell automatic TLS handling
- intentionally do **not** broaden to `curl` or `bash` in these narrow
presets
- `docs/reference/troubleshooting.md`
- document that `openclaw doctor --fix` cannot repair baked sandbox
channel config
- add guidance for distinguishing baked-config issues from native
Discord gateway issues
- add guidance for the analogous Telegram doctor migration/warning path
- `Dockerfile`
- bake messaging accounts under `channels.<provider>.accounts.default`
instead of `accounts.main`
- align the image with the schema `openclaw doctor` currently expects,
so this migration is handled at build time rather than failing inside
the immutable sandbox
- add a build-time `NEMOCLAW_DISCORD_GUILDS_B64` path so Discord guild
allowlist config can be baked into `openclaw.json`
  - set Telegram `groupPolicy: open` in the baked default account config
- `bin/lib/onboard.js`
  - prompt for Discord Server ID and User ID during messaging setup
- bake `channels.discord.guilds.<serverId>` config for the selected
server(s)
  - prompt for mention-only vs all-message replies in guild channels
- treat the Discord user allowlist as optional; leaving it blank allows
any member of the configured server
- keep `requireMention: true` as the default so onboarding remains safe
for shared servers
- `test/e2e/test-messaging-providers.sh`
- add a classified native Discord gateway probe separate from REST
reachability
- accept both `accounts.default` and `accounts.main` during the
transition so older images still validate
  - assert baked Telegram `groupPolicy: open`
- `test/onboard.test.js`
- add coverage for policy suggestions following current messaging
selection
  - add coverage for the baked Discord guild workspace build arg
- add coverage for both mention-only guild config and
open-to-all-members guild config
- `test/policies.test.js`
  - add coverage for `/usr/bin/node` in communication presets
- add coverage that messaging presets no longer ship deprecated `tls:
terminate`
- `.agents/skills/nemoclaw-user-reference/references/troubleshooting.md`
  - regenerated from docs

## Related issues

- supersedes NVIDIA#1084 for the preset change
- addresses NVIDIA#1610 from the NemoClaw side
- addresses NVIDIA#1692 from the NemoClaw side
- provides clearer handling for NVIDIA#606
- improves investigation coverage for NVIDIA#585 and NVIDIA#1570

## Validation

Completed:

- `git diff --check`
- `bash -n test/e2e/test-messaging-providers.sh`
- `rg -n "tls: terminate"
nemoclaw-blueprint/policies/presets/{telegram,discord,slack}.yaml
test/policies.test.js`
- `node -c bin/lib/onboard.js`
- manual host-side verification that the live sandbox has:
  - the Discord policy applied
  - the Discord provider attached
  - allowed Discord command-registration `PUT` traffic at runtime
- successful Discord read/write behavior from the bot in a real server
channel
- a remaining failure mode where direct `@mention` traffic is only
observed when OpenClaw is prompted from the TUI, rather than waking
autonomously in real time
- a correctly baked Discord guild config inside the sandbox after
redeploy
- successful `dns.lookup('gateway.discord.gg')` and raw Node WebSocket
connectivity from inside the sandbox
- continued OpenClaw Discord gateway failure with `AggregateError` and
WebSocket close code `1006`

Not completed in this environment:

- full `vitest` runs were attempted but hung without producing output in
this sandboxed session

## Current limitation

This PR does **not** yet claim a full end-to-end native Discord fix.

The remaining evidence points to the live Discord gateway client path
inside OpenClaw rather than to NemoClaw config. The next step is to
inspect or patch the pinned OpenClaw runtime behavior and determine
whether that final fix can be applied from NemoClaw or requires upstream
OpenClaw changes.

## Type of change

- [x] Code change for a bug fix or refactor
- [x] Doc updates
- [x] Tests updated


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Discord onboarding prompts for Server ID, reply mode (@mentions-only
or all messages) and optional user allowlist; messaging selections are
persisted into the sandbox image.

* **Documentation**
* Clarified Telegram allowlist: enforced for DMs only; group chats
remain open by default (prevents rebuilt sandboxes from dropping group
messages).
* Added troubleshooting for baked channel configs and Discord gateway
diagnostics.

* **Policy Changes**
* Messaging network policy TLS handling adjusted and Node binary paths
expanded for messaging providers.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Test User <test@example.com>
Co-authored-by: Aaron Erickson 🦞 <aerickson@nvidia.com>
The validate-config-schemas test used new URL().pathname which
URL-encodes spaces as %20, breaking on paths like "Claude Code Projects".

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Sphinx requires every document to be in a toctree — the CI doc build
fails with -W (warnings as errors) without this entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The referenced doc does not exist yet, causing Sphinx -W to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Frontmatter description: flat string → main/agent structure
- Em dash → colon in title.page and H1
- Passive voice → active ("Skips malformed lines")
- {doc} cross-ref → relative link in docs, skill pointer in generated file
- Regenerated skill files via docs-to-skills.py
@brandonpelfrey brandonpelfrey self-assigned this Apr 14, 2026
@github-actions
Copy link
Copy Markdown

Brev E2E (full): FAILED on branch feat/audit-chainSee logs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement: feature Use this label to identify requests for new capabilities in NemoClaw. priority: high Important issue that should be resolved in the next release security Something isn't secure status: rebase PR needs to be rebased against main before review can continue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: tamper-evident audit chain for sandbox events