Skip to content

fix(setup/onecli): restrict admin API and postgres to loopback after OneCLI install#2434

Draft
glifocat wants to merge 2 commits into
nanocoai:mainfrom
glifocat:harden-onecli-binds
Draft

fix(setup/onecli): restrict admin API and postgres to loopback after OneCLI install#2434
glifocat wants to merge 2 commits into
nanocoai:mainfrom
glifocat:harden-onecli-binds

Conversation

@glifocat
Copy link
Copy Markdown
Collaborator

Type of Change

  • Feature skill - adds a channel or integration (source code changes + SKILL.md)
  • Utility skill - adds a standalone tool (code files in .claude/skills/<name>/, no source changes)
  • Operational/container skill - adds a workflow or agent skill (SKILL.md only, no source changes)
  • Fix - bug fix or security fix to source code
  • Simplification - reduces or simplifies source code
  • Documentation - docs, README, or CONTRIBUTING changes only

Description

Closes #2433. References onecli/onecli#268 and onecli/onecli#263.

OneCLI's installer auto-picks the docker0 bridge IP as ONECLI_BIND_HOST on bare-metal Linux, and its docker-compose.yml uses that single variable for all three published ports — the proxy gateway (:10255), the admin API (:10254), and Postgres (:5432). The proxy needs to be on the bridge so agent containers can reach it; the admin API and Postgres do not. The net effect is that on a default NanoClaw install on a Linux host, every agent container can read every agent access token via unauthenticated HTTP at http://<bridge-ip>:10254/api/agents, and can also connect to Postgres directly with the documented default credentials.

This PR adds a small post-install step in setup/onecli.ts that rewrites the generated compose file to pin :10254 and :5432 to 127.0.0.1, leaving :10255 on whatever bind the installer chose. The proxy stays reachable from agent containers; the admin API and Postgres become loopback-only.

Filed upstream as onecli/onecli#268 with proposed fixes for the installer + compose template. This downstream patch protects NanoClaw users in the meantime, and remains a useful defense-in-depth step even after upstream lands their fix.

What the patch does

  • New hardenOneCliBinds() does a targeted regex rewrite of the upstream compose, replacing the ${ONECLI_BIND_HOST:-127.0.0.1} prefix with a literal 127.0.0.1 on just the admin and Postgres port lines.
  • Writes a marker comment (# nanoclaw: admin+postgres pinned to loopback (onecli/onecli#268)) at the top of the file so a re-run is a no-op and the source of the change is discoverable.
  • If the upstream compose layout changes in a way we don't recognize, the function warns and leaves the file untouched — fail-loud-but-safe rather than corrupt the file.
  • Calls docker compose up -d to reconcile running containers. Failure here is non-fatal; the rewrite has already landed and a subsequent compose-up will pick it up.
  • --reuse mode does not rewrite (other apps may depend on the existing bind), but it does read ~/.onecli/.env, detect a non-loopback ONECLI_BIND_HOST, and emit a warning + UNSAFE_BIND in the status block so operators see the exposure.
  • emitStatus('ONECLI', ...) gains HARDENED + HARDEN_NOTE keys for observability.

Tests

setup/onecli.test.ts adds four unit tests that operate on a tempdir copy of the upstream compose shape — no docker calls, no real OneCLI install:

  • vanilla upstream compose → admin + postgres rewritten to 127.0.0.1, gateway left untouched, marker prepended
  • already-patched compose → no-op (reason: already_patched)
  • unrecognized layout → no-op + warn (reason: layout_unrecognized), file untouched
  • missing file → no-op + warn (reason: compose_missing)

All 327 tests pass locally; typecheck clean.

Why a regex instead of a YAML parser

The upstream compose is small, hand-written, and the lines we care about are exact-match strings. A regex against those exact strings catches drift loudly (we warn and no-op), where a YAML parser would happily round-trip a renamed key and silently do nothing. The marker comment is the durable signal anyone reading the file can find.

Scope

This patch is intentionally narrow. It does not change the OneCLI auth surface (that's upstream #263), and it does not touch the --remote-url path (those installs don't run a local gateway).

For Skills

  • SKILL.md contains instructions, not inline code (code goes in separate files)
  • SKILL.md is under 500 lines
  • I tested this skill on a fresh clone

@glifocat glifocat added follows-guidelines PR was created using the current contributing template PR: Fix Bug fix Priority: High High priority issue labels May 12, 2026
@glifocat glifocat marked this pull request as draft May 12, 2026 17:42
…ECLI_BIND_HOST through compose-up, detect via docker inspect

- hardenOneCliBinds now also rewrites ~/.onecli/config.json api-host and
  ~/.onecli/.env ONECLI_URL when they point at the bridge IP on :10254,
  so the host admin connection survives the loopback pin.
- applyHardenedCompose detects the gateway bind via docker inspect of the
  running :10255 port (fallback: ~/.onecli/.env) and passes ONECLI_BIND_HOST
  in the subprocess env so the proxy stays reachable from agent containers.
- detectUnsafeOneCliBinds now prefers docker inspect of :10254 as truth and
  falls back to ~/.onecli/.env, catching shell-set values that .env misses.
- Adds 15 new unit tests covering the config.json/.env rewrites,
  swapHostInAdminUrl, and detection paths via injected inspectFn.
@glifocat
Copy link
Copy Markdown
Collaborator Author

Pushed b0cdcee addressing the review. Summary of what changed:

Must-fix 1 — stale admin URLs after the loopback pin

hardenOneCliBinds() now also rewrites the OneCLI installer's two host-side admin URL stashes:

  • ~/.onecli/config.json — JSON-parse, swap api-host from http://<bridge-ip>:10254... to http://127.0.0.1:10254... via swapHostInAdminUrl(), write back with a trailing newline.
  • ~/.onecli/.env — line-level regex rewrites ONECLI_URL=... (handles quoted/unquoted, path suffixes) but only when host is non-loopback and port is :10254.

Both are best-effort — if either file is missing or unparseable, we log and continue (the security fix on the compose file has already landed). The gateway port :10255 is explicitly excluded from both rewrites.

The call site in run() also swaps the host portion of the URL extracted from onecli install stdout before passing it down to emitStatus/writeToEnv, so the env block ends up consistent with what's in config.json.

Must-fix 2 — docker compose up -d losing ONECLI_BIND_HOST

applyHardenedCompose() now resolves the gateway bind host first via detectGatewayBindHost():

  1. docker inspect onecli --format '{{json .NetworkSettings.Ports}}' and read HostIp for 10255/tcp — runtime truth.
  2. Fall back to ~/.onecli/.env ONECLI_BIND_HOST.
  3. Fall back to 127.0.0.1.

Then passes ONECLI_BIND_HOST=<resolved> in the subprocess env to docker compose up -d, so the unchanged ${ONECLI_BIND_HOST:-127.0.0.1} substitution on the :10255 line resolves to the original bridge IP — agent containers keep reaching the proxy.

Nit — detectUnsafeOneCliBinds() false negatives

Reordered the detection to call docker inspect of the :10254/tcp binding first, then fall back to ~/.onecli/.env. This catches shell-set ONECLI_BIND_HOST (where the installer's .env wouldn't reflect it) and also works on hosts where .env doesn't exist at all (which turns out to be the affected-box case).

About the marker on an already-patched compose

The marker line is the idempotency signal — a re-run sees it and returns { changed: false, reason: 'already_patched' }, so neither the compose nor the admin-URL stashes get rewritten on the next install. That means on a box that's been manually patched without the marker, the first nanoclaw setup will append the marker and rewrite anything still stale; subsequent runs are no-ops.

Tests

setup/onecli.test.ts expanded from 4 to 19 tests, organized into three describe blocks:

  • hardenOneCliBinds (9 tests): vanilla compose rewrite, marker idempotency, unrecognized layout no-op, missing file no-op, config.json :10254 rewrite, config.json :10255 preservation, config.json already-loopback no-op, .env ONECLI_URL :10254 rewrite, .env :10255 preservation, missing config+.env no-op.
  • swapHostInAdminUrl (4 tests): admin-URL swap with/without path, loopback unchanged, gateway URL unchanged, non-URL inputs unchanged.
  • detectUnsafeOneCliBinds (5 tests): non-loopback via inspect, loopback via inspect, inspect-fail → .env fallback, inspect-fail + loopback .env, inspect-fail + missing .env.

All tests use an injected inspectFn and tempdir paths — no real docker calls, no real OneCLI install. Full suite: 342/342 pass, typecheck clean.

cc @nanocoai/maintainers — ready for another pass when you have time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

follows-guidelines PR was created using the current contributing template PR: Fix Bug fix Priority: High High priority issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(setup/onecli): restrict OneCLI admin API and Postgres to loopback after install

1 participant