feat/improve-dockerfile-and-run.sh#5

Open
WolframRavenwolf wants to merge 19 commits into ngutman:main from WolframRavenwolf:feat/improve-dockerfile-and-run.sh

@WolframRavenwolf

Hey Nimrod, thanks for creating the Clawdbot add-on for Home Assistant! I'd like to share the improvements I made for my own setup, as they will hopefully benefit you and other users.

Dockerfile improvements:

  • Use the full image instead of the slim version to provide access to more tools (for users and the AI) and reduce the need for manual package installations (ca-certificates, curl, git, etc. are already included).
  • Install Bun via npm.
  • Include best installation options for ALL available requirements for bundled skills; users can easily comment these out individually.
  • Add optional Brew support to allow Homebrew package installation from the Dockerfile or Gateway Dashboard (commented out by default to prevent image bloat).
  • Install the GitHub and gog CLIs using Go.
  • Use "pip install" instead of "uv pip install" for "nano-pdf" and "openai-whisper" to store them ephemerally in the container like other tools, rather than persistently in the volume.
  • Separate Install (RUN) commands with clear comments to improve readability and usability, allowing users to easily comment out unneeded requirements.

run.sh improvement:

  • Enable the use of tags as branches so users can select stable release versions rather than the potentially unstable main branch.
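
A minimal sketch of how a configured ref could be resolved as a remote branch, a tag, or a commit SHA before checkout. The function name `resolve_ref` and the lookup order are assumptions for illustration, not the PR's actual run.sh code:

```shell
# Print a checkout target for REF inside REPO_DIR, or return 1 if nothing matches.
# Lookup order (assumed): remote branch, then tag, then any commit-ish such as a SHA.
resolve_ref() {
  repo_dir=$1
  ref=$2
  if git -C "$repo_dir" rev-parse --verify --quiet "refs/remotes/origin/$ref^{commit}" >/dev/null; then
    printf 'origin/%s\n' "$ref"
  elif git -C "$repo_dir" rev-parse --verify --quiet "refs/tags/$ref^{commit}" >/dev/null; then
    printf 'refs/tags/%s\n' "$ref"
  elif git -C "$repo_dir" rev-parse --verify --quiet "$ref^{commit}" >/dev/null; then
    printf '%s\n' "$ref"
  else
    return 1
  fi
}

# Example (paths are placeholders):
# target=$(resolve_ref /path/to/clone v1.2.3) && git -C /path/to/clone checkout --detach "$target"
```

Tags only resolve if they were fetched first (for example with `git fetch --tags`).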

@ngutman
Owner

ngutman commented Jan 19, 2026

@WolframRavenwolf thank you for the PR!
There are a few issues that need to be addressed:

  • Hardcoded x86_64 downloads will break the add‑on on ARM (aarch64, armv7). The Dockerfile pulls op_linux_amd64, himalaya.x86_64-linux.tgz, and openhue_Linux_x86_64.tar.gz, so those installs will fail or produce unusable binaries on non‑amd64. clawdbot_gateway/Dockerfile:65, clawdbot_gateway/Dockerfile:105, clawdbot_gateway/Dockerfile:128
  • go install ...@latest makes builds non‑reproducible and can change behavior without repo changes; this can break builds unexpectedly. Example occurrences: clawdbot_gateway/Dockerfile:74, clawdbot_gateway/Dockerfile:92
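
A minimal sketch of the kind of architecture mapping that avoids hardcoded x86_64 downloads. The function name `map_arch` and the output suffixes (amd64, arm64, arm) are assumptions; each project names its release assets differently, so the mapping would need adjusting per tool:

```shell
# Map a `uname -m` machine name to a release-asset architecture suffix.
# The suffixes are assumed; check each project's actual asset names.
map_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    armv7l|armv7)  echo arm ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Example use in a build step (the URL is a placeholder):
# arch=$(map_arch "$(uname -m)") || exit 1
# curl -fsSL "https://example.com/tool_linux_${arch}.tar.gz" -o /tmp/tool.tar.gz
```

Failing loudly on an unknown architecture is deliberate: a build that stops is easier to diagnose than an image containing an unusable binary.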

The Dockerfile also needs some cleanup of commented-out commands.
Maybe let Codex do a review pass? The add-on should work on arm64 as well, which is pretty important.

@WolframRavenwolf
Author

Thanks for taking a look!

I have addressed all the issues:

  • Did a review pass as suggested.
  • Reordered sections for better clarity and build performance.
  • Fixed x86_64 downloads to ensure compatibility with ARM (aarch64, armv7).
  • Go versions default to latest, but allow individual overrides via --build-arg or by editing the Dockerfile.
  • Cleaned up non-essential commented-out commands/sections.
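
The "default to latest, allow overrides" idea can be sketched in shell with a parameter default. The helper name `go_install_target` and the module path `example.com/tool/cmd/tool` are hypothetical; in the Dockerfile the same effect would typically come from an `ARG` with a default value that `--build-arg` overrides:

```shell
# Build a `go install` target of the form MODULE@VERSION.
# $1 = module path, $2 = optional version; an empty or missing version means "latest".
go_install_target() {
  printf '%s@%s\n' "$1" "${2:-latest}"
}

# Example (TOOL_VERSION is a hypothetical variable, e.g. fed from a build argument):
# go install "$(go_install_target example.com/tool/cmd/tool "$TOOL_VERSION")"
```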

The updates are pushed and ready for your review.

@ngutman
Owner

ngutman commented Jan 25, 2026

Review findings

  • High: Python CLI dependency conflict. Installing nano-pdf then whisper-cli downgrades shared deps (typer, rich), likely breaking earlier-installed tools. See clawdbot_gateway/Dockerfile:114 and clawdbot_gateway/Dockerfile:121.
  • Medium: Non-reproducible builds. Many installs use latest or releases/latest (Go/NPM/Pip), making builds brittle and hard to roll back. See clawdbot_gateway/Dockerfile:7 and throughout clawdbot_gateway/Dockerfile:75+.
  • Medium: Startup/build time regression. The expanded toolchain causes a long pnpm install; the container didn’t reach gateway start within 120s in my run. This could be worse on HA hardware. See clawdbot_gateway/Dockerfile overall.

If you want, I can propose fixes (pin versions, isolate Python CLIs, or split “full tools” into an optional build).

@WolframRavenwolf
Author

Good point regarding the nano-pdf and whisper-cli pip installs. Since uv is available, I prefer using a uv environment, though I must adjust the container paths so Clawdbot locates it correctly. I will look into that.

As for using "latest": we already run "apt-get update && apt-get upgrade -y", which inherently results in a non-reproducible build. Pinning package versions mandates constant Dockerfile maintenance. I prefer defaulting to "latest" and letting users pin versions individually if they require stability over currency.

Ultimately, these changes are just intended to bring your Home Assistant add-on to feature parity with a full local Clawdbot installation regarding skill support. Fully local setups allow Clawdbot to auto-install requirements for base skills, but the container lacks full persistence and extra tools like Homebrew. Pre-installing the prerequisites directly into the container solves this and makes the add-on more valuable for Clawd users - it may make it slower to start the first time, but will save users a lot more time due to ease of use.

@niemyjski
Contributor

niemyjski commented Feb 4, 2026

There are a lot of changes in this PR. I would really like the ability to define a list of packages to install in the Docker image, perhaps as a string, and likewise Go / pnpm packages, so the container could handle the installation and more of these deps / versions would move into HA configuration values. That would mean fewer changes in the Dockerfile. Right now I cannot use this image because I'm installing a bunch of apt dependencies that are not included here. As is, this feels unmaintainable given the rapid release pace. Moving image tags out to configuration and passing in more of the deps you want installed feels like the way to go.

WolframRavenwolf force-pushed the feat/improve-dockerfile-and-run.sh branch 3 times, most recently from 136baaf to 7767ebf, February 7, 2026 16:52
WolframRavenwolf and others added 2 commits February 7, 2026 17:55
Installing nano-pdf then whisper-cli with pip3 caused dependency downgrades
(typer, rich), potentially breaking earlier-installed tools.

Solution: Use 'uv tool install' for isolated virtual environments per tool,
preventing dependency conflicts while maintaining tool availability.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Enhance git checkout logic to support:
- Remote branches (via origin/branch)
- Tags (fetched with --tags)
- Commit SHAs

Maintains Upstream openclaw-only changes while adding flexible ref support.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
WolframRavenwolf force-pushed the feat/improve-dockerfile-and-run.sh branch from 7767ebf to 37f2ede, February 7, 2026 16:55
WolframRavenwolf and others added 8 commits February 7, 2026 17:56
Remove fork-specific URLs to ensure PR contains only technical changes.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Update WORKDIR from /opt/clawdbot to /opt/openclaw to match the
repository rename.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add BUN_INSTALL environment variable for bun installation path
- Add migrate.sh COPY and chmod to ensure migration script is available
- Clean up ENV section organization

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Update both README files to document all 28 included CLI tools:
- Main README: detailed list with descriptions for each tool
- Add-on README: categorized summary by tool type

Tools cover: Home Assistant, productivity, AI & media processing,
smart home control, communication, development, and utilities.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add clawhub for searching, installing, updating, and publishing
agent skills from clawhub.com.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add clawhub to both README files:
- Main README: detailed description
- Add-on README: added to Development category

clawhub enables searching, installing, updating, and publishing
agent skills from clawhub.com.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove the old clawdhub entry as it has been replaced by clawhub.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove clawdhub from both READMEs as it has been replaced by clawhub.
Tool count remains at 28.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@WolframRavenwolf
Author

@ngutman I updated the PR and resolved the Python CLI dependency conflict.

IMHO, installing the latest tool versions by default is a feature, not an issue. This ensures users receive the most current versions immediately. Users requiring older or pinned versions can set these themselves. Since the apt installs/upgrades already prevent strict reproducibility anyway, I prefer this approach. However, feel free to edit the PR if you strictly prefer pinned packages (though that means you'd have to constantly update the Dockerfile for every single tool update).

I consider the first-time startup delay a fair trade-off. User time is more valuable than build time, and having all tools immediately available is a huge plus. If you want to disable specific tools by default to save build time, feel free to comment them out; users can then enable them in their forks (though that requires more effort than simply waiting a bit longer during the initial run).

@niemyjski Customizing installed tools via the Home Assistant UI to build images dynamically is a great idea. I tried implementing it, but Home Assistant apparently only interprets options at runtime, not build time. Consequently, I reverted to the current implementation. By default, users get all requirements for the built-in tools. For anything else, they can create custom forks (and open PRs if their additions are universally useful enhancements).

I upgraded my Clawdbot to OpenClaw using this version. Everything looks good on my end, so I consider this ready to be merged.

@niemyjski
Contributor

niemyjski commented Feb 14, 2026

I also feel like we need to install a bunch of deps to make sure the default image can use agent-browser and jq (lots of things use jq for terminal parsing):

libasound2 libatk-bridge2.0-0 libatk1.0-0 libatspi2.0-0 libcairo2 libcups2 libdbus-1-3 libdrm2 libgbm1 libglib2.0-0 libnspr4 libnss3 libpango-1.0-0 libx11-6 libxcb1 libxcomposite1 libxdamage1 libxext6 libxfixes3 libxkbcommon0 libxrandr2 libcairo-gobject2 libdbus-glib-1-2 libfontconfig1 libfreetype6 libgdk-pixbuf-2.0-0 libgtk-3-0 libharfbuzz0b libpangocairo-1.0-0 libx11-xcb1 libxcb-shm0 libxcursor1 libxi6 libxrender1 libxtst6 libsoup-3.0-0 gstreamer1.0-libav gstreamer1.0-plugins-bad gstreamer1.0-plugins-base gstreamer1.0-plugins-good libegl1 libenchant-2-2 libepoxy0 libevdev2 libgles2 libglx0 libgstreamer-gl1.0-0 libgstreamer-plugins-base1.0-0 libgstreamer1.0-0 libgtk-4-1 libgudev-1.0-0 libharfbuzz-icu0 libhyphen0 libicu72 libjpeg62-turbo liblcms2-2 libmanette-0.2-0 libnotify4 libopengl0 libopenjp2-7 libopus0 libpng16-16 libproxy1v5 libsecret-1-0 libwayland-client0 libwayland-egl1 libwayland-server0 libwebp7 libwebpdemux2 libwoff1 libxml2 libxslt1.1 libatomic1 libevent-2.1-7 libavif15 xvfb fonts-noto-color-emoji fonts-unifont xfonts-scalable fonts-liberation fonts-ipafont-gothic fonts-wqy-zenhei fonts-tlwg-loma-otf fonts-freefont-ttf jq

@niemyjski
Contributor

niemyjski left a comment

I agree on latest vs. pinned given how frequently this gets updated, which makes me feel the versions should be args passed in from the HA add-on. However, I had Codex 5.3 do an audit (take it with a grain of salt), and this is what it found. (Please note that the audit also covers the deps from my previous comment.)

Security audit summary for this PR (#5):

High risk

  1. Supply-chain trust: many installs use floating latest tags (go install ...@latest, releases/latest, npm/pnpm globals) with no deterministic pinning.
  2. Download integrity: external binaries fetched via curl | tar / zip downloads without checksum/signature verification.

Medium risk

  3. Credential exposure: runtime currently injects github_token into clone URL, which can leak via process args/logging/remote URL storage.
  4. Expanded attack surface: large browser/media/system dependency footprint (GTK/WebKit/GStreamer/X11/Wayland/fonts) significantly increases CVE surface and patching burden.
  5. Reproducibility: apt-get upgrade -y + floating tool versions make builds non-reproducible and harder to roll back.

Low/Medium

  6. Ref input handling: broader branch/tag/commit checkout support increases input surface; ref validation should be strict.

Recommended remediations:

  • Pin all third-party tool versions (explicit semver/tag/SHA) and avoid latest defaults.
  • Add checksum or signature verification for downloaded artifacts.
  • Avoid embedding tokens in repo URLs; prefer authenticated headers/credential helpers.
  • Keep only required runtime packages, split optional tools, and run regular image scanning/SBOM.
  • Validate branch/ref values and reject unsafe patterns.
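
A minimal sketch of strict ref validation along the lines of the last remediation. The function name `is_safe_ref` and the exact allow-list (letters, digits, dot, underscore, slash, hyphen) are assumptions, and the check is intentionally stricter than what Git itself accepts:

```shell
# Return 0 if REF looks like a plain branch, tag, or SHA; return 1 otherwise.
# Rejects empty values, option-like values (leading "-"), ".." sequences,
# and any character outside the assumed allow-list.
is_safe_ref() {
  case "$1" in
    ''|-*|*..*|*[!A-Za-z0-9._/-]*) return 1 ;;
  esac
  return 0
}

# Example (REF is a hypothetical variable read from the add-on options):
# is_safe_ref "$REF" || { echo "invalid ref: $REF" >&2; exit 1; }
```

Rejecting a leading hyphen matters because a value such as `--upload-pack=...` would otherwise be parsed by git as an option rather than a ref.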

I checked both main and pr-5, and they both contain the same lines in the PR snapshot:

TOKEN_OPT="$(jq -r .github_token /data/options.json)"
REPO_URL="https://${TOKEN_OPT}@${REPO_URL#https://}"

It exists at line 87 in both refs (main and pr-5), so my audit point should be read as a pre-existing security issue still present in that PR, not a new diff introduced by PR #5.
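
One way to keep the token out of the remote URL is to pass it as an HTTP header through Git's environment-based configuration. This is a sketch under assumptions: the function name `set_git_auth_env` is hypothetical, it relies on `GIT_CONFIG_COUNT` support (Git 2.31 or newer), and it assumes GitHub's Basic-auth convention of `x-access-token` as the user name:

```shell
# Export Git config via environment variables so the token never appears
# in the clone URL, in .git/config, or in process arguments.
set_git_auth_env() {
  token=$1
  # Basic auth payload: "x-access-token:<token>", base64-encoded on one line.
  basic=$(printf 'x-access-token:%s' "$token" | base64 | tr -d '\n')
  export GIT_CONFIG_COUNT=1
  export GIT_CONFIG_KEY_0='http.https://github.com/.extraheader'
  export GIT_CONFIG_VALUE_0="AUTHORIZATION: basic $basic"
}

# Example (REPO_URL stays free of credentials):
# set_git_auth_env "$(jq -r .github_token /data/options.json)"
# git clone "$REPO_URL" /path/to/clone
```

The token is still visible to any process that inherits this environment, so this narrows the exposure rather than removing it.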

@niemyjski
Contributor

@ngutman this pertains to the main branch as well (most of it)

Here's a comprehensive security analysis of this add-on:


Security Risk Assessment: OpenClaw Gateway HA Add-on

CRITICAL: Full Read-Write Access to HA Config

The most significant risk is in config.json:

"map": ["config:rw"]

This maps the entire /config directory into the container with read-write access. This means the add-on (and anyone with shell access to it) can read and modify:

| Path | Data Exposed |
|---|---|
| /config/secrets.yaml | All your secrets: API keys, passwords, tokens |
| /config/configuration.yaml | Full HA configuration |
| /config/.storage/auth | Authentication provider data |
| /config/.storage/auth_provider.homeassistant | User password hashes |
| /config/.storage/core.config_entries | All integration credentials (e.g., cloud service tokens, MQTT passwords, smart lock codes) |
| /config/.storage/person | Person/presence data |
| /config/automations.yaml | All automations (can be modified to inject malicious ones) |
| /config/scripts.yaml | All scripts (can be modified) |
| /config/custom_components/ | Custom integrations (can be tampered with) |
| /config/.storage/lovelace* | Dashboard configs |

Can it read HA data? Yes — essentially all of it. The config:rw mapping is the broadest file-system permission an add-on can request. It can read every secret, credential, and configuration file, and it can write to them too (injecting automations, modifying integrations, etc.).


HIGH: Host Network Mode

config.json:

"host_network": true
  • The container shares the host's full network stack — no network isolation
  • Can reach any device/service on your LAN (IoT devices, NAS, routers)
  • Can scan your network, reach the Supervisor's internal Docker network interfaces
  • The SSH server binds directly to the host interface, exposed to your entire LAN (or WAN if port-forwarded)
  • Even though hassio_api: false and homeassistant_api: false prevent the Supervisor from injecting API tokens, with host networking the container could reach the Supervisor API endpoint if it obtains a token from /config/.storage/

HIGH: SSH Root Shell

run.sh starts an OpenSSH server:

  • Runs as root (the Dockerfile never sets a non-root USER)
  • Anyone with an authorized SSH key gets a root shell inside the container
  • That root shell has full access to /config (all HA data) and the host network
  • The SSH server is on the host network, so misconfigured firewall = remote access to all your HA data
  • SSH key material is stored persistently at /config/openclaw/.ssh/

HIGH: Arbitrary Code Execution by Design

run.sh does:

  1. git clone a user-configured repo
  2. pnpm install — executes arbitrary postinstall scripts from npm packages
  3. pnpm build — runs whatever the repo's build script defines
  4. exec openclaw gateway — runs the cloned code as a long-lived process

This means whoever controls the repo_url repo controls your HA host. A compromised GitHub account, a malicious branch, or a supply-chain attack on any npm dependency in that repo leads to full code execution with root + config:rw + host network.


MEDIUM: GitHub Token Exposure

run.sh: The GitHub PAT is read from /data/options.json and exposed as environment variables (GIT_CONFIG_VALUE_0 containing the base64-encoded token). Any process in the container (including all 28+ CLI tools) can read it from /proc/*/environ or by inspecting git config.


MEDIUM: Unverified Binary Downloads

The Dockerfile downloads binaries from the internet without checksum or signature verification:

  • 1Password CLI (Dockerfile L146-L153): Version fetched dynamically from agilebits.com, binary downloaded and extracted — no SHA256 check
  • Himalaya (Dockerfile L190-L196): Downloaded from GitHub releases — no checksum
  • Openhue (Dockerfile L198-L207): Downloaded from GitHub releases — no checksum

A compromised CDN or DNS hijack during the Docker build could inject a trojanized binary.
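
A minimal sketch of checksum verification for a downloaded artifact. The function name `verify_sha256` is hypothetical, and the expected digest would have to be pinned in the Dockerfile from a source the maintainer trusts:

```shell
# Return 0 only if FILE's SHA-256 digest equals EXPECTED (lowercase hex).
# Uses sha256sum when available and falls back to `shasum -a 256`.
verify_sha256() {
  file=$1
  expected=$2
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi
  [ "$actual" = "$expected" ]
}

# Example (URL and digest variable are placeholders):
# curl -fsSL "https://example.com/tool.tar.gz" -o /tmp/tool.tar.gz
# verify_sha256 /tmp/tool.tar.gz "$TOOL_SHA256" || { echo "checksum mismatch" >&2; exit 1; }
```

Pinning a digest only works together with pinning the version, since a floating latest download changes its checksum with every release.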


MEDIUM: Supply Chain — 28+ Third-Party Tools

The image installs tools from many different authors/orgs:

  • 14 Go binaries from various GitHub repos (some @latest = unpinned)
  • npm/pnpm globals from steipete and Google
  • Python packages (homeassistant-cli, uv, nano-pdf, whisper-cli)
  • Any of these could be compromised upstream; steipete tools are all @latest (unpinned)

MEDIUM: homeassistant-cli Installed

Dockerfile: pip3 install homeassistant-cli — this is a CLI tool specifically designed to control Home Assistant via its REST API. Combined with the ability to read long-lived access tokens from /config/.storage/auth, this tool can:

  • Read/write entity states
  • Trigger services (unlock doors, disarm alarms, open garage doors)
  • Read sensor history

LOW: Persistent State in /config

The add-on stores all its data under /config/openclaw/ — this persists across restarts and add-on reinstalls. Cached credentials, cloned repos, npm caches, and SSH keys all survive. Uninstalling the add-on does not clean up this data.


Summary: Attack Surface Diagram

┌─────────────────────────────────────────────────────────┐
│  OpenClaw Add-on Container (runs as root)               │
│                                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │ SSH Server   │  │ Gateway      │  │ 28+ CLI Tools│  │
│  │ (root shell) │  │ (cloned repo)│  │ (go/npm/pip) │  │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘  │
│         │                 │                  │          │
│  ┌──────▼─────────────────▼──────────────────▼────────┐ │
│  │           /config (rw) — ALL HA DATA               │ │
│  │  secrets.yaml │ .storage/* │ automations │ configs  │ │
│  └────────────────────────────────────────────────────┘ │
│                                                         │
│  ┌────────────────────────────────────────────────────┐ │
│  │         host_network: true — FULL LAN ACCESS       │ │
│  │    Can reach: HA core, Supervisor, IoT, router     │ │
│  └────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘

Recommendations to Reduce Risk

  1. Change "map": ["config:rw"] to a dedicated subfolder — e.g., ["config/openclaw:rw"] if HA supports it, or use addon_config instead. This is the single biggest improvement.
  2. Set "host_network": false and explicitly map only the ports needed (18789, 2222).
  3. Drop hassio_api / homeassistant_api comments — they're already false, which is good, but document why.
  4. Add checksum verification for all binary downloads (1Password, himalaya, openhue).
  5. Pin all tool versions — the steipete tools are still @latest.
  6. Run as non-root — add USER node or similar after build steps.
  7. Restrict SSH — consider binding SSH to 127.0.0.1 only, or removing it entirely if not needed.
  8. Scope the GitHub token — use a fine-grained PAT with minimal repo permissions.
