
feat(docker): add Thor (Jetson) Docker image with inference + inference-ros targets #106

Open
kingb wants to merge 1 commit into amazon-far:main from kingb:dev/kingbrnd/thor-docker-inference


kingb commented Apr 28, 2026

Summary

Add docker/thor/ with two parallel multi-stage build targets for running holosoma_inference on Jetson Thor (JetPack 7.1, Ubuntu 24.04 Noble, CUDA 13, aarch64 SBSA):

| Target | Includes | Use case |
| --- | --- | --- |
| inference | policy + unitree_sdk2, no ROS | Joystick / keyboard input, self-contained inference |
| inference-ros | policy + unitree_sdk2 + ROS 2 Jazzy | Ros2Input: subscribe to /cmd_vel from any ROS publisher |

Also adds docker/thor/compose.yaml (with runtime flags + env preset), docker/thor/.env.example, docker/thor/README.md, docker/thor/Makefile (scoped: make inference, make inference-ros, make run-inference ARGS=...), and docker/thor/scripts/run_*.sh launch helpers for common input-mode combinations.

Layer strategy

Stable → volatile, so code edits only rebuild the last layer:

l4t-cuda (CUDA 13 devel, Ubuntu 24.04)           ← never
 └─ python-base (python3.12, build tools, uv)    ← ~never
     │
     ├─ long-deps (NVPL, cuDSS, TensorRT libs)   ← ~never
     │   └─ common-deps (pydantic, scipy, pin…)  ← weekly-ish
     │       └─ app-deps (unitree_sdk2 + src)    ← every commit
     │           └─ inference                    ← terminal
     │
     └─ ros-jazzy (ros-base + FastDDS + CycloneDDS RMWs)  ← ~never
         └─ long-deps-ros (same as long-deps)             ← ~never
             └─ common-deps-ros (same as common-deps)     ← weekly-ish
                 └─ app-deps-ros (unitree_sdk2 + src)     ← every commit
                     └─ inference-ros                     ← terminal

The two branches duplicate the long-deps through app-deps stages because the ROS 2 install mutates apt state. Heavy layers still cache independently per branch, so day-to-day code edits don't rebuild TensorRT or the ROS install.
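To make the rebuild cost concrete, here is a rough Python model of the stage graph above. It is a sketch, not the real Docker cache logic: it tracks invalidation per stage rather than per instruction, and the `PARENT` mapping is transcribed by hand from the diagram.

```python
# Parent of each build stage, transcribed from the layer diagram above.
PARENT = {
    "l4t-cuda": None,
    "python-base": "l4t-cuda",
    "long-deps": "python-base",
    "common-deps": "long-deps",
    "app-deps": "common-deps",
    "inference": "app-deps",
    "ros-jazzy": "python-base",
    "long-deps-ros": "ros-jazzy",
    "common-deps-ros": "long-deps-ros",
    "app-deps-ros": "common-deps-ros",
    "inference-ros": "app-deps-ros",
}


def ancestors(stage):
    """Return the chain of stages from `stage` up to the base image."""
    chain = []
    while stage is not None:
        chain.append(stage)
        stage = PARENT[stage]
    return chain


def stages_to_rebuild(changed, target):
    """List the stages of `target` that rebuild when `changed` stages change.

    Simplified rule: a stage rebuilds if it, or any stage it is built
    FROM, had its inputs change.
    """
    dirty = False
    rebuilt = []
    for stage in reversed(ancestors(target)):  # walk from base to target
        dirty = dirty or stage in changed
        if dirty:
            rebuilt.append(stage)
    return rebuilt


# A source edit touches only the stages that copy src.
source_edit = {"app-deps", "app-deps-ros"}
print(stages_to_rebuild(source_edit, "inference"))
print(stages_to_rebuild(source_edit, "inference-ros"))
```

Under this model a source edit rebuilds only the app-deps stage and the terminal stage of each branch, and a change to ros-jazzy leaves the inference branch untouched.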

DDS coexistence (inference-ros target)

unitree_sdk2's pybind11 wheel bundles CycloneDDS 0.10.2, which is ABI-incompatible with Jazzy's CycloneDDS 0.10.5: C++ template signatures changed, and the symptom is free(): invalid pointer on participant init. Fix:

  • rclpy uses FastDDS (RMW_IMPLEMENTATION=rmw_fastrtps_cpp, set as image ENV default).
  • unitree_sdk2 keeps its bundled CycloneDDS 0.10.2.
  • FastDDS and CycloneDDS have disjoint binary symbol spaces → coexist in one process.
  • Entrypoint prepends /opt/venv/.../unitree_interface/ to LD_LIBRARY_PATH after sourcing ROS so Jazzy's libddsc doesn't hijack unitree's runtime lookup.
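The path-ordering step can be sketched in Python as below. This is an illustration of the idea only, not the actual entrypoint (which is a shell script). The function names and the `/hypothetical/unitree_interface` directory are placeholders; the real directory lives under /opt/venv and is not spelled out here.

```python
def prepend_library_dir(env, lib_dir):
    """Return a copy of `env` with `lib_dir` first on LD_LIBRARY_PATH.

    The dynamic linker searches LD_LIBRARY_PATH left to right, so the
    first directory that holds a matching library name wins.
    """
    new_env = dict(env)
    existing = [
        p for p in env.get("LD_LIBRARY_PATH", "").split(":")
        if p and p != lib_dir  # drop empty entries and avoid duplicates
    ]
    new_env["LD_LIBRARY_PATH"] = ":".join([lib_dir] + existing)
    return new_env


def inference_ros_env(env, unitree_lib_dir):
    """Build the environment for the inference-ros process (sketch)."""
    new_env = prepend_library_dir(env, unitree_lib_dir)
    # Keep a caller-supplied RMW if one is set, else default to FastDDS.
    new_env.setdefault("RMW_IMPLEMENTATION", "rmw_fastrtps_cpp")
    return new_env


# State after sourcing ROS: Jazzy's lib directory is already on the path.
after_ros = {"LD_LIBRARY_PATH": "/opt/ros/jazzy/lib"}
env = inference_ros_env(after_ros, "/hypothetical/unitree_interface")
print(env["LD_LIBRARY_PATH"])
print(env["RMW_IMPLEMENTATION"])
```

The resulting mapping has to be handed to a new process (for example via `subprocess.run(..., env=env)`), because the linker reads LD_LIBRARY_PATH at process start. That is why the ordering is done in the entrypoint rather than inside the policy code.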

Cross-vendor DDS interop (CycloneDDS publisher → FastDDS subscriber) is hardware-validated for TwistStamped on /cmd_vel when both sides are on the same ROS 2 distro (Jazzy-to-Jazzy tested). See scripts/run_shuttle_publisher_cyclonedds.sh for the test.

Base image / pinning

  • Base: nvcr.io/nvidia/cuda:13.0.2-devel-ubuntu24.04 (matches JetPack 7.1).
  • Python: 3.12 (Noble native).
  • ROS 2: Jazzy from packages.ros.org.
  • unitree_sdk2: 0.1.3 via ARG UNITREE_SDK2_VERSION=0.1.3, fetched from github.com/amazon-far/unitree_sdk2 release assets.
  • Related: depends on the setup.py cp tag fix PR for native installs (Docker fetches the wheel directly, so technically not a blocker).

Scope

Not included in this first pass:

  • ZED SDK / pyzed — deferred; not needed for the blind-locomotion policy target.
  • PyTorch — ONNX Runtime handles policy inference; torch would add ~3 GB for no benefit here.
  • Any depth-perception / image-server pipeline — out of scope for this policy target.

Known tradeoff: Python deps duplicated between Dockerfile and setup.py

The common-deps stage duplicates the install_requires list from src/holosoma_inference/setup.py verbatim. Intentional — buys layered caching so day-to-day code edits only invalidate the final app-deps layer (~seconds) instead of re-running pip install (~30-60 s on aarch64). setup.py remains the source of truth; the Dockerfile comment calls this out as drift risk.

Alternatives considered and rejected for v1:

  • uv pip install -e .[unitree,booster] in a single stage — simpler, but every code change re-installs ~20 packages.
  • Call scripts/setup_inference_via_uv.sh inside the build — same single-layer cache cost, plus the script does laptop-oriented things (Ubuntu detection, sudo nvpmodel, etc.) that don't apply in a container.

Happy to switch if reviewers prefer single-source-of-truth over cache speed.
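If the duplication stays, the drift risk could be reduced with a small check along these lines. This is a sketch, not part of this PR: it assumes setup.py passes `install_requires` as a literal list, and the example package list is made up for illustration.

```python
import ast
import re


def install_requires_from_setup(setup_source):
    """Extract the literal install_requires list from setup.py source text.

    The source is parsed, never executed.
    """
    tree = ast.parse(setup_source)
    for node in ast.walk(tree):
        if isinstance(node, ast.keyword) and node.arg == "install_requires":
            return list(ast.literal_eval(node.value))
    raise ValueError("no literal install_requires= keyword found")


def package_name(requirement):
    """Reduce a requirement such as 'SciPy>=1.10' to the bare name 'scipy'."""
    name = re.split(r"[<>=!~;\[\s]", requirement.strip(), maxsplit=1)[0]
    return re.sub(r"[-_.]+", "-", name).lower()


def dependency_drift(setup_source, docker_requirements):
    """Compare package names in setup.py with those installed in the Dockerfile."""
    declared = {package_name(r) for r in install_requires_from_setup(setup_source)}
    installed = {package_name(r) for r in docker_requirements}
    return {
        "missing_from_dockerfile": sorted(declared - installed),
        "extra_in_dockerfile": sorted(installed - declared),
    }


# Hypothetical inputs; the real lists live in setup.py and the Dockerfile.
example_setup = 'setup(name="demo", install_requires=["pydantic>=2", "scipy", "onnxruntime"])'
print(dependency_drift(example_setup, ["pydantic>=2", "scipy"]))
```

The check compares package names only and ignores version specifiers, so a pin that differs between the two files would still pass. A CI job could run it and fail when either list in the result is non-empty.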

Test plan

  • docker build --target inference on Jetson Thor (aarch64 native) — builds cleanly.
  • docker build --target inference-ros — builds cleanly.
  • docker compose run --rm inference --help + inference-ros --help both start run_policy.py, register policy configs, no import errors.
  • run_joystick.sh (joystick+joystick) on real G1 — robot walks responsively.
  • run_ros2_joystick.sh + run_shuttle_publisher.sh (both FastDDS) on real G1 — robot shuttles forward/back correctly.
  • run_ros2_joystick.sh + CycloneDDS-based shuttle publisher on real G1 — robot shuttles. Cross-vendor DDS works.

Image sizes

  • inference: ~17.9 GB uncompressed (~6.85 GB compressed on disk).
  • inference-ros: ~18.4 GB uncompressed (~6.95 GB compressed).

Size is dominated by NVIDIA TensorRT runtime libs (~2.3 GB) and CUDA devel base — unavoidable for the platform.

kingb requested a review from tomasz-lewicki on April 28, 2026 23:04
kingb changed the title from "feat(docker): add Thor (Jetson AGX) Docker image with inference + inference-ros targets" to "feat(docker): add Thor (Jetson) Docker image with inference + inference-ros targets" on Apr 28, 2026
…ce-ros targets

Adds docker/thor/Dockerfile with two multi-stage parallel branches:

- `inference`: no-ROS policy + unitree_sdk2. Smaller image for joystick/keyboard
  input or users bringing their own velocity source.
- `inference-ros`: adds ros-jazzy-ros-base + rmw-cyclonedds-cpp for Ros2Input
  (subscribing to /cmd_vel from Nav2, for example).

Layer ordering optimizes for cache hits on code edits: CUDA base → python-base
→ long-deps (NVPL/cuDSS/TensorRT) → common-deps (pinocchio/scipy/etc.) →
app-deps (unitree wheel + COPY src) → terminal. Source is COPY'd last so
day-to-day changes only rebuild the final layers.

Platform: Jetson Thor, JetPack 7.1, Ubuntu 24.04 aarch64, CUDA 13.
Base image: nvcr.io/nvidia/cuda:13.0.2-devel-ubuntu24.04.

Includes:
- docker/thor/compose.yaml — Docker Compose with runtime flags preset
  (--runtime nvidia, host net/ipc, --privileged, CycloneDDS env for the ROS target)
- docker/thor/.env.example — MODEL_PATH override
- docker/thor/README.md — build/run commands, layer diagram, troubleshooting
- docker/thor/Makefile — scoped shortcuts: `make inference`, `make run-inference ARGS='...'`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kingb force-pushed the dev/kingbrnd/thor-docker-inference branch from 5a7033c to 56c70a2 on April 28, 2026 23:17