Skip to content

improve: stricter input validation and default loopback in gr00t_inference#90

Open
cagataycali wants to merge 2 commits intostrands-labs:mainfrom
cagataycali:improve/groot-input-validation
Open

improve: stricter input validation and default loopback in gr00t_inference#90
cagataycali wants to merge 2 commits intostrands-labs:mainfrom
cagataycali:improve/groot-input-validation

Conversation

@cagataycali
Copy link
Copy Markdown
Member

Summary

Hardens the gr00t_inference tool with comprehensive input validation, a safer default bind address, and more precise process management.

Changes

1. Input validation for all user-supplied parameters

  • data_config and embodiment_tag: Validated against strict ^[a-z][a-z0-9_]+$ patterns — these are enumerable values from data_configs.json.
  • checkpoint_path and trt_engine_path: Reject shell metacharacters (;, |, $, backticks, etc.), null bytes, and .. path traversal.
  • container_name: Validated against Docker's container naming rules.
  • dtype values (vit_dtype, llm_dtype, dit_dtype): Checked against explicit allowlists (fp16, fp8, nvfp4).
  • Port range: Validated 1–65535.

2. Default host changed from 0.0.0.0127.0.0.1

Inference services should not bind to all interfaces by default. Users can still explicitly pass host='0.0.0.0' when remote access is needed.

3. Process verification in stop action

  • Added _is_gr00t_process() that inspects /proc/<pid>/cmdline to confirm a PID belongs to a GR00T inference process before sending signals.
  • Host-system fallback now uses pgrep -f inference_service.py (matches the actual process) instead of lsof -t -i:<port> (which matches any process on that port).

Motivation

As an agent tool, gr00t_inference receives string parameters that ultimately get interpolated into subprocess.run() calls and Docker exec commands. Validating these inputs up front is both a robustness improvement (fail fast with clear errors) and defense-in-depth against accidental misuse.

…rence

Improvements to the gr00t_inference tool:

1. Input validation for all user-supplied parameters:
   - data_config and embodiment_tag validated against strict alphanumeric
     patterns (they are enumerable values from the docstring).
   - checkpoint_path and trt_engine_path reject shell metacharacters,
     null bytes, and '..' traversal components.
   - container_name validated against Docker naming rules.
   - dtype values checked against explicit allowlists.
   - Port range validated (1-65535).

2. Default host changed from 0.0.0.0 to 127.0.0.1 (loopback):
   - Inference services should default to localhost-only binding.
   - Users can still explicitly pass host='0.0.0.0' when network
     access is needed.

3. Process verification for stop action:
   - Added _is_gr00t_process() to verify a PID belongs to a GR00T
     inference process before sending signals.
   - Host-system fallback now uses pgrep -f with the inference_service
     pattern instead of lsof (which matches any process on the port).
@cagataycali cagataycali requested a review from mrgh-test April 1, 2026 22:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

1 participant