Skip to content

[eval callback]Refactor eval callbacks into grid-based multi-condition sweep architecture#101

Open
Juyue wants to merge 3 commits into
mainfrom
dev/jchen/grid_eval_callbacks
Open

[eval callback]Refactor eval callbacks into grid-based multi-condition sweep architecture#101
Juyue wants to merge 3 commits into
mainfrom
dev/jchen/grid_eval_callbacks

Conversation

@Juyue
Copy link
Copy Markdown
Contributor

@Juyue Juyue commented Apr 20, 2026

Replace single-env push/payload callbacks with grid evaluation system that sweeps across multiple conditions in parallel. Velocity is always the base axis; either payload or push can be cross-producted with it (but not both, since they share the IsaacSim external force API).

Key changes:

  • GridConditionManager: unified add_axis() API for Cartesian product of sweep axes, shared across all callbacks
  • EvalRecordingCallback: single recorder with ensure_finalized() pattern for order-independent deferred setup
  • GridEvalVelocityCallback: independent velocity sweep (lin_x/y, ang_yaw)
  • GridEvalPayloadCallback: body-group x mass sweep with per-substep force injection via monkey-patched physics step
  • GridEvalPushCallback: body x direction x gait_phase x force sweep with GaitAnalyser for phase-triggered pushes
  • GaitAnalyser: standalone gait cycle detection from foot heights
  • sim_utils: resolve_body(), apply_body_force_world(), clear_body_force()
  • Config validation via @model_validator (body/label lengths, directions, gait phases, mutual exclusion of payload+push)

Minor:

  • Update src/holosoma/holosoma/config_values/loco/g1/reward.py to reduce visual noise.

…cture

Replace single-env push/payload callbacks with grid evaluation system
that sweeps across multiple conditions in parallel. Velocity is always
the base axis; either payload or push can be cross-producted with it
(but not both, since they share the IsaacSim external force API).

Key changes:
- GridConditionManager: unified add_axis() API for Cartesian product
  of sweep axes, shared across all callbacks
- EvalRecordingCallback: single recorder with ensure_finalized()
  pattern for order-independent deferred setup
- GridEvalVelocityCallback: independent velocity sweep (lin_x/y, ang_yaw)
- GridEvalPayloadCallback: body-group x mass sweep with per-substep
  force injection via monkey-patched physics step
- GridEvalPushCallback: body x direction x gait_phase x force sweep
  with GaitAnalyser for phase-triggered pushes
- GaitAnalyser: standalone gait cycle detection from foot heights
- sim_utils: resolve_body(), apply_body_force_world(), clear_body_force()
- Config validation via @model_validator (body/label lengths, directions,
  gait phases, mutual exclusion of payload+push)
@Juyue Juyue requested a review from samuelgundry April 20, 2026 19:02
Copy link
Copy Markdown
Contributor

@samuelgundry samuelgundry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part-way through, up to src/holosoma/holosoma/agents/callbacks/grid_eval_payload.py

Comment on lines +19 to +26
def _require_recording_cb(self) -> Any:
"""Find and return the EvalRecordingCallback, or raise."""
from holosoma.agents.callbacks.recording import EvalRecordingCallback

for cb in self.training_loop.eval_callbacks:
if isinstance(cb, EvalRecordingCallback):
return cb
raise RuntimeError(f"{type(self).__name__} requires EvalRecordingCallback. Set --recording.config.enabled=True")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there ever be a second eval CB? defensively, it can help to check here and error if so.


self._left_foot_z_range = (np.min(left_z, axis=0), np.max(left_z, axis=0))
self._right_foot_z_range = (np.min(right_z, axis=0), np.max(right_z, axis=0))
self._done = True
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps set done at the end of the function, just in case of confusion w.r.t debugging

Comment on lines +48 to +51
def add_axis(
self,
name: str | list[str],
values: list,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens for duplicate axis names? "add" vs. "register". Good to document the behavior/expectations. You append below without checking but idk (yet) if you expect a single name.

edit: i see below you de-dupe. Good to document then, especially for callers -- right now, there's no way to know a duplication happened and the match is by key only? not values

unique: list[dict[str, Any]] = []
for cond in self.conditions:
key = tuple(sorted(cond.items()))
if key not in seen:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect at the least you'll want to log duplicates, but see above re: comparing keys vs. values, and what callers can/can't do if there are duplicates with different values

Comment on lines +152 to +163
def get_metadata(self) -> dict[str, Any]:
"""Return condition metadata for NPZ recording.

``grid_conditions`` uses hierarchical dicts grouped by the ``group``
parameter passed to ``add_axis()``. Ungrouped keys stay at
the top level.

Example condition::

{'velocity': {'lin_vel_x': 0.5, 'lin_vel_y': 0.0, 'ang_vel_yaw': 0.0},
'push': {'body_label': 'torso', 'direction': 'forward', 'force_n': 150.0}}
"""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know our current position towards unit testing, but this kind of logic is a good reason to add them, especially for future regressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants