[FEATURE] Add support of visual-mesh to camera and raycasting-based sensors. by Kashu7100 · Pull Request #2769 · Genesis-Embodied-AI/genesis-world

Kashu7100 · 2026-05-09T19:53:59Z

Summary

Lets depth cameras and lidars cast against the visual mesh of a KinematicEntity, instead of the (often non-existent) collision hull.

Material flag

gs.materials.Kinematic(use_visual_raycasting=True) opts an entity into the visual BVH path. The flag must be set before scene.build() because the BVH is sized at build time.

Cross-solver raycast pipeline

base_sensor.RigidSensorMixin records, per sensor, the link's owning solver and link index. _gather_sensor_link_poses resolves link transforms across the rigid and kinematic solvers, so raycasters attached to kinematic entities resolve correctly even when the rigid solver is the primary BVH owner.
RaycasterSensor builds a "visual" BVH per solver that has any opted-in visual-raycasting entity, runs FK on visual verts where needed, and merges per-solver hits with a NaN-safe distance-min kernel.
Raycaster options override validate_scene to also accept KinematicEntity. The rigid mixin still rejects it for IMU/Contact/etc. — those sensors don't have cross-solver wiring.
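The NaN-safe distance-min merge mentioned in the second bullet can be illustrated with a small pure-Python sketch. It assumes a miss is encoded as NaN in each per-solver distance list, which is an assumption made here for illustration. The function name `merge_hits` is hypothetical and is not the kernel in `raycast_qd.py`.

```python
import math


def merge_hits(dist_a, dist_b):
    """Element-wise min of two per-ray distance lists, treating NaN as 'no hit'."""
    merged = []
    for a, b in zip(dist_a, dist_b):
        if math.isnan(a):
            merged.append(b)          # only b can be a hit (or both missed)
        elif math.isnan(b):
            merged.append(a)          # only a is a hit
        else:
            merged.append(min(a, b))  # both hit: keep the closer surface
    return merged


# Hypothetical per-ray distances from two solvers' BVHs.
rigid_hits = [1.5, float("nan"), 3.0, float("nan")]
kinematic_hits = [2.0, 0.7, float("nan"), float("nan")]
merged = merge_hits(rigid_hits, kinematic_hits)
```

The explicit NaN checks matter because a plain `min` is order-dependent when one operand is NaN, so a miss in one solver could otherwise mask a valid hit in the other.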

Visual-vert plumbing

array_class exposes VVertsState; KinematicSolver now owns vverts_state.
kernel_update_all_vverts transforms init-pos vverts by their vgeom's pose into world-space vverts_state.pos for FK-driven entities.
For entities with has_custom_vverts (set via set_vverts from the deformable PR), kernel_copy_custom_vverts bypasses FK and copies the user's vertex buffer straight into vverts_state. Non-opted-in entities have their range pushed to a far-away invalidation sentinel so the BVH naturally skips them.
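As a rough illustration of the FK step described above (init-pos vverts transformed by their vgeom's pose), here is a pure-Python sketch. It is not the `kernel_update_all_vverts` kernel itself: the helper names are hypothetical, and the quaternion layout `(w, x, y, z)` with unit norm is an assumption.

```python
import math


def quat_rotate(q, v):
    """Rotate 3-vector v by unit quaternion q = (w, x, y, z) (assumed layout)."""
    w, x, y, z = q
    # t = 2 * cross(q_xyz, v)
    tx = 2.0 * (y * v[2] - z * v[1])
    ty = 2.0 * (z * v[0] - x * v[2])
    tz = 2.0 * (x * v[1] - y * v[0])
    # v' = v + w * t + cross(q_xyz, t)
    return (
        v[0] + w * tx + (y * tz - z * ty),
        v[1] + w * ty + (z * tx - x * tz),
        v[2] + w * tz + (x * ty - y * tx),
    )


def update_vverts(init_pos, vgeom_idx, vgeom_pos, vgeom_quat):
    """World position of each visual vertex: rotate by its vgeom, then translate."""
    world = []
    for p, g in zip(init_pos, vgeom_idx):
        r = quat_rotate(vgeom_quat[g], p)
        t = vgeom_pos[g]
        world.append((r[0] + t[0], r[1] + t[1], r[2] + t[2]))
    return world


s = math.sqrt(0.5)
world = update_vverts(
    init_pos=[(1.0, 0.0, 0.0)],
    vgeom_idx=[0],
    vgeom_pos=[(1.0, 2.0, 3.0)],
    vgeom_quat=[(s, 0.0, 0.0, s)],  # 90 degrees about +z
)
```

With a 90 degree rotation about +z, the local point `(1, 0, 0)` rotates to `(0, 1, 0)` and lands at approximately `(1, 3, 3)` after translation.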

Static sensors

The static-sensor (entity_idx=-1) branch now honours user-provided pos_offset / euler_offset (the raycaster bakes these into ray_starts at build time).

Dependency

Depends on #2768 (Per-frame visual vertex deformation). The SMPL depth-camera example (examples/sensors/depth_camera_custom_vverts.py) calls entity.set_vverts(...), and the raycast pipeline reuses has_custom_vverts / _custom_vverts to bypass FK for skinned meshes. Merge order: #2768 → this PR.

Relationship to the previous PR

This PR is the raycast half of #2721. Both halves were originally bundled in #2721; splitting eases review.

Files changed

genesis/engine/materials/kinematic.py — adds use_visual_raycasting: bool field.
genesis/engine/entities/rigid_entity/rigid_entity.py — _use_visual_raycasting field, property + post-build setter guard, switches set_vverts lazy alloc to use the raycast invalidation sentinel.
genesis/utils/array_class.py — adds StructVvertsState, get_vverts_state, VVertsState data manager hookup.
genesis/utils/raycast_qd.py — visual-vert FK kernel, custom-vverts copy/invalidation kernels, AABB build, ray-cast kernel for visual BVH, multi-solver hit merge.
genesis/engine/solvers/rigid/abd/forward_kinematics.py — kernel_update_all_vverts (vgeom-pose → vvert world position).
genesis/engine/solvers/kinematic_solver.py — exposes vverts_state on the solver instance.
genesis/engine/sensors/base_sensor.py — per-sensor _sensor_link_solvers / _sensor_link_indices, static-sensor branch now honours pos_offset / euler_offset.
genesis/engine/sensors/raycaster.py — visual BVH build, multi-solver merge, per-frame FK / custom-vverts blit, NaN-safe no_hit_value guard.
genesis/options/sensors/options.py — Raycaster.validate_scene override that also accepts KinematicEntity.
examples/sensors/depth_camera_custom_vverts.py — deforming kinematic sphere + static box, ground-truth vs depth-camera renders side-by-side.

Test plan

python examples/sensors/depth_camera_custom_vverts.py -v renders a deforming sphere (opted-in) and a static box (not opted-in, invisible to depth cam) — depth camera shows only the sphere
python examples/sensors/depth_camera_custom_vverts.py -v -B 4 works batched
Existing depth camera / raycaster sensors against rigid bodies unchanged when no entity opts into use_visual_raycasting
Static sensors (entity_idx=-1) now honour pos_offset / euler_offset (regression of pre-PR behaviour)
IMU / Contact / ContactForce still reject KinematicEntity at validation (unchanged from main)

🤖 Generated with Claude Code

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2c64062cad

duburcqa · 2026-05-12T19:37:43Z

Plan — engine-owned visual vertex buffer (consolidates PR #2768 + PR #2769)

Status: Phase 0 + Phase 1 committed locally; Phase 2 pending design correction (see §1.3, §3)
Branch: vverts-consolidation in worktree /tmp/vverts-consolidation, off upstream/main (fe76e6b9)
Replaces: PR #2768 (closed) — Per-frame visual vertex deformation for KinematicEntity
Subsumes: PR #2769 (open) — Visual-mesh raycasting for KinematicEntity (depth camera, lidar)

1. Context and history

1.1 The two PRs we're consolidating

Two pull requests cover halves of the same feature:

#2768 (Kashu, closed), "[FEATURE] Per-frame visual vertex deformation for KinematicEntity": lets users overwrite a kinematic entity's visual vertices per frame, for externally-skinned meshes like SMPL. The original design added per-entity numpy buffers (_custom_vverts, _custom_vverts_dirty) on KinematicEntity and copied them to GL via a per-frame dirty-handshake with the rasterizer.
#2769 (Kashu, open), "[FEATURE] Add support of visual-mesh to camera and raycasting-based sensors.": lets depth cameras and lidars raycast against the visual mesh of a kinematic entity. Reuses #2768's _custom_vverts from a Quadrants kernel (kernel_copy_custom_vverts) to populate a separate solver-side buffer (vverts_state) that the raycast BVH consumes.

Both halves want the same data: per-frame world-space visual vertex positions for kinematic entities. #2768 stored them on the entity. #2769 added a parallel solver-side buffer and copied between them.

1.2 What was done on the now-closed PR #2768 (and our review of it)

The PR went through a long refactor cycle. Documenting each step because the lessons drove the redesign:

Round 0 — original PR as Kashu submitted it.

KinematicEntity._custom_vverts: np.ndarray | None + _custom_vverts_dirty: bool as private state on the entity.
Public API on the entity: set_vverts(vverts, envs_idx=None), has_custom_vverts property, clear_vverts().
Rasterizer accessed entity._custom_vverts and entity._custom_vverts_dirty directly across the module boundary.
RasterizerContext.skinned_nodes: dict[(env_idx, geom.uid), Node] for per-env GL nodes.
_update_rigid_custom_vverts(entity, geoms) helper that handled the migration forward (replace instanced node with per-env nodes) and the GL upload in one block.
Used detach().cpu().numpy() and np.asarray(envs_idx, dtype=int) — non-conforming with Genesis conventions.
Batched / non-batched code branch inside set_vverts (if self._solver.n_envs == 0: ... else: ...).
Migration forward only: clear_vverts flipped the dirty flag but the per-env GL nodes stayed; no path back to the instanced rigid_node.
No unit tests. Example examples/rendering/custom_visual_mesh.py existed but was not in CI.

Round 1 — moved state to RasterizerContext.
Our first instinct was that visual-vert overrides are rendering concerns, so we moved everything off the entity:

Added 4 collections on RasterizerContext: deformed_nodes, _rigid_entity_vverts, _rigid_entity_vverts_dirty, _entities_with_deformed_nodes.
Renamed set_vverts (entity) → Visualizer.custom_rigid_entity_vverts (visualizer-side public API).
Added migration back (when override is cleared, tear down per-env nodes and re-add instanced rigid_node).
Added Plane refusal at API entry.
Introduced broadcast_array + _expand_shape + _raise_broadcast_shape_error in genesis/utils/misc.py (numpy counterpart of broadcast_tensor, shared _expand_shape helper).
Added parametrized unit tests in tests/test_render.py.
Wired example into CI via tests/test_examples.py ALLOW_PATTERNS.

Round 2 — naming + helpers cleanup, multiple iterations.

Renamed skinned_nodes → custom_vverts_nodes (was deformed_nodes mid-flight).
Renamed public method: custom_rigid_entity_vverts → custom_kinematic_entity_vverts (entity is kinematic, not strictly rigid) → eventually set_custom_kinematic_entity_vverts (verb + noun pattern matching set_pos, set_qpos).
Consolidated the 4 collections into 2: custom_vverts_nodes + a single _custom_vverts: dict[uid, _CustomVverts] where _CustomVverts(buffer, active, dirty) is a dataclass packing all per-entity state.
Inlined add_skinned_node / add_custom_vverts_node (single-use helpers).
Then re-extracted add_rigid_node → add_geom_node (3 call sites: on_rigid, migrate-back, migrate-forward) so the seg-key + node creation logic isn't duplicated.
Inlined _seg_key_for_geom back into add_geom_node (single call site after the refactor).

Round 3 — realization that the entire architectural direction was wrong.
After looking at PR #2769's body more carefully (specifically kernel_copy_custom_vverts reading entity._custom_vverts from a Quadrants kernel, and the raycast pipeline depending on the same buffer):

The visualizer-only move was wrong. Sensors don't go through the visualizer (and shouldn't have to — they need to work headlessly).
The original entity-side buffer was also wrong. It was a parallel ground truth that #2769 had to copy into a separate vverts_state solver buffer, creating two sources of truth.
The correct architecture is single engine-side ground truth (vverts_state on the solver) that FK populates by default and set_vverts overwrites in place. Renderer and raycaster both read it. No parallel buffers, no opt-in flag, no notification hooks between modules.

Kashu closed #2768 at this point. The work consolidates into one PR built around #2769's vverts_state architecture.

1.3 Lessons that drove the design (current cut)

Lesson	What it forced
Visualizer-only ownership broke sensors.	Move state to the engine (solver).
Entity-side parallel buffer forced an extra copy kernel in #2769.	Eliminate the parallel buffer; write directly into the solver's `vverts_state`.
`has_custom_vverts` + `_custom_vverts_dirty` + `set/clear/has` API was over-surfaced.	Reduce to `(set|get)_vverts`.
Per-env normal recompute for deformed meshes is its own design problem.	Defer to a follow-up; in this PR, handle normals the same way the collision path does.
Per-frame upload cache via `(data_ptr, _version)` is a real optimization but its right shape depends on actual usage.	Defer to a follow-up; in this PR, always re-upload on the per-env visual path.

1.4 Design pivot during Phase 1 (after Phase 0+1 implementation)

Mid-implementation we discovered the original assumption "FK runs unconditionally every step, user write wins because it's last" doesn't work in Genesis:

The Genesis convention is that setters are called before step(), not after.
Sensors update inside step(). They read vverts_state during the step.
For a user set_vverts call to reach sensors, the value must survive step().
If FK populates vverts_state automatically during step (or in update_vgeoms pre-render), it clobbers the user's data.

So the previous "no override tracking on either solver or renderer" rule is wrong. The correct design needs per-entity tracking so FK can skip user-driven entities. Specifically, it needs what PR #2769 originally proposed in its own description:

kernel_update_all_vverts transforms init-pos vverts by their vgeom's pose into world-space vverts_state.pos for FK-driven entities. For entities with has_custom_vverts (set via set_vverts), kernel_copy_custom_vverts bypasses FK and copies the user's vertex buffer straight into vverts_state.

We rejected this earlier in the design discussion (Round 2 Q1/Q2) on the basis of "no override notion outside the rasterizer". That was wrong: the solver-side branching is intrinsic to making the feature work at all, given the user-write → step → sensor-read ordering.

Concrete pivot:

Extend vverts_info with a per-vvert is_custom flag (gs.qd_int, shape (n_vverts_,)).
FK kernel (kernel_update_all_vverts) reads is_custom: skips entries where it's 1.
set_vverts flips the flag to 1 for the entity's vvert range AND writes the buffer.
set_vverts(None) flips the flag back to 0; FK takes over again next time it runs.
Renderer keeps a fast path for entities with all is_custom == 0: existing instanced Mesh.from_trimesh(poses=geom_T). Per-env vverts-buffer path only for entities where any vvert is custom.
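The pivot above can be modelled with a toy, unbatched solver to show why the flag lets a user write survive `step()`. This is only a sketch under simplifying assumptions: positions are scalars rather than 3-vectors, "FK" is a running offset, and `ToySolver` is a hypothetical name, not Genesis code.

```python
class ToySolver:
    """Unbatched toy model: one flag and one scalar position per visual vertex."""

    def __init__(self, init_pos):
        self.init_pos = list(init_pos)
        self.pos = list(init_pos)
        self.is_custom = [0] * len(init_pos)
        self.offset = 0.0  # stand-in for the vgeom pose

    def set_vverts(self, start, end, vverts):
        if vverts is None:
            # Clear: hand the range back to FK without touching pos.
            self.is_custom[start:end] = [0] * (end - start)
            return
        self.pos[start:end] = vverts
        self.is_custom[start:end] = [1] * (end - start)

    def step(self):
        self.offset += 1.0
        for i, p in enumerate(self.init_pos):
            if self.is_custom[i]:
                continue  # user-driven vertex: FK must not overwrite it
            self.pos[i] = p + self.offset


solver = ToySolver([0.0, 10.0, 20.0])
solver.set_vverts(1, 3, [-1.0, -2.0])  # setter called before step()
solver.step()
after_set = list(solver.pos)
solver.set_vverts(1, 3, None)          # FK reclaims the range
solver.step()
after_clear = list(solver.pos)
```

After the first step only vertex 0 follows FK, so `after_set` is `[1.0, -1.0, -2.0]`. Once the range is cleared, the next step recomputes every vertex and `after_clear` is `[2.0, 12.0, 22.0]`.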

This re-introduces the "per-entity migration" complexity in the rasterizer (instancing-vs-per-env) we tried to avoid, but explicitly — the user signs off on this trade-off as preferable to the alternative of "every kinematic entity always goes through the per-env path" which would regress the common-case render performance.

2. Current state of `main`

What exists in main today (as of the last fetch — upstream/main at the head of the imgui overlay merge):

Component	Status
`vverts_info` (topology: `init_pos`, `init_vnormal`, `vgeom_idx`), shape `(n_vverts_,)`	Exists
`vverts_state` (per-frame, batched)	Missing
FK kernel for visual vertices (`kernel_update_all_vverts` or equivalent)	Missing
Visual rendering path	Instanced `pyrender.Mesh.from_trimesh(poses=geom_T)` over `init_pos`
Collision / sdf rendering	Same instancing pattern
`(set|get)_vverts` public API at any level	Missing
Raycaster against visual mesh	Missing (collision hulls only)
`broadcast_tensor` helper in `genesis/utils/misc.py`	Exists
`broadcast_array` (the numpy counterpart we added on the closed branch)	Not in main, was on the closed PR branch only

The closed PR's branch (kashu/feat-deformable-vmesh at the last force-push, ea7b1103) still exists on Kashu's fork but is no longer connected to a live PR. It contains:

All the rasterizer-side state and dataclass
Visualizer.set_custom_kinematic_entity_vverts API
broadcast_array + _expand_shape + _raise_broadcast_shape_error helpers in misc.py
Parametrized unit tests in tests/test_render.py
examples/rendering/custom_visual_mesh.py wired into CI

For this consolidation we start fresh from upstream/main. Some elements from the closed PR's branch are worth pulling forward (see §8 "What to bring forward from the closed branch"); most should be left behind because they're predicated on the wrong architecture.

3. Design constraints (and the reasoning behind each)

Locked-in constraints, with the rationale that led to each:

3.1 Engine-owned single source of truth

Constraint. vverts_state lives on KinematicSolver. Every consumer (renderer, raycaster, future: contact-on-visual-mesh, debug overlays, etc.) reads from it directly.

Why. Both #2768 (renderer) and #2769 (raycaster) consume the same data. Putting it on the entity forced #2769 to add a copy kernel. Putting it on the visualizer would have made sensors depend on the visualizer being built (breaks headless setups). The engine is the natural owner because sensors and the renderer are both engine-state consumers.

3.2 Match the existing collision verts split

Constraint. Split into vverts_info (topology, unbatched, already exists) and a new vverts_state (per-frame, batched). Mirror the rigid solver's verts_info / verts_state pattern field-for-field.

Why. Consistency. The codebase already has the static-info / per-frame-state split for collision vertices. Replicating the pattern means the same naming, the same kernel registration boilerplate, the same idioms for downstream consumers. Anything else creates a one-off pattern that surprises readers.

3.3 Minimal `vverts_state` fields

Constraint. vverts_state.pos: gs.qd_vec3, (n_vverts_, B). No other fields.

Why. YAGNI. The rigid solver's verts_state may have additional fields (e.g. pos_grad for autodiff); we don't need them now. Adding fields preemptively bloats the substrate without a concrete user. We add them when an actual feature needs them.

3.4 No user-facing opt-in flag

Constraint. No use_custom_vverts: bool on material or morph. No _use_visual_raycasting flag either (was in #2769's original sketch). Calling set_vverts(...) is the only signal.

Why. Every kinematic entity already has visual vertices. The question "should this entity participate in the visual-vverts pipeline" has the same answer for all of them: yes. A flag would let users opt out, but opting out doesn't save anything meaningful (FK runs anyway, and the renderer needs vverts to draw). A flag is friction without a payoff.

3.5 "Is custom" tracking + the whole of `vverts_info` gated by `batch_vverts_info`

Constraint. A new KinematicOptions.batch_vverts_info: bool = False flag controls whether the entire vverts_info struct is batched per-env (matching the existing batch_links_info / batch_dofs_info / batch_joints_info conventions — the option toggles the whole struct, not individual fields). When batched, every vverts_info field gains a trailing B dimension; the FK kernel and any consumers index per (vvert, env) instead of per vvert.

vverts_info gains a new is_custom: gs.qd_int field (shape follows the gate). set_vverts(vverts, envs_idx) flips it to 1 for the affected cells; set_vverts(None, envs_idx) flips back to 0. FK (kernel_update_all_vverts) reads is_custom and skips entries flagged as user-driven.

When batch_vverts_info=False, set_vverts(..., envs_idx) with a non-None / non-all envs_idx raises ("requires batch_vverts_info=True for partial envs_idx").

Why. With Genesis's setter-before-step convention and sensors-in-step, FK cannot run unconditionally — the user's write would be clobbered before sensors read it. FK must skip user-driven entries. The tracking lives in vverts_info (engine side) so both FK and the renderer can read the same flag without further coupling.

Why the whole struct, not just is_custom. Per-env user override is only meaningful if the underlying init_pos / init_vnormal / vgeom_idx can also vary per env (e.g. different SMPL body shapes or different rest poses across envs). Decoupling is_custom's batching from the rest of the struct creates a half-batched mode that has no real use case. Following the Genesis convention (whole-struct gate) keeps the API and the kernel branching simple.

Why the gate. Batched-per-env costs n_vverts_ * B * field_size bytes — non-trivial for large meshes × many envs. Users who don't need per-env customisation shouldn't pay it. This matches how other *_info structs are batched conditionally elsewhere.
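To make the cost of the gate concrete, the formula can be evaluated for the four `vverts_info` fields. The byte sizes below are assumptions (12 bytes for a `gs.qd_vec3` of three 32-bit floats, 4 bytes for `gs.qd_int`), and `vverts_info_bytes` is a hypothetical helper.

```python
# Assumed per-element sizes in bytes; the real sizes depend on the configured precision.
FIELD_BYTES = {"init_pos": 12, "init_vnormal": 12, "vgeom_idx": 4, "is_custom": 4}


def vverts_info_bytes(n_vverts, n_envs, batched):
    """Total size of vverts_info: one copy when unbatched, one per env when batched."""
    per_vvert = sum(FIELD_BYTES.values())
    return n_vverts * per_vvert * (n_envs if batched else 1)


unbatched = vverts_info_bytes(30_000, 64, batched=False)
batched = vverts_info_bytes(30_000, 64, batched=True)
```

Under these assumptions a 30k-vertex mesh costs 960,000 bytes unbatched and 61,440,000 bytes when batched over 64 envs, which is the overhead the default `batch_vverts_info=False` avoids.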

3.6 Renderer fast path for non-custom entities

Constraint. For an entity whose vverts are entirely non-custom (the common case for robots, fixed kinematic entities, etc.), the renderer keeps the existing instanced Mesh.from_trimesh(poses=geom_T) path. Only entities with any custom vvert go through the per-env vvert-buffer path.

Why. Instancing uploads per-env transforms (small) instead of per-env vertex buffers (potentially large). For a robot with 30k vverts × 64 envs, instancing is ~3 orders of magnitude cheaper per frame. The per-env path is correct only when needed.

Branching. Per-entity check at render time. Implementation will need a way to read the "is any vvert in this entity custom" answer from vverts_info.is_custom. Either:

The rasterizer keeps its own set of "currently custom" entity uids, populated when set_vverts is called (via small entity-side bookkeeping, or a callback that avoids any entity-to-visualizer reach-in).
The rasterizer scans vverts_info.is_custom on each update_rigid for each entity's vvert range. Cheap if cached or precomputed at solver level.

Design choice deferred to Phase 2 implementation; see §5 Phase 2 task list.

3.8 Collision / sdf rendering unchanged

Constraint. When entity.surface.vis_mode in ("collision", "sdf"), the rasterizer uses the existing instancing path (pyrender.Mesh.from_trimesh(poses=geom_T)). No change to that branch.

Why. Collision verts cannot be user-modified today (no set_verts API for collision geometry). Instancing is the right optimization for a static mesh with per-env transforms. Touching that path adds risk for zero benefit.

3.9 No caching on the per-env visual path (deferred)

Constraint. When the rasterizer takes the per-env vvert-buffer path, it re-uploads from vverts_state.pos every frame, unconditionally. No version-counter cache in this PR.

Why. See §6a follow-up. A cache adds engine-side state and bookkeeping that we can avoid in the initial cut; the per-env path is only used for entities the user explicitly drives via set_vverts, so typical scenes have at most a handful of them.

3.10 Normals: same handling as collision today

Constraint. Normals on the visual path are recomputed by the rasterizer using the same update_normal(node, positions) mechanism the collision path uses today.

Why. See §6b follow-up. Deformation-aware normal recompute (face-normal accumulation per vertex) may be needed for SMPL-quality shading but might also work fine with the existing rotate-by-vgeom-quat approach. Empirical check is needed; until then, do the same thing as collision so we're at least consistent.

3.11 Implementation on the solver, thin wrapper on entity, vgeom delegates to entity

Constraint. Canonical API on KinematicSolver. KinematicEntity has a thin wrapper. Vgeom delegates to the parent entity via a private range helper, not to the solver directly. Plane refusal lives in the entity wrapper (single check site).

Why. Consistent with set_pos / set_qpos / set_links_pos patterns elsewhere in Genesis. The entity wraps a range; the vgeom wraps a sub-range. Putting the Plane check in the entity means the vgeom inherits it for free via the shared private helper.

3.12 `get_vverts` returns a copy

Constraint. get_vverts returns a copy, not a view. Same as other Genesis getters (get_pos, get_quat, etc.).

Why. A writable view would let users mutate the buffer without going through set_vverts — they could bypass any future validation (Plane check, dtype enforcement, etc.) and the rasterizer's eventual cache-invalidation hook. Copies are the safe default and consistent with the rest of the API.

3.13 Zero-copy on `set_vverts`, warn otherwise

Constraint. set_vverts writes through to vverts_state.pos via Quadrants zero-copy when available. Warn once per scene build if gs.use_zerocopy is False.

Why. Zero-copy makes the write a memory aliasing operation rather than a memcpy. Critical for high-frequency use cases (SMPL at 60+ Hz with thousands of vertices). On non-zero-copy backends, the operation still works correctly but every call is a memcpy — the user should know.
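One way the warn-once behaviour could look is sketched below, using a per-solver flag in the spirit of the `_set_vverts_warned` flag listed in Phase 1. The class, the warning text, and the `use_zerocopy` constructor argument (a stand-in for `gs.use_zerocopy`) are hypothetical.

```python
import warnings


class WarnOnceSolver:
    def __init__(self, use_zerocopy):
        self.use_zerocopy = use_zerocopy   # stand-in for gs.use_zerocopy
        self._set_vverts_warned = False    # would be reset on each scene build

    def set_vverts(self, vverts):
        if not self.use_zerocopy and not self._set_vverts_warned:
            warnings.warn("set_vverts falls back to a copy on this backend.")
            self._set_vverts_warned = True
        # ... the write into the solver buffer would happen here ...


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    solver = WarnOnceSolver(use_zerocopy=False)
    for _ in range(3):
        solver.set_vverts([0.0, 0.0, 0.0])
n_warnings = len(caught)
```

Three calls produce a single warning, so a high-frequency caller is informed of the memcpy cost without being flooded.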

3.14 `gs.morphs.Plane` refused at API entry

Constraint. set_vverts on a Plane entity raises immediately.

Why. Plane uses a special-case render setup in on_rigid (single-instance + reflection mat) that the per-env vverts path cannot reproduce. Rather than silently break the render, we refuse the operation.

3.15 Entities / vgeoms do not reach into camera / viewer / visualizer state

Constraint. No code in KinematicEntity or Vgeom accesses self._scene.visualizer, cam, or any rendering construct.

Why. Physics entities should not know about the renderer. The renderer should not be a required dependency for a kinematic entity to function. This is what made the "notify the rasterizer on set_vverts" hook a non-starter.

4. Final architecture

4.1 Engine layer

vverts_info (extended — adds is_custom; the whole struct is conditionally batched, matching batch_links_info / batch_joints_info conventions):

Without batch_vverts_info (default):

init_pos: gs.qd_vec3        # (n_vverts_,)
init_vnormal: gs.qd_vec3    # (n_vverts_,)
vgeom_idx: gs.qd_int        # (n_vverts_,)
is_custom: gs.qd_int        # (n_vverts_,)

With batch_vverts_info:

init_pos: gs.qd_vec3        # (n_vverts_, B)
init_vnormal: gs.qd_vec3    # (n_vverts_, B)
vgeom_idx: gs.qd_int        # (n_vverts_, B)
is_custom: gs.qd_int        # (n_vverts_, B)

New option: KinematicOptions.batch_vverts_info: bool = False.

vverts_state (new, minimal):

pos: gs.qd_vec3             # (n_vverts_, B)

KinematicSolver:

Holds vverts_state (registered through DataManager like vverts_info).
kernel_update_all_vverts — branches at qd.static compile time on batch_vverts_info:
- If batched: indexes is_custom[i_v, i_b] and skips per (vvert, env).
- If unbatched: indexes is_custom[i_v] and skips per vvert (same decision for every env).
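The two indexing modes can be sketched in plain Python as follows. In the real kernel the branch is described as resolved at compile time through `qd.static`; here it is an ordinary runtime `if`, and `fk_targets` is a hypothetical helper that only lists which cells FK would still write.

```python
def fk_targets(is_custom, n_vverts, n_envs, batched):
    """Return the (vvert, env) cells that FK should still write."""
    cells = []
    for i_v in range(n_vverts):
        for i_b in range(n_envs):
            # Batched: one flag per (vvert, env). Unbatched: one flag per vvert.
            flag = is_custom[i_v][i_b] if batched else is_custom[i_v]
            if not flag:
                cells.append((i_v, i_b))
    return cells


# Unbatched: vertex 0 is user-driven in every env.
unbatched_cells = fk_targets([1, 0], n_vverts=2, n_envs=2, batched=False)
# Batched: vertex 0 is user-driven in env 0 only.
batched_cells = fk_targets([[1, 0], [0, 0]], n_vverts=2, n_envs=2, batched=True)
```

In the unbatched case the skip decision is shared by all envs, while the batched case lets FK keep driving vertex 0 in env 1.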

Solver-level API:

def set_vverts(self, vvert_start, vvert_end, vverts, envs_idx=None): ...
def get_vverts(self, vvert_start, vvert_end, envs_idx=None) -> torch.Tensor: ...

set_vverts:

Resolve envs_idx via self._scene._sanitize_envs_idx.
If batch_vverts_info=False and envs_idx doesn't cover all envs: raise ("requires batch_vverts_info=True for partial envs_idx").
If vverts is None: flip the appropriate is_custom slice ([vvert_start:vvert_end] or [vvert_start:vvert_end, envs_idx] depending on the batched gate) to 0 — FK reclaims.
Else: broadcast via broadcast_tensor; write the target slice of vverts_state.pos; flip is_custom slice to 1. Warn once per build if not gs.use_zerocopy.
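The env-selection gate in the steps above could be sketched like this. `resolve_envs_idx` is a hypothetical stand-in for the combination of `_sanitize_envs_idx` and the partial-envs check, and it raises a plain `ValueError` rather than going through Genesis's own exception helper.

```python
def resolve_envs_idx(envs_idx, n_envs, batch_vverts_info):
    """Return the sorted env indices to write, or raise on an unsupported partial write."""
    all_envs = list(range(n_envs))
    selected = all_envs if envs_idx is None else sorted(set(envs_idx))
    if not batch_vverts_info and selected != all_envs:
        raise ValueError("requires batch_vverts_info=True for partial envs_idx")
    return selected


full = resolve_envs_idx(None, 4, batch_vverts_info=False)
partial = resolve_envs_idx([2, 0], 4, batch_vverts_info=True)
try:
    resolve_envs_idx([0], 4, batch_vverts_info=False)
    raised = False
except ValueError:
    raised = True
```

An explicit list that happens to cover every env is accepted in unbatched mode, since the flag and the write then apply uniformly.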

get_vverts:

Return a copy via qd_to_torch(..., copy=True).

4.2 Entity + Vgeom wrappers

class KinematicEntity:
    @gs.assert_built
    def set_vverts(self, vverts, envs_idx=None):
        self._set_vverts_range(self.vvert_start, self.vvert_end, vverts, envs_idx)

    @gs.assert_built
    def get_vverts(self, envs_idx=None) -> torch.Tensor:
        return self._get_vverts_range(self.vvert_start, self.vvert_end, envs_idx)

    def _set_vverts_range(self, vvert_start, vvert_end, vverts, envs_idx):
        if isinstance(self._morph, gs.morphs.Plane):
            gs.raise_exception("set_vverts is not supported for 'gs.morphs.Plane' entities.")
        self._solver.set_vverts(vvert_start, vvert_end, vverts, envs_idx)

    def _get_vverts_range(self, vvert_start, vvert_end, envs_idx):
        return self._solver.get_vverts(vvert_start, vvert_end, envs_idx)


class Vgeom:
    @gs.assert_built
    def set_vverts(self, vverts, envs_idx=None):
        self.entity._set_vverts_range(self.vvert_start, self.vvert_end, vverts, envs_idx)

    @gs.assert_built
    def get_vverts(self, envs_idx=None) -> torch.Tensor:
        return self.entity._get_vverts_range(self.vvert_start, self.vvert_end, envs_idx)

4.3 Renderer layer (`RasterizerContext`)

Branching is per entity, decided at update time by whether any of the entity's vverts are currently custom-flagged:

Default path (non-custom entities, common case): existing instancing — Mesh.from_trimesh(poses=geom_T). Unchanged from main. Stored in rigid_nodes.

Per-env path (entities with at least one custom vvert):

Build / lazy migration: when an entity transitions from non-custom to has-custom, tear down its instanced rigid_node and build per-env nodes (pyrender.Mesh.from_trimesh(mesh=geom.get_trimesh()), no poses). Store in vverts_nodes: dict[(env_idx, geom.uid), Node].
Update: read qd_to_torch(solver.vverts_state.pos, transpose=True), slice per vgeom, add envs_offset, push to each per-env node's pos buffer. Compute normals via the existing update_normal path (same as collision today). Re-upload every frame (no cache — see §6a).
Migration back: when an entity transitions from has-custom to non-custom (user called set_vverts(None)), tear down per-env nodes and recreate the instanced rigid_node.

Tracking which entities are in which path lives on the rasterizer (_per_env_vverts_entity_uids: set[int]). The signal that an entity needs migration is derived from vverts_info.is_custom: the rasterizer can compute it per entity by checking whether is_custom[entity.vvert_start:entity.vvert_end] has any non-zero entry (across all envs when batched). Once an entity migrates to the per-env path it stays there until the user clears it across all envs.
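A minimal sketch of how the rasterizer might derive both migration directions from the flags, assuming the unbatched layout and scanning each entity's full vvert range. `plan_migrations` and its argument shapes are hypothetical.

```python
def plan_migrations(entities, is_custom, per_env_uids):
    """entities: {uid: (vvert_start, vvert_end)}; is_custom: flat list of 0/1 flags."""
    now_custom = {
        uid for uid, (start, end) in entities.items() if any(is_custom[start:end])
    }
    to_per_env = now_custom - per_env_uids    # build per-env nodes for these
    to_instanced = per_env_uids - now_custom  # restore the instanced node for these
    return to_per_env, to_instanced, now_custom


entities = {7: (0, 3), 9: (3, 5)}
is_custom = [0, 1, 0, 0, 0]  # one vertex of entity 7 is user-driven
to_per_env, to_instanced, tracked = plan_migrations(entities, is_custom, {9})
```

Here entity 7 migrates forward because one of its vertices is flagged, and entity 9 migrates back because its range has been cleared. The returned `tracked` set would become the new `_per_env_vverts_entity_uids`.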

Collision / sdf rendering: unchanged. Plane: refused at API entry; never reaches the per-env path.

4.4 Sensor layer (raycaster — was PR #2769)

Visual-BVH per kinematic solver. Reads vverts_state.pos.
Per-frame BVH refresh.
Cross-solver link-pose plumbing on RaycasterSensor (_sensor_link_solvers, _sensor_link_indices).
NaN-safe distance-min merge across solvers.
Raycaster.validate_scene accepts KinematicEntity unconditionally — no opt-in flag.

5. Stages

Each phase has an exit criterion. Don't move on until it's satisfied.

Phase 0 — substrate ✅ DONE (`1705d9ed`)

Goal: vverts_state exists, FK populates it every step, nothing else changes.

#	Task	File(s)	Status
0.1	Add `VVertsState` dataclass + builder `get_vverts_state(solver)`, only `pos: gs.qd_vec3, (n_vverts_, B)`	`genesis/utils/array_class.py`	✅
0.2	Register `vverts_state` in `DataManager` next to existing `vverts_info` registration	`genesis/utils/array_class.py`	✅
0.3	Expose `self.vverts_state = self.data_manager.vverts_state` on `KinematicSolver`	`genesis/engine/solvers/kinematic_solver.py`	✅
0.4	Implement `kernel_update_all_vverts(vverts_info, vverts_state, vgeoms_state, static_rigid_sim_config)` — vgeom-pose × `init_pos` → `pos`	`genesis/engine/solvers/rigid/abd/forward_kinematics.py`	✅ (will be amended in Phase 1.5 to read `is_custom`)
0.5	Wire `kernel_update_all_vverts` into `KinematicSolver.update_vgeoms`	`genesis/engine/solvers/kinematic_solver.py`	✅
0.6	Smoke test: build a scene, step once, verify `qd_to_torch(vverts_state.pos)` is non-trivial	local	✅ — `(1, 642, 3)`, mean 0.1667 for the sphere example

Exit criterion met: vverts_state.pos is populated correctly when the visualizer runs update_vgeoms.

Phase 1 — public API ✅ DONE (`41bc0a4f`) — needs Phase 1.5 follow-up

Goal: users can call set_vverts / get_vverts at the solver, entity, and vgeom levels.

#	Task	File(s)	Status
1.1	`KinematicSolver.set_vverts(vvert_start, vvert_end, vverts, envs_idx=None)`	`genesis/engine/solvers/kinematic_solver.py`	✅ (no `is_custom` yet — Phase 1.5)
1.2	`KinematicSolver.get_vverts(vvert_start, vvert_end, envs_idx=None) -> torch.Tensor`	same	✅
1.3	Warn-once mechanism on `set_vverts` if `not gs.use_zerocopy`. Per-solver `_set_vverts_warned` flag	same	✅
1.4	`KinematicEntity._set_vverts_range` + `_get_vverts_range` (Plane refusal + solver delegate) and public `set_vverts` / `get_vverts`	`genesis/engine/entities/rigid_entity/rigid_entity.py`	✅
1.5	`Vgeom.set_vverts` / `get_vverts` delegating to entity's private range methods	`genesis/engine/entities/rigid_entity/rigid_geom.py` (`RigidVisGeom`)	✅
-	Smoke test: round trip, scalar broadcast, `(3,)` broadcast, get-returns-copy, Plane raises, vgeom-level set	local	✅
-	Fallback kernel `kernel_set_vverts` for non-zerocopy backends	`genesis/engine/solvers/rigid/abd/forward_kinematics.py`	✅

Known issue at end of Phase 1: because FK in update_vgeoms overwrites vverts_state every render, the user's set_vverts data is clobbered before render reads it. Resolved by Phase 1.5.

Phase 1.5 — engine-side `is_custom` flag (design pivot)

Goal: FK can skip user-driven vverts so set_vverts survives step() and update_vgeoms.

#	Task	File(s)
1.5.1	Add `batch_vverts_info: bool = False` to `KinematicOptions` (mirror `batch_links_info` / etc.)	`genesis/options/solvers.py` (or wherever `KinematicOptions` lives)
1.5.2	In `get_vverts_info`: every existing field (`init_pos`, `init_vnormal`, `vgeom_idx`) becomes `(n_vverts_, B)` when `batch_vverts_info=True`, stays `(n_vverts_,)` otherwise. Add new field `is_custom: qd.Tensor` with the same gated shape.	`genesis/utils/array_class.py`
1.5.3	Update `kernel_init_vvert_fields` to write the per-env replicated topology / init data when batched, single copy when not	`genesis/engine/solvers/rigid/abd/init_field.py` (or wherever the existing kernel lives)
1.5.4	Confirm Quadrants default zero-init covers `is_custom = 0` at build (no explicit init needed)	`genesis/engine/solvers/kinematic_solver.py`
1.5.5	Extend `kernel_update_all_vverts` to read `vverts_info` with `qd.static` branch on `batch_vverts_info` (per-`(v, b)` vs per-`v` indexing for all reads) and check `is_custom` to skip user-driven entries	`genesis/engine/solvers/rigid/abd/forward_kinematics.py`
1.5.6	In `KinematicSolver.set_vverts`: raise on partial `envs_idx` when not batched; write `vverts_state.pos` slice + flip `is_custom` accordingly. Zero-copy where available, kernel fallback otherwise.	`genesis/engine/solvers/kinematic_solver.py` + kernel file
1.5.7	Support `vverts=None` clear path: flip `is_custom` to 0, don't touch `pos`	same
1.5.8	Plumb `None` through `KinematicEntity.set_vverts` and `Vgeom.set_vverts`	entity / vgeom files
1.5.9	Audit all other consumers of `vverts_info` in the codebase — every `vverts_info.<field>[i_v]` access needs to branch on `batch_vverts_info` and add an `i_b` index when batched	grep + audit
1.5.10	Smoke test (unbatched): `set_vverts → step → get_vverts` returns user data; `set_vverts(None) → step → get_vverts` returns FK output; partial `envs_idx` raises.	local
1.5.11	Smoke test (batched): partial `envs_idx` correctly mixes user / FK across envs.	local

Exit criterion: set_vverts data survives step() for the affected vvert range; set_vverts(None) correctly hands control back to FK.
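
A minimal sketch of the flag semantics described in 1.5.5 to 1.5.7, assuming a toy solver with 1-D positions. `ToyKinematicSolver`, the nested-list layout, and the `ValueError` type are hypothetical; only the behaviour (FK skips flagged entries, `None` clears the flag without touching `pos`, partial `envs_idx` raises when unbatched) is taken from the task list.

```python
class ToyKinematicSolver:
    """Hypothetical model of the Phase 1.5 flag; not the Genesis implementation."""

    def __init__(self, init_pos, n_envs, batch_vverts_info=False):
        self.n_envs = n_envs
        self.batched = batch_vverts_info
        self.init_pos = list(init_pos)  # 1-D positions keep the toy short
        # pos[b][v]: one row per environment.
        self.pos = [list(init_pos) for _ in range(n_envs)]
        # is_custom has one row per env when batched, a single shared row otherwise.
        rows = n_envs if batch_vverts_info else 1
        self.is_custom = [[False] * len(init_pos) for _ in range(rows)]

    def _flag_row(self, b):
        return self.is_custom[b if self.batched else 0]

    def set_vverts(self, start, end, vverts, envs_idx=None):
        all_envs = list(range(self.n_envs))
        envs = all_envs if envs_idx is None else sorted(set(envs_idx))
        if not self.batched and envs != all_envs:
            raise ValueError("partial envs_idx requires batch_vverts_info=True")
        for b in envs:
            for i in range(start, end):
                if vverts is None:
                    self._flag_row(b)[i] = False  # clear path: leave pos alone
                else:
                    self.pos[b][i] = vverts[i - start]
                    self._flag_row(b)[i] = True

    def step(self, offset):
        # Toy FK: shift rest positions, skipping user-driven entries.
        for b in range(self.n_envs):
            for i, p in enumerate(self.init_pos):
                if not self._flag_row(b)[i]:
                    self.pos[b][i] = p + offset


solver = ToyKinematicSolver([0.0, 1.0], n_envs=2, batch_vverts_info=True)
solver.set_vverts(0, 1, [9.0], envs_idx=[1])   # env 1 only
solver.step(offset=0.5)
mixed = [row[:] for row in solver.pos]         # env 0 from FK, env 1 keeps 9.0

solver.set_vverts(0, 1, None, envs_idx=[1])    # hand control back to FK
solver.step(offset=0.5)
restored = [row[:] for row in solver.pos]
print(mixed, restored)
```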

Phase 2 — renderer rework (with fast path)

Goal: entities with at least one custom vvert render via per-env vvert buffers; all other entities keep the existing instancing path (unchanged from main).

#	Task	File(s)
2.1	Add `vverts_nodes: dict[(env_idx, geom.uid), pyrender.Node]` and `_per_env_vverts_entity_uids: set[int]` to `RasterizerContext.__init__`	`genesis/vis/rasterizer_context.py`
2.2	Clear both in `destroy`	same
2.3	In `on_rigid`: no change at build time. Every entity initially uses the instancing path; per-env nodes are built lazily on first migration.	same
2.4	In `update_rigid`, per kinematic entity: probe `is_custom[entity.vvert_start, :]`. Transition non-custom → has-custom: migrate forward (build per-env nodes, tear down instanced `rigid_node`, add to `_per_env_vverts_entity_uids`). Transition has-custom → non-custom (all envs cleared): migrate back.	same
2.5	In `update_rigid`, entities in `_per_env_vverts_entity_uids`: read `qd_to_torch(solver.vverts_state.pos, transpose=True)`, slice per vgeom range, add `envs_offset`, push via `reorder_vertices` + `jit.update_buffer(node, "pos")` + `jit.update_normal` + `update_buffer(node, "normal")`. Re-upload every frame (no cache — see §6a follow-up).	same
2.6	In `update_rigid`, entities NOT in `_per_env_vverts_entity_uids`: unchanged from main (instancing path).	same

Exit criterion: scenes with set_vverts-driven entities render the user's vverts; scenes without any set_vverts render exactly as today (no perf regression, byte-identical output).
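
The migration logic in 2.4 to 2.6 is a small two-state machine per entity. The sketch below models only the bookkeeping, under the assumption that `is_custom` has already been probed into a per-env list of booleans; `ToyRasterizerContext` and its `log` are hypothetical, and no pyrender nodes are built.

```python
class ToyRasterizerContext:
    """Hypothetical bookkeeping only; node creation and teardown are omitted."""

    def __init__(self):
        self.per_env_uids = set()  # plays the role of _per_env_vverts_entity_uids
        self.log = []

    def update_rigid(self, entity_uids, is_custom):
        # is_custom[uid] holds one boolean per env, probed at the entity's first vvert.
        for uid in entity_uids:
            has_custom = any(is_custom[uid])
            on_per_env = uid in self.per_env_uids
            if has_custom and not on_per_env:
                self.per_env_uids.add(uid)      # migrate forward
                self.log.append(("forward", uid))
            elif not has_custom and on_per_env:
                self.per_env_uids.remove(uid)   # migrate back: every env was cleared
                self.log.append(("back", uid))
            path = "per_env" if uid in self.per_env_uids else "instanced"
            self.log.append((path, uid))


ctx = ToyRasterizerContext()
ctx.update_rigid([7], {7: [False, False]})  # untouched entity: instancing path
ctx.update_rigid([7], {7: [False, True]})   # one env went custom: migrate forward
ctx.update_rigid([7], {7: [False, False]})  # all envs cleared: migrate back
print(ctx.log)
```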

Phase 3 — raycaster (was PR #2769)

Goal: depth cameras / lidars can raycast against visual meshes.

#	Task	File(s)
3.1	Visual-BVH build per kinematic solver, reading `vverts_state.pos`	`genesis/utils/raycast_qd.py`
3.2	Per-frame BVH refresh kernel — recompute AABBs from current `vverts_state.pos`	same
3.3	Ray-cast kernel against the visual BVH	same
3.4	Multi-solver hit merge (NaN-safe distance-min)	same
3.5	Per-sensor `_sensor_link_solvers` + `_sensor_link_indices`; cross-solver link pose resolution in `_gather_sensor_link_poses`	`genesis/engine/sensors/base_sensor.py`
3.6	`RaycasterSensor` per-frame FK / BVH refresh hookup	`genesis/engine/sensors/raycaster.py`
3.7	`Raycaster.validate_scene` accepts `KinematicEntity` unconditionally	`genesis/options/sensors/options.py`
3.8	Static-sensor branch (`entity_idx=-1`) honors `pos_offset` / `euler_offset` (was a regression fix in #2769)	`genesis/engine/sensors/base_sensor.py`
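
For 3.2, the refresh amounts to recomputing each primitive's bounding box from the current vertex positions. A minimal CPU sketch, assuming one AABB per triangle; `triangle_aabbs` is a hypothetical helper, and the real kernel would also refit the BVH's interior nodes.

```python
def triangle_aabbs(positions, faces):
    """Return a (min_corner, max_corner) pair for each triangle."""
    aabbs = []
    for face in faces:
        corners = [positions[i] for i in face]
        lo = tuple(min(c[k] for c in corners) for k in range(3))
        hi = tuple(max(c[k] for c in corners) for k in range(3))
        aabbs.append((lo, hi))
    return aabbs


faces = [(0, 1, 2)]
rest = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
deformed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 2.0)]  # vertex 2 lifted in z

rest_boxes = triangle_aabbs(rest, faces)
deformed_boxes = triangle_aabbs(deformed, faces)  # the box grows to cover the new vertex
print(rest_boxes, deformed_boxes)
```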

Exit criterion: depth camera against a kinematic mesh produces correct depth; ground-truth comparisons match.
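
The multi-solver merge in 3.4 can be sketched on the CPU as follows, assuming a miss is encoded as NaN (an assumption about the encoding, not confirmed here). `merge_hits` is a hypothetical helper. The explicit `isnan` checks matter because a plain `min` is order-dependent when NaN is involved: every comparison against NaN is false.

```python
import math


def merge_hits(per_solver_distances):
    """Element-wise minimum over solvers, treating NaN as 'no hit'."""
    n_rays = len(per_solver_distances[0])
    merged = [math.nan] * n_rays
    for distances in per_solver_distances:
        for i, d in enumerate(distances):
            if math.isnan(d):
                continue  # this solver missed; keep whatever is already there
            if math.isnan(merged[i]) or d < merged[i]:
                merged[i] = d
    return merged


rigid_hits = [2.0, math.nan, math.nan]
kinematic_hits = [1.5, 3.0, math.nan]
merged = merge_hits([rigid_hits, kinematic_hits])
# Ray 0: both hit, nearest wins. Ray 1: only one solver hit. Ray 2: both missed.
print(merged)
```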

Phase 4 — tests + examples

Goal: regression coverage and a runnable example.

#	Task	File(s)
4.1	Unit: solver-level `set_vverts` / `get_vverts` round trip — write a slice, read it back, verify byte-equal	`tests/test_render.py` or new `tests/test_kinematic_solver.py`
4.2	Unit: `KinematicEntity.set_vverts` writes the entity's range	same
4.3	Unit: `Vgeom.set_vverts` writes only its slice; sibling vgeoms unaffected	same
4.4	Unit: `get_vverts` returns a copy — mutating it does not affect the underlying buffer	same
4.5	Unit: `set_vverts` on `gs.morphs.Plane` raises	same
4.6	Unit: `set_vverts` accepts scalar / `(3,)` / `(n_v, 3)` / `(B, n_v, 3)` via `broadcast_tensor`	same
4.7	Render: baseline → `entity.set_vverts(deformed)` → render differs; same scene rendered in `vis_mode="collision"` unchanged across the call	`tests/test_render.py`
4.8	Raycast: depth camera against a kinematic mesh — FK-driven case (no `set_vverts`)	`tests/test_sensors.py`
4.9	Raycast: depth camera with user-driven deformation via `set_vverts`	same
4.10	Example: `examples/sensors/depth_camera_custom_vverts.py` (kinematic mesh + depth camera, ground-truth vs depth render side by side)	new file
4.11	Wire example into `tests/test_examples.py` (`ALLOW_PATTERNS`)	`tests/test_examples.py`

Exit criterion: all phases' tests pass locally and on CI.
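
Test 4.6 relies on every accepted input shape broadcasting to `(B, n_v, 3)`. The check below applies the standard NumPy/PyTorch trailing-dimension rule in pure Python; `broadcasts_to` is a hypothetical helper, and the actual `broadcast_tensor` may accept or reject additional cases.

```python
def broadcasts_to(shape, target):
    """True if `shape` can be broadcast to `target` (one-directional)."""
    if len(shape) > len(target):
        return False
    # Compare trailing dimensions; missing leading dims behave like size 1.
    for s, t in zip(reversed(shape), reversed(target)):
        if s != 1 and s != t:
            return False
    return True


B, n_v = 4, 10
target = (B, n_v, 3)
accepted = [(), (3,), (n_v, 3), (B, n_v, 3)]  # scalar, single point, shared, per-env
rejected = [(n_v,), (2, n_v, 3)]              # wrong trailing dim, wrong batch size

accepted_ok = [broadcasts_to(s, target) for s in accepted]
rejected_ok = [broadcasts_to(s, target) for s in rejected]
print(accepted_ok, rejected_ok)
```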

6. Follow-ups (NOT in this PR)

Documented now so we don't lose the context. Both are real, but deferring them keeps this PR architecturally clean.

6a. Cache invalidation in the rasterizer per-env path

What we'd want. The per-env vverts path re-uploads vverts_state.pos every frame. Cheaper would be to skip the GL upload when nothing changed since the last frame.

What's needed. An engine-side signal that "the slice for (entity, env) changed since you last looked". Options:

A counter in vverts_info (e.g. custom_version: gs.qd_int, (B,) or (n_kinematic_entities_, B)) bumped by set_vverts.
Torch's _version on the underlying storage — too coarse (per-storage, not per-slice), so probably not.
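
The counter option could look roughly like this. It is a sketch of the idea only: `ToyVersionedBuffer`, `ToyUploader`, and the per-`(entity, env)` granularity are assumptions, and granularity is explicitly one of the deferred decisions.

```python
class ToyVersionedBuffer:
    """Hypothetical engine side: one counter per (entity, env)."""

    def __init__(self):
        self.version = {}

    def set_vverts(self, entity, env):
        # The vertex write itself is omitted; only the version bump is modelled.
        key = (entity, env)
        self.version[key] = self.version.get(key, 0) + 1


class ToyUploader:
    """Hypothetical rasterizer side: remembers the last version it uploaded."""

    def __init__(self, buffer):
        self.buffer = buffer
        self.seen = {}
        self.uploads = 0

    def maybe_upload(self, entity, env):
        key = (entity, env)
        current = self.buffer.version.get(key, 0)
        if self.seen.get(key) == current:
            return False  # nothing changed since the last upload
        self.seen[key] = current
        self.uploads += 1
        return True


buf = ToyVersionedBuffer()
uploader = ToyUploader(buf)
buf.set_vverts(entity=0, env=0)
results = [uploader.maybe_upload(0, 0), uploader.maybe_upload(0, 0)]  # upload, then skip
buf.set_vverts(entity=0, env=0)
results.append(uploader.maybe_upload(0, 0))                           # changed: upload again
print(results, uploader.uploads)
```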

Why deferred. Extra engine-side state and bookkeeping; only matters when custom-vvert entities update at sub-frame cadence or sit idle for many frames. Until profiling shows it's a real cost, the unconditional re-upload is simpler.

Decisions deferred. Granularity (per-env, per-entity-per-env, per-vvert-per-env); exactly where the counter lives.

6b. Deformation-aware normal recompute on the visual path

What we'd want. When set_vverts supplies deformed positions, vertex normals should reflect the deformed geometry (per-face cross-product accumulated to vertices), not the rigid-transformed init_vnormal.
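
For reference, the per-face accumulation mentioned above is the usual area-weighted vertex normal. A minimal pure-Python sketch follows; `vertex_normals` is a hypothetical helper, not an existing Genesis or rasterizer function.

```python
import math


def vertex_normals(positions, faces):
    """Area-weighted vertex normals from per-face cross products."""
    normals = [[0.0, 0.0, 0.0] for _ in positions]
    for i0, i1, i2 in faces:
        p0, p1, p2 = positions[i0], positions[i1], positions[i2]
        e1 = [p1[k] - p0[k] for k in range(3)]
        e2 = [p2[k] - p0[k] for k in range(3)]
        # Cross product e1 x e2; its length is twice the triangle's area.
        n = [
            e1[1] * e2[2] - e1[2] * e2[1],
            e1[2] * e2[0] - e1[0] * e2[2],
            e1[0] * e2[1] - e1[1] * e2[0],
        ]
        for i in (i0, i1, i2):
            for k in range(3):
                normals[i][k] += n[k]
    for n in normals:
        length = math.sqrt(sum(c * c for c in n))
        if length > 0.0:  # vertices not referenced by any face keep a zero normal
            for k in range(3):
                n[k] /= length
    return normals


faces = [(0, 1, 2)]
flat = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]     # triangle in the XY plane
tilted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]   # same triangle moved into XZ

flat_normals = vertex_normals(flat, faces)      # all point along +z
tilted_normals = vertex_normals(tilted, faces)  # all point along -y
print(flat_normals[0], tilted_normals[0])
```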

Why deferred.

The current rasterizer normal path (the update_normal we reuse in Phase 2) was designed for rigid transforms of a static mesh. Whether it produces correct normals when fed arbitrary deformed positions needs an empirical check — SMPL is the obvious test case.
Recomputing normals on the rasterizer CPU side, on the GL side (vertex / geometry shader), or as a Quadrants kernel populating vverts_state.vnormal are three different cost / quality trade-offs. Choosing without measurement is guesswork.
A dedicated kernel could populate vverts_state.vnormal (sibling field) and a set_vnormals sibling API could let users override normals too. That doubles the substrate surface area — only worth it if measurement shows it's needed.

Decisions deferred until then.

Where normals are computed (rasterizer CPU / GL shader / solver kernel).
Whether vverts_state gains a vnormal field.
Whether we add set_vnormals / get_vnormals mirroring set_vverts / get_vverts.

7. Notes / gotchas to remember while implementing

_sanitize_envs_idx lives on Scene, not the solver. Pattern: self._scene._sanitize_envs_idx(envs_idx).
qd_to_torch(..., transpose=True) gives a (B, n, ...) view. Slice it as [envs_idx, vvert_start:vvert_end, :], not [vvert_start:vvert_end, envs_idx, :].
broadcast_tensor is in genesis.utils.misc. We do not need broadcast_array (we added it on the closed branch but with everything torch-side now, the torch variant is enough).
Vgeom location: confirm whether it lives in rigid_vgeom.py, somewhere under engine/entities/, or on the solver. Find it before Phase 1.5.
The Plane special-case in on_rigid (single-instance + reflection) must survive untouched. set_vverts refusal is enforced at the entity wrapper, so the rasterizer never sees a Plane in the per-env path.
Pre-commit reformats. Always git add -A && git commit ... and check the result with git status after.
Per-push approval rule: never push without explicit go-ahead per push. Prior approvals don't extend to amendments.
On the gs.metal backend, zero-copy requires torch ≥ 2.9.1. Older setups will hit the warn path on set_vverts.
Genesis convention: solver-level setters take idx_start, idx_end ranges (or index arrays), not entity references. The entity-level wrapper supplies the range from self.vvert_start / self.vvert_end.
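
The range convention and the delegation chain from tasks 1.4 and 1.5 can be sketched as below. All three classes are hypothetical stand-ins, and the `ValueError` type for the Plane refusal is an assumption (the plan only says it raises).

```python
class ToySolver:
    """Hypothetical solver: setters take index ranges, never entity references."""

    def __init__(self, n_vverts):
        self.pos = [None] * n_vverts

    def set_vverts(self, vvert_start, vvert_end, vverts):
        if len(vverts) != vvert_end - vvert_start:
            raise ValueError("vverts length must match the range")
        self.pos[vvert_start:vvert_end] = vverts


class ToyEntity:
    def __init__(self, solver, vvert_start, vvert_end, is_plane=False):
        self.solver = solver
        self.vvert_start, self.vvert_end = vvert_start, vvert_end
        self.is_plane = is_plane

    def _set_vverts_range(self, start, end, vverts):
        # Single choke point: Plane refusal happens here, then the solver is called.
        if self.is_plane:
            raise ValueError("set_vverts is not supported on Plane")
        self.solver.set_vverts(start, end, vverts)

    def set_vverts(self, vverts):
        self._set_vverts_range(self.vvert_start, self.vvert_end, vverts)


class ToyVgeom:
    def __init__(self, entity, vvert_start, vvert_end):
        self.entity = entity
        self.vvert_start, self.vvert_end = vvert_start, vvert_end

    def set_vverts(self, vverts):
        # A vgeom writes only its own slice, via the entity's private range method.
        self.entity._set_vverts_range(self.vvert_start, self.vvert_end, vverts)


solver = ToySolver(4)
entity = ToyEntity(solver, 0, 4)
vgeom = ToyVgeom(entity, 2, 4)
entity.set_vverts(["a", "b", "c", "d"])
vgeom.set_vverts(["X", "Y"])  # sibling range [0, 2) is untouched
print(solver.pos)
```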

8. What to bring forward from the closed PR's branch

The closed branch (kashu/feat-deformable-vmesh at ea7b1103 if it hasn't been GC'd) has work we should reuse:

Asset	Reuse?	Notes
`examples/rendering/custom_visual_mesh.py` (wave-deform box + SMPL paths)	Yes, with adjustments	The new path calls `entity.set_vverts(...)` directly (no more `scene.visualizer.set_custom_kinematic_entity_vverts(...)`).
`tests/test_render.py::test_set_vverts` parametrized over `n_envs`	Yes	API updated to the new `entity.set_vverts`.
CI wiring in `tests/test_examples.py::ALLOW_PATTERNS`	Yes	The line `"rendering/custom_visual_mesh.py",` is needed.
`broadcast_array` + `_expand_shape` + `_raise_broadcast_shape_error` helpers in `genesis/utils/misc.py`	No	Everything is torch-side now; the existing `broadcast_tensor` covers the use case.
`RasterizerContext._CustomVverts` dataclass + `_custom_vverts` dict + dirty/active tracking	No	This is the architecture we're replacing.
`Visualizer.set_custom_kinematic_entity_vverts` shim	No	Public API moved to `entity.set_vverts`.
`KinematicEntity._custom_vverts` + `has_custom_vverts` + `clear_vverts`	No	Eliminated entirely; replaced with engine-side `vverts_state` + entity-level `(set|get)_vverts` thin wrappers.
`Plane` refusal at API entry	Yes	Kept in the new entity wrapper.
`add_geom_node` refactor of `add_rigid_node`	Maybe	Useful if Phase 2 needs to share node creation between visual and collision paths, but the two paths build different mesh objects (`from_trimesh(...)` with vs without `poses`), so the win is smaller now. Decide during Phase 2.1.

9. Open questions before starting

None; the design is locked. Phase 0 is ready to start whenever the user gives the go-ahead.

10. Rollback / off-ramp

If Phase 2 turns out to regress visual render performance unacceptably (most likely on software OpenGL setups: Mesa llvmpipe, Apple Software Renderer), the fallback is to bring back per-entity branching in the rasterizer:

Internal _per_env_vverts_entity_uids: set on RasterizerContext, populated via a notification hook from KinematicSolver.set_vverts (or from an entity-level method that the entity wraps).
Entities in the set use the per-env path; others stay on instancing.
Migration forward (entity first added) at the next update_rigid.

This is the design we had in v3 of the planning iteration before consolidating. It's an architecture regression (re-introduces tracking we just eliminated) but a clean perf escape hatch if needed.

github-actions · 2026-05-14T00:08:23Z