feat(keypoint-detection): enable ViTPose config/build/perf by jeon185 · Pull Request #905 · microsoft/winml-cli

jeon185 · 2026-06-16T18:19:42Z

Enables the 6 ViTPose keypoint-detection models from #284 to pass wmk config -> build -> perf (CPU and OpenVINO). Eval isn't included here - I'll do the accuracy side in a follow-up since it needs a couple of design decisions first.

Two things were blocking all 6 models:

1. wmk config failed with "Task 'keypoint-detection' not supported by TasksManager". Optimum has the ViTPose ONNX export config but no task->class entry for keypoint-detection, and AutoModelForKeypointDetection only covers SuperPoint. Added the (vitpose, keypoint-detection) -> VitPoseForPoseEstimation mapping, same way we already do it for CLIP/SAM.
2. The plus checkpoints (MoE backbone) crashed during export with "dataset_index must be provided when using multiple experts". Optimum's VitPoseModelPatcher injects a constant dataset_index, but patch_model_for_export defaults model_kwargs to None so it crashed on init. Passing an explicit model_kwargs={} fixes that. The trace step (Step 3) was also running the model outside the patcher context, so I wrapped it the same way the export step already is.
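For readers unfamiliar with the patcher pattern, here is a minimal, self-contained sketch of the idea behind the second fix. It is not Optimum's implementation: `ToyMoEModel` and `toy_patcher` are hypothetical stand-ins, and the injected value `0` is an arbitrary assumption. It only illustrates why the forward pass must run inside the patcher context for both the trace step and the export step, and that the two contexts are entered one after the other.

```python
from contextlib import contextmanager


class ToyMoEModel:
    """Hypothetical stand-in for an MoE backbone that needs an expert index."""

    def forward(self, pixel_values, dataset_index=None):
        if dataset_index is None:
            raise ValueError(
                "dataset_index must be provided when using multiple experts"
            )
        return [v + dataset_index for v in pixel_values]


@contextmanager
def toy_patcher(model, model_kwargs=None):
    """Hypothetical patcher: injects a constant dataset_index into forward()."""
    # Coerce None to a fresh dict so the injection below cannot fail.
    model_kwargs = {} if model_kwargs is None else dict(model_kwargs)
    model_kwargs.setdefault("dataset_index", 0)  # assumed constant, illustration only

    original_forward = model.forward

    def patched_forward(*args, **kwargs):
        # Caller-supplied kwargs win over the injected defaults.
        return original_forward(*args, **{**model_kwargs, **kwargs})

    model.forward = patched_forward
    try:
        yield model
    finally:
        model.forward = original_forward  # always restore the original forward


model = ToyMoEModel()

# Step 3 (trace) and Step 4 (export) each enter the patcher, sequentially.
with toy_patcher(model, model_kwargs={}):
    traced = model.forward([1, 2])
with toy_patcher(model, model_kwargs={}):
    exported = model.forward([1, 2])
```

Calling `model.forward([1, 2])` outside either `with` block raises the same "dataset_index must be provided" error, which is the failure mode the trace step hit before it was wrapped.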

The exporter change isn't ViTPose-specific - it helps any MoE model whose patcher injects forward args.

Verified config/build/perf on all 6: vitpose-base-simple, vitpose-plus-small/base/large/huge, and synthpose-vitpose-huge-hf. Added unit tests for the mapping and the patcher model_kwargs handling.

One note: you still need to pass --task keypoint-detection explicitly for now - the task isn't auto-detected from the config yet. I left auto-detection out of this PR to keep it small; can add it here or as a follow-up if you'd prefer.

Refs #284.

ViTPose keypoint-detection models could not pass the wmk pipeline:

1. Task resolution: Optimum registers the ViTPose ONNX export config but has no task-to-class entry for keypoint-detection, and transformers' AutoModelForKeypointDetection only recognizes SuperPoint. Add MODEL_CLASS_MAPPING[(vitpose, keypoint-detection)] = VitPoseForPoseEstimation (models/hf/vitpose.py) so the resolver loads the correct class.
2. MoE export: the vitpose-plus checkpoints use a Mixture-of-Experts backbone whose patcher injects a constant dataset_index at export time. Optimum's patch_model_for_export defaults model_kwargs to None, so the patcher crashed on init. Pass an explicit model_kwargs={} in _get_optimum_patcher. Also wrap the Step 3 hierarchy trace in the same patcher context (it previously ran the model forward without the injected dataset_index, failing before export).

Verified config -> build -> perf on all 6 acceptance models in #284 (base-simple, plus-{small,base,large,huge}, synthpose-vitpose-huge-hf).

zhenchaoni · 2026-06-24T08:07:51Z

+            # are traced with the same inputs they are exported with. The export
+            # in Step 4 re-enters the patcher; the contexts are sequential, not
+            # nested.
+            with self._get_optimum_patcher(model, task):


Change looks good and low-risk overall. The model_kwargs={} fix is effectively a no-op for non-MoE patchers (Optimum's ModelPatcher.__init__ already coerces None → {}), so that one's safe.

My one concern is wrapping Step 3 (_trace_model_hierarchy) in the patcher. Models that already resolve a real Optimum patcher today — e.g. CLIP (clip_text_model / clip_vision_model), SAM, T5, SigLIP, whisper, VED — will now have their hierarchy trace run through patched forward for the first time. The ONNX graph is unaffected (export already ran patched), but the traced module path can shift, which could change the hierarchy / tag coverage on those models.

You verified the 6 ViTPose models, but those weren't being traced-under-patch before. Could you also run a before/after on at least one already-patched non-ViTPose model (CLIP is a good pick) and confirm the tag coverage / hierarchy stats are unchanged?
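One way to run the requested before/after check is to dump the hierarchy from each run as a mapping of node name to module path and diff the two. The sketch below assumes that shape; `compare_hierarchy`, `tag_coverage`, and the sample node names are hypothetical and are not part of wmk or Optimum. The sample values are toy data, not measurements from CLIP.

```python
def tag_coverage(tags):
    """Fraction of nodes that carry a non-empty module-path tag."""
    if not tags:
        return 0.0
    return sum(1 for path in tags.values() if path) / len(tags)


def compare_hierarchy(before, after):
    """Diff two {node_name: module_path} mappings from separate trace runs."""
    shared = before.keys() & after.keys()
    return {
        "only_before": sorted(before.keys() - after.keys()),
        "only_after": sorted(after.keys() - before.keys()),
        "changed": {
            name: (before[name], after[name])
            for name in sorted(shared)
            if before[name] != after[name]
        },
        "coverage_before": tag_coverage(before),
        "coverage_after": tag_coverage(after),
    }


# Toy data for illustration only: "before" is traced without the patcher,
# "after" is traced inside it.
before = {"n1": "/encoder/layer.0", "n2": "/encoder/layer.1", "n3": ""}
after = {"n1": "/encoder/layer.0", "n2": "/encoder/layer.1/mlp", "n4": "/head"}

report = compare_hierarchy(before, after)
unchanged = not (report["only_before"] or report["only_after"] or report["changed"])
```

If `unchanged` is true and the two coverage values match for an already-patched model, the Step 3 wrapping did not shift the traced module paths for that model.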

vortex-captain · 2026-06-24T08:37:43Z

+
+# (model_type, task) -> HuggingFace model class
+MODEL_CLASS_MAPPING: dict[tuple[str, str], type] = {
+    ("vitpose", "keypoint-detection"): VitPoseForPoseEstimation,


Suggested change

-    ("vitpose", "keypoint-detection"): VitPoseForPoseEstimation,
+    ("vitpose", "keypoint-detection"): VitPoseForPoseEstimation,
+    ("vitpose", None): VitPoseForPoseEstimation,

Could you test whether adding this line makes the command work when the task is omitted?
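Whether the `None` key helps depends on how the resolver consults the mapping, which this PR page does not show. As a sketch under that assumption, a lookup that tries the exact (model_type, task) pair first and then falls back to (model_type, None) would behave as follows. `resolve_model_class` is a hypothetical helper, and the class here is a local placeholder so the snippet runs without transformers installed.

```python
from typing import Dict, Optional, Tuple


class VitPoseForPoseEstimation:
    """Placeholder standing in for the transformers class of the same name."""


# (model_type, task) -> model class; a None task acts as the default entry.
MODEL_CLASS_MAPPING: Dict[Tuple[str, Optional[str]], type] = {
    ("vitpose", "keypoint-detection"): VitPoseForPoseEstimation,
    ("vitpose", None): VitPoseForPoseEstimation,
}


def resolve_model_class(model_type: str, task: Optional[str] = None) -> type:
    """Hypothetical resolver: exact match first, then the task-less default."""
    for key in ((model_type, task), (model_type, None)):
        if key in MODEL_CLASS_MAPPING:
            return MODEL_CLASS_MAPPING[key]
    raise KeyError(f"no model class registered for {model_type!r} / {task!r}")


explicit = resolve_model_class("vitpose", "keypoint-detection")
omitted = resolve_model_class("vitpose")
```

Note that a `None` key no longer fits the `dict[tuple[str, str], type]` annotation in the diff above; the value type of the task slot would need to allow `None`, as in the sketch.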

jeon185 requested a review from a team as a code owner June 16, 2026 18:19

jeon185 force-pushed the feat/keypoint-detection-enablement branch from 2007caf to 540ab05 Compare June 23, 2026 18:35

jeon185 mentioned this pull request Jun 23, 2026

feat(keypoint-detection): add COCO OKS-AP evaluation #949

Open

zhenchaoni reviewed Jun 24, 2026


vortex-captain reviewed Jun 24, 2026



jeon185 wants to merge 1 commit into main from feat/keypoint-detection-enablement
