fix: op-tracing preflight, eval samples merge, compile device label, HTP tagged_nodes stats by xieofxie · Pull Request #209 · microsoft/winml-cli

xieofxie · 2026-04-01T06:34:21Z

Summary

Bug A (P2) wmk perf --op-tracing: moved is_qnn_profiling_available() check to before PerfBenchmark.run() so the command fails fast instead of wasting a full benchmark run
Bug B (P2) wmk eval --samples N: replaced wholesale DatasetConfig overwrite with a merge that preserves user-specified samples, split, shuffle, and seed when falling back to a default dataset
Bug C (P3) wmk compile --ep dml: derive the displayed Device: label from _EP_TO_DEVICE[provider] instead of the raw CLI --device default (npu)
Bug D (P3) wmk export --no-hierarchy: gate tagged_nodes and coverage_percentage stats on embed_hierarchy_attributes so they report 0 when hierarchy tagging is disabled
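
As an illustration of the Bug A fix, the fail-fast ordering might look like the minimal sketch below. The function `run_perf`, the exception type, and the injected callables are hypothetical stand-ins for the real `is_qnn_profiling_available()` and `PerfBenchmark.run()`, whose signatures are not shown in this PR.

```python
class ProfilingUnavailableError(RuntimeError):
    """Hypothetical error: op tracing requested but QNN profiling is missing."""


def run_perf(op_tracing, is_profiling_available, run_benchmark):
    """Run a benchmark, checking op-tracing prerequisites first.

    `is_profiling_available` and `run_benchmark` are injected stand-ins
    (assumptions) for is_qnn_profiling_available() and PerfBenchmark.run().
    """
    # Preflight: fail before the expensive benchmark starts.
    if op_tracing and not is_profiling_available():
        raise ProfilingUnavailableError(
            "--op-tracing requires QNN profiling support"
        )
    # Only reached when op tracing is off or profiling is available.
    return run_benchmark()
```

With this ordering, a missing profiler raises before `run_benchmark` is ever called, so no benchmark time is spent on a run that cannot produce op traces.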

Each bug is covered by a new regression test that failed before the fix and passes after.
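
For reference, a regression test of this kind could look roughly like the sketch below, using Bug C as the example. The mapping contents, the `device_label` helper, and the fallback to the CLI value for unknown providers are assumptions for illustration; the real `_EP_TO_DEVICE` table and test live in the repository.

```python
# Hypothetical mapping; the real _EP_TO_DEVICE contents are not shown in this PR.
_EP_TO_DEVICE = {"qnn": "npu", "dml": "gpu", "cpu": "cpu"}


def device_label(provider, cli_device="npu"):
    """Return the device label to display for a resolved execution provider."""
    # Prefer the device implied by the EP. Falling back to the CLI value for
    # unknown providers is an assumption made for this sketch.
    return _EP_TO_DEVICE.get(provider, cli_device)


def test_dml_label_ignores_cli_default():
    # Before the fix, the label echoed the raw CLI default ("npu") for --ep dml.
    assert device_label("dml", cli_device="npu") == "gpu"
```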

- Replace fragile class-default comparison in evaluate.py with explicit_fields sentinel tracking; CLI --samples/--split/--shuffle now use None defaults
- Add DatasetConfig.explicit_fields frozenset to track caller-set fields
- Deduplicate tagged_nodes/coverage stats via _update_tag_stats() helper in HTPExporter; export() stats block now calls the shared helper
- Fix test imports: HTPExporter from package level, DatasetConfig with explicit_fields in Bug B regression test
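
A shared stats helper in the spirit of `_update_tag_stats()` could be sketched as follows. The signature, the dictionary-based stats, and the zero-node guard are assumptions; the real helper is a method on HTPExporter and may differ.

```python
def update_tag_stats(stats, total_nodes, tagged_nodes, embed_hierarchy_attributes):
    """Write tagged_nodes and coverage_percentage into `stats` in place.

    Hypothetical stand-in for HTPExporter._update_tag_stats().
    """
    # When hierarchy tagging is disabled (or there are no nodes), report zeros
    # instead of counts left over from an earlier tagging pass.
    if not embed_hierarchy_attributes or total_nodes == 0:
        stats["tagged_nodes"] = 0
        stats["coverage_percentage"] = 0.0
        return stats
    stats["tagged_nodes"] = tagged_nodes
    stats["coverage_percentage"] = 100.0 * tagged_nodes / total_nodes
    return stats
```

Routing every stats update through one helper keeps the `--no-hierarchy` gate in a single place, which is the point of the deduplication.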

zhenchaoni

The changes related to eval and evaluate look good to me.

tezheng · 2026-04-07T07:46:30Z

I don't quite get the default value changes here. Why remove the default values from the CLI args and defer them to the function call?

I have updated the description with the previous bug bash report:

Bug B — P2: wmk eval --samples N silently ignored when no --dataset is given
Command: wmk eval -m microsoft/resnet-50 --samples 20
Symptom: Output always shows 'samples': 100 regardless of --samples value.
Root cause: evaluate.py:149-155 — when config.dataset.path is None (no --dataset flag), the entire DatasetConfig is replaced wholesale with the hardcoded _DEFAULT_DATASETS entry (which has samples=100), discarding the user's value:

    if config.dataset.path is None:
        config.dataset = deepcopy(default)  # overwrites samples, split, shuffle
Fix: Merge user-specified fields (samples, split, shuffle, seed) onto the default rather than replacing it entirely.
Location: src/winml/modelkit/eval/evaluate.py:155
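
The merge described in the fix could be sketched like this. The `DatasetConfig` fields, their default values, and the `merge_onto_default` helper are assumptions for illustration; only `explicit_fields` as a frozenset of caller-set field names comes from this PR.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class DatasetConfig:
    # Field defaults here are hypothetical, not the project's real values.
    path: Optional[str] = None
    samples: int = 100
    split: str = "validation"
    shuffle: bool = False
    seed: int = 0
    # Names of the fields the caller set explicitly.
    explicit_fields: frozenset = frozenset()


_MERGEABLE = ("samples", "split", "shuffle", "seed")


def merge_onto_default(user, default):
    """Start from the default dataset and re-apply user-specified fields."""
    overrides = {
        name: getattr(user, name)
        for name in _MERGEABLE
        if name in user.explicit_fields
    }
    return replace(default, explicit_fields=user.explicit_fields, **overrides)
```

For example, a user config with `samples=20` and `explicit_fields=frozenset({"samples"})` keeps its 20 samples after the merge, while `path` and `split` still come from the default entry.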

Using None as the CLI default lets us tell whether the user actually supplied a value: an explicit user value takes effect, and otherwise the default dataset config value is used.
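
A minimal sketch of the None-default idea, assuming an argparse-based parser (the real CLI framework used by wmk is not shown in this PR, and `build_parser` and `explicit_fields` are hypothetical names):

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="wmk eval")
    # None means "not supplied", which a real default such as 100 cannot express.
    parser.add_argument("--samples", type=int, default=None)
    parser.add_argument("--split", default=None)
    parser.add_argument("--shuffle", action="store_true", default=None)
    return parser


def explicit_fields(args):
    """Collect the names of options the user actually passed."""
    return frozenset(
        name
        for name in ("samples", "split", "shuffle")
        if getattr(args, name) is not None
    )
```

With a concrete default such as `--samples 100`, a user who types `--samples 100` is indistinguishable from one who omits the flag, which is what made the earlier class-default comparison fragile.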

bug fix

ffa80c1

xieofxie requested a review from a team as a code owner April 1, 2026 06:34

xieofxie commented Apr 1, 2026

Comment thread src/winml/modelkit/export/htp/exporter.py Outdated

timenick reviewed Apr 1, 2026

Comment thread tests/unit/export/test_htp_exporter_stats.py

Comment thread src/winml/modelkit/eval/evaluate.py Outdated

timenick reviewed Apr 1, 2026

Comment thread src/winml/modelkit/export/htp/exporter.py Outdated

xieofxie commented Apr 1, 2026

Comment thread src/winml/modelkit/commands/eval.py

xieofxie commented Apr 1, 2026

Comment thread src/winml/modelkit/export/htp/exporter.py

Merge branch 'main' into hualxie/fix_bugs

f9e4cf4

zhenchaoni approved these changes Apr 7, 2026

Comment thread src/winml/modelkit/commands/eval.py

xieofxie added 2 commits April 7, 2026 14:57

Merge branch 'main' into hualxie/fix_bugs

355be3b

Merge branch 'main' into hualxie/fix_bugs

2d24a00

tezheng reviewed Apr 7, 2026

This was referenced Apr 9, 2026

fix(perf): op-tracing preflight check before benchmark run #279

Closed

fix(compile): derive Device label from resolved EP, not CLI default #280

Merged

fix(export): gate tagged_nodes/coverage stats on embed_hierarchy_attributes #281

Closed

tezheng closed this Apr 13, 2026

tezheng deleted the hualxie/fix_bugs branch April 13, 2026 14:01

This was referenced Apr 14, 2026

fix(export): gate tagged_nodes/coverage stats on embed_hierarchy_attributes #329

Merged

fix(perf): op-tracing preflight check before benchmark run #330

Closed

xieofxie wants to merge 5 commits into main from hualxie/fix_bugs

4 participants
