
e2e_eval: dynamic tasks, priority/group cascade, --priority list #393

Merged
xieofxie merged 9 commits into main from hualxie/update_list on Apr 27, 2026

@xieofxie xieofxie commented Apr 24, 2026

Contributor

Summary

build_registry.py

  • Fetch pipeline tasks dynamically from HuggingFace /api/tasks (47 tasks) instead of the 21-entry hand-written NLP_TASKS + CV_TASKS lists.
  • 3-level cascade for priority and group when classifying entries:
    1. Curated: models_curated.json overrides win verbatim (group and priority are both honored; previously priority was forced to P0).
    2. Existing — if (hf_id, task) was in the previous models_all.json, inherit that row's priority/group. Lets hand-edits to models_all.json survive rebuilds.
    3. New-rule defaults: microsoft/-prefixed → (microsoft, P1); everything else → (Top200, P2). (P2 is a new tier; optimum_supported no longer drives priority.)
  • Phase 1.5 preservation: existing (hf_id, task) rows that aren't re-picked by Phase 1 are kept verbatim, except that downloads and optimum_supported are refreshed. This keeps the registry stable across runs and supports tasks that HF's API doesn't expose (e.g. masked-lm).
  • Per-task order field: every entry now carries order: 1..N ranked by downloads descending so consumers don't need to re-sort.
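
For reviewers, the cascade and the per-task order field described above can be sketched roughly as follows. This is an illustration only, not the code in build_registry.py: the function names `classify` and `assign_order`, the dict shapes, and the choice to key the curated overrides by (hf_id, task) are all assumptions.

```python
from __future__ import annotations


def classify(hf_id: str, task: str, curated: dict, previous: dict) -> tuple[str, str]:
    """Return (group, priority) for one entry via the 3-level cascade.

    `curated` and `previous` are assumed to map (hf_id, task) -> row dict
    with "group" and "priority" keys (hypothetical shape).
    """
    key = (hf_id, task)
    # 1. Curated overrides win verbatim.
    if key in curated:
        return curated[key]["group"], curated[key]["priority"]
    # 2. Inherit from the row in the previous models_all.json, if any.
    if key in previous:
        return previous[key]["group"], previous[key]["priority"]
    # 3. New-rule defaults.
    if hf_id.startswith("microsoft/"):
        return "microsoft", "P1"
    return "Top200", "P2"


def assign_order(entries: list[dict]) -> None:
    """Set order = 1..N within each task, ranked by downloads descending."""
    by_task: dict[str, list[dict]] = {}
    for entry in entries:
        by_task.setdefault(entry["task"], []).append(entry)
    for rows in by_task.values():
        rows.sort(key=lambda e: e["downloads"], reverse=True)
        for rank, entry in enumerate(rows, start=1):
            entry["order"] = rank
```

Because the curated and previous-registry lookups run before the defaults, a hand-edited row keeps its priority and group even when the new-rule default would assign something different.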

run_eval.py

  • --priority now accepts multiple values — e.g. --priority P0 P1 runs the union. Backward-compatible with single value.
  • filter_registry() widened: priority: str | Sequence[str] | None.
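
A minimal sketch of how the multi-value flag and the widened filter could fit together, assuming argparse with `nargs="+"` and registry entries that are dicts with a "priority" key. The real `filter_registry()` in run_eval.py may take other parameters; `build_parser` and the body shown here are hypothetical.

```python
from __future__ import annotations

import argparse
from collections.abc import Sequence


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # nargs="+" accepts one or more values, so a single value still parses.
    parser.add_argument("--priority", nargs="+", default=None)
    return parser


def filter_registry(
    entries: list[dict],
    priority: str | Sequence[str] | None = None,
) -> list[dict]:
    """Keep entries whose priority is in the requested set (None keeps all)."""
    if priority is None:
        return list(entries)
    # A bare string is treated as a single priority, not a sequence of chars.
    wanted = {priority} if isinstance(priority, str) else set(priority)
    return [e for e in entries if e["priority"] in wanted]
```

With this shape, `--priority P0 P1` parses to `["P0", "P1"]` and selects the union, while older call sites that pass a single string such as `"P0"` keep working.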

Curated list

  • Add 4 ISV models to models_curated.json.

Regenerated models_all.json (439 entries, sorted by (hf_id, task))

  • Priority: P0: 22 | P2: 202 | P3: 215
  • Groups: Top200: 398 | Foundry Toolkit: 16 | microsoft: 15 | Benchmark: 6 | ISV: 4
  • Optimum-supported: 311/439

@xieofxie xieofxie requested a review from a team as a code owner April 24, 2026 08:19
hualxie and others added 5 commits April 27, 2026 10:25
@xieofxie xieofxie changed the title from "e2e_eval: fetch tasks from HF API, add P2/order/preservation" to "e2e_eval: dynamic tasks, priority/group cascade, --priority list" Apr 27, 2026
Co-authored-by: Copilot <copilot@github.com>
Comment thread scripts/e2e_eval/testsets/models_curated.json Outdated
Comment thread scripts/e2e_eval/run_eval.py
@xieofxie xieofxie merged commit 225edc6 into main Apr 27, 2026
9 checks passed
@xieofxie xieofxie deleted the hualxie/update_list branch April 27, 2026 07:43
2 participants