feat(eval): add zero-shot text classification evaluator#387

Merged: jeon185 merged 2 commits into main from feat/eval-zero-shot-classification on Apr 30, 2026

Conversation

@jeon185 jeon185 commented Apr 23, 2026

Contributor

Resolves #325.

Adds WinMLZeroShotClassificationEvaluator, registered under the zero-shot-classification task, with a pipeline subclass that pads inputs to max_length, as static-shape ONNX models require.
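
To illustrate why fixed-length padding matters for static-shape ONNX, here is a minimal sketch of the idea, not the code from this PR. The helper name `pad_to_max_length` and the default `pad_token_id=0` are assumptions made for illustration. With a Hugging Face tokenizer, the same effect is normally obtained by calling it with `padding="max_length"`, `truncation=True`, and `max_length=...`.

```python
from typing import Dict, List


def pad_to_max_length(
    input_ids: List[int], max_length: int, pad_token_id: int = 0
) -> Dict[str, List[int]]:
    """Truncate or right-pad one tokenized sequence to exactly max_length.

    Hypothetical helper: a static-shape ONNX graph accepts only one sequence
    length, so every input has to be brought to that length.
    """
    # Naive truncation: a real tokenizer would also preserve trailing special tokens.
    ids = input_ids[:max_length]
    n_pad = max_length - len(ids)
    return {
        "input_ids": ids + [pad_token_id] * n_pad,
        # 1 marks real tokens, 0 marks padding the model should ignore.
        "attention_mask": [1] * len(ids) + [0] * n_pad,
    }


if __name__ == "__main__":
    print(pad_to_max_length([101, 7592, 102], max_length=6))
```

Both short and long inputs come out with the same length, which is the property a fixed input shape needs.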

Accuracy and macro-F1 are computed via a new ClassificationMetric, since HF evaluate has no wrapper for this task.
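
For reference, the two metrics can be sketched in plain Python as below. This is an illustration of the standard definitions, not the ClassificationMetric added in this PR. The function name `accuracy_and_macro_f1` is hypothetical, and averaging over the union of reference and predicted labels is an assumption.

```python
from typing import Dict, Hashable, Sequence


def accuracy_and_macro_f1(
    references: Sequence[Hashable], predictions: Sequence[Hashable]
) -> Dict[str, float]:
    """Compute accuracy and macro-averaged F1 for single-label classification."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if not references:
        raise ValueError("at least one example is required")

    pairs = list(zip(references, predictions))
    accuracy = sum(1 for r, p in pairs if r == p) / len(pairs)

    # Assumption: average over every label seen in either sequence.
    labels = set(references) | set(predictions)
    f1_scores = []
    for label in labels:
        tp = sum(1 for r, p in pairs if r == label and p == label)
        fp = sum(1 for r, p in pairs if r != label and p == label)
        fn = sum(1 for r, p in pairs if r == label and p != label)
        denom = 2 * tp + fp + fn
        # Per-class F1 = 2*TP / (2*TP + FP + FN); 0.0 if the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    # Macro-F1 is the unweighted mean of per-class F1 scores.
    return {"accuracy": accuracy, "macro_f1": sum(f1_scores) / len(f1_scores)}


if __name__ == "__main__":
    print(accuracy_and_macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Macro averaging weights every class equally, so it can differ noticeably from accuracy when the classes are imbalanced.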

Default dataset is AG News; an E2E entry for MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli is included.

Unit tests cover the evaluator and the metric. An integration test (slow/network) runs end-to-end on DistilBERT, RoBERTa, and DeBERTa NLI checkpoints.

jeon185 commented Apr 23, 2026

Contributor Author

@microsoft-github-policy-service agree [company="Microsoft"]

@jeon185 jeon185 force-pushed the feat/eval-zero-shot-classification branch from 14af807 to 96d9ca1 Compare April 23, 2026 22:05
Comment thread scripts/e2e_eval/testsets/models_with_acc.json Outdated
Comment thread src/winml/modelkit/eval/zero_shot_classification_evaluator.py
Comment thread src/winml/modelkit/eval/zero_shot_classification_evaluator.py Outdated
Comment thread src/winml/modelkit/eval/zero_shot_classification_evaluator.py
Comment thread scripts/e2e_eval/testsets/models_with_acc.json Outdated
@jeon185 jeon185 force-pushed the feat/eval-zero-shot-classification branch from 96d9ca1 to 0d93fb9 Compare April 27, 2026 23:28
@jeon185 jeon185 force-pushed the feat/eval-zero-shot-classification branch from 0d93fb9 to 7828bfc Compare April 28, 2026 19:41

@zhenchaoni zhenchaoni left a comment

Member

Reviewed the code and validated it locally on my device.

@jeon185 jeon185 enabled auto-merge (squash) April 29, 2026 20:30
@jeon185 jeon185 merged commit e2ca152 into main Apr 30, 2026
9 checks passed
@jeon185 jeon185 deleted the feat/eval-zero-shot-classification branch April 30, 2026 02:43
Development

Successfully merging this pull request may close these issues.

feat: support zero-shot-classification model evaluation (DeBERTa, RoBERTa, DistilBERT)

2 participants