Merge DSBenchEvaluator into MathEvaluator

DSBenchEvaluator (nemo_skills/evaluation/evaluator/dsbench.py) is currently a subclass of MathEvaluator that adds a relaxed_equal fallback that handles MCQ, dict, and list answer types. The relaxed_equal logic is general enough to be useful beyond DSBench (e.g. any benchmark with MCQ or structured answers), and there's no strong reason to gate it behind a separate evaluator class. 

Proposed change:
  - Add relaxed_comparison as an option to MathEvaluatorConfig, either enabled by default or by overloading the existing relaxed_extraction config
  - Apply the relaxed_equal fallback in MathEvaluator.eval_single when relaxed_comparison=True
  - This would remove the need for DSBenchEvaluator entirely, so we can remove dsbench.py and update __init__.py and the DSBench dataset config accordingly
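
A rough sketch of how the flag could gate the fallback is below. It is illustrative only and does not reflect the actual code in nemo_skills: `SketchMathEvaluatorConfig`, `strict_equal`, `_norm_mcq`, and the standalone `eval_single` function are hypothetical stand-ins, the `relaxed_equal` body is a simplified guess at the MCQ/dict/list handling rather than the DSBench implementation, and the default of `False` is just one of the two options mentioned above.

```python
import re
from dataclasses import dataclass


@dataclass
class SketchMathEvaluatorConfig:
    # Hypothetical stand-in for MathEvaluatorConfig; only the proposed flag is shown.
    relaxed_comparison: bool = False


def strict_equal(predicted, expected) -> bool:
    # Placeholder for the existing math comparison (NOT the real implementation).
    return str(predicted).strip() == str(expected).strip()


def _norm_mcq(value):
    # Reduce "(B)", "b.", "B" to "B"; return None if value is not a single-letter choice.
    match = re.fullmatch(r"\(?([A-Za-z])\)?[.:]?", str(value).strip())
    return match.group(1).upper() if match else None


def relaxed_equal(predicted, expected) -> bool:
    # Simplified illustration of a relaxed comparison for structured answers.
    if isinstance(expected, dict) and isinstance(predicted, dict):
        return predicted.keys() == expected.keys() and all(
            relaxed_equal(predicted[key], expected[key]) for key in expected
        )
    if isinstance(expected, (list, tuple)) and isinstance(predicted, (list, tuple)):
        return len(predicted) == len(expected) and all(
            relaxed_equal(p, e) for p, e in zip(predicted, expected)
        )
    pred_choice, exp_choice = _norm_mcq(predicted), _norm_mcq(expected)
    if pred_choice is not None and exp_choice is not None:
        return pred_choice == exp_choice
    return strict_equal(predicted, expected)


def eval_single(predicted, expected, config: SketchMathEvaluatorConfig) -> bool:
    # Try the strict comparison first; fall back only when the flag is set.
    if strict_equal(predicted, expected):
        return True
    return config.relaxed_comparison and relaxed_equal(predicted, expected)
```

With this shape, the DSBench dataset config would only need to set relaxed_comparison=True instead of selecting a dedicated evaluator class, and benchmarks that leave the flag off keep their current behavior.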

This keeps the evaluator hierarchy flat and makes the relaxed comparison logic reusable for other benchmarks that have MCQ or structured (dict/list) answers.

Merge DSBenchEvaluator into MathEvaluator #1268

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Merge DSBenchEvaluator into MathEvaluator #1268

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions