Releases: NVIDIA-NeMo/Evaluator
NVIDIA NeMo Evaluator 0.2.6
Changelog Details
- fix: write BYOB results to per-benchmark subdirectory to avoid data overwriting by @laszkiewiczp :: PR: #856
- fix: use normalized name in BYOB FDF evaluation entry by @laszkiewiczp :: PR: #855
- feat: BYOB add output_parser parameter to judge_score() by @laszkiewiczp :: PR: #859
- fix: byob readme example by @laszkiewiczp :: PR: #854
- fix: remove obsolete run_eval from __all__ by @marta-sd :: PR: #886
- feat(per-sample-score): per sample score by @AWarno :: PR: #888
- feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
- fix: move logger creation in ProgressTrackingInterceptor to the top by @marta-sd :: PR: #900
- fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882
- fix: use poll() and disable IPv6 in waitress adapter server by @agronskiy :: PR: #905
NVIDIA NeMo Evaluator Launcher 0.2.5
Changelog Details
- feat: support arbitrary sbatch flags via sbatch_extra_flags by @gchlebus :: PR: #864
- feat(extra-params): export extra params by @AWarno :: PR: #873
- docs: skill cleanups and fixes by @piojanu :: PR: #878
- docs: add auxiliary deployments example and documentation by @AdamRajfer :: PR: #875
- feat: allow duplicate task names in nel by @laszkiewiczp :: PR: #874
- fix: add missing task_idx arg to TestSbatchExtraFlags by @laszkiewiczp :: PR: #885
- feat: syntactic sugar overrides for tasks by @anowaczynski-nvidia :: PR: #759
- feat: add watch mode for continuous checkpoint evaluation by @marta-sd :: PR: #857
- feat: expose invocation ID as NEL_INVOCATION_ID env var by @agronskiy :: PR: #894
- feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
- feat: mount results for deployment by @AdamRajfer :: PR: #899
- fix: raise error when execution.env_vars is used in config by @marta-sd :: PR: #898
- fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882
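The NEL_INVOCATION_ID entry above (PR #894) can be illustrated with a minimal sketch of reading the variable from a script. Only the variable name comes from the changelog. The helper names `get_invocation_id` and `tag_record`, the "unknown" fallback, and the idea of tagging a result record are assumptions made for illustration and are not part of the launcher's API.

```python
import os
from typing import Mapping, Optional


def get_invocation_id(
    env: Optional[Mapping[str, str]] = None, default: str = "unknown"
) -> str:
    """Return the value of NEL_INVOCATION_ID, or `default` if unset or blank.

    `env` defaults to the process environment; passing a plain mapping
    makes the function easy to exercise outside a launcher run.
    """
    env = os.environ if env is None else env
    value = env.get("NEL_INVOCATION_ID", "").strip()
    return value or default


def tag_record(record: dict, env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of `record` with the invocation ID attached (hypothetical use)."""
    return {**record, "invocation_id": get_invocation_id(env)}


if __name__ == "__main__":
    # Prints the ID when the variable is set, and "unknown" otherwise.
    print(get_invocation_id())
```

Falling back to a default rather than raising keeps the same script usable when it runs outside a launcher invocation, where the variable may not be set.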
NVIDIA NeMo Evaluator Launcher 0.2.4
NVIDIA NeMo Evaluator 0.2.5
NVIDIA NeMo Evaluator Launcher 0.2.3
Changelog Details
- docs(nemotron-3-super): reproducible configs by @prokotg :: PR: #840
- docs(SKILL.md): add ARM64 and non-standard GPU compatibility note by @himorishige :: PR: #818
- fix(deprecated-multiple-instances-flag): fix deprecated multiple instances by @AWarno :: PR: #838
- fix(nel-assistant): correct --model-type to --model_type in SKILL.md by @himorishige :: PR: #813
- feat(malformed-configs-validation): validation of malformed configs by @AWarno :: PR: #811
- fix: fixes for user-reported bugs after 0.2 release by @marta-sd :: PR: #837
- docs(post_cmd): add post_cmd documentation by @e-dobrowolska :: PR: #841
- feat: add configurable health check timeout for local executor by @laszkiewiczp :: PR: #844
- chore: Simplify launcher evaluation templates and skill guidance by @piojanu :: PR: #846
- chore: Remove duplicated skill for byob, add it to readme and marketplace by @wprazuch :: PR: #845
- chore: Update for 26.03 by @wprazuch :: PR: #852
- chore: VLMEvalkit bump by @wprazuch :: PR: #853
- fix: bypass unlisted-task safeguard for local .sqsh by @gchlebus :: PR: #849