Releases · NVIDIA-NeMo/Evaluator

16 Apr 09:49

NVIDIA NeMo Evaluator 0.2.6

Changelog Details

fix: write BYOB results to per-benchmark subdirectory to avoid data overwriting by @laszkiewiczp :: PR: #856
fix: use normalized name in BYOB FDF evaluation entry by @laszkiewiczp :: PR: #855
feat: BYOB add output_parser parameter to judge_score() by @laszkiewiczp :: PR: #859
fix: byob readme example by @laszkiewiczp :: PR: #854
fix: remove obsolete run_eval from __all__ by @marta-sd :: PR: #886
feat(per-sample-score): per sample score by @AWarno :: PR: #888
feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
fix: move logger creation in ProgressTrackingInterceptor to the top by @marta-sd :: PR: #900
fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882
fix: use poll() and disable IPv6 in waitress adapter server by @agronskiy :: PR: #905
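The two waitress entries above (#896, #905) can be illustrated with a minimal sketch. This is not the adapter code from those PRs: the `app`, `build_serve_kwargs`, and `main` names, the port, and the thread count are hypothetical. Only `waitress.serve` and its `asyncore_use_poll` and `ipv6` settings are real waitress options, used here on the assumption that they correspond to "use poll()" and "disable IPv6".

```python
def build_serve_kwargs(host="127.0.0.1", port=8080, threads=8):
    """Collect waitress settings for a high-concurrency WSGI server.

    host, port, and threads are illustrative defaults, not values taken
    from NeMo Evaluator.
    """
    return {
        "host": host,
        "port": port,
        "threads": threads,
        # Use poll() instead of select(), which avoids select()'s limit
        # on the number of file descriptors under many open connections.
        "asyncore_use_poll": True,
        # Bind IPv4 sockets only.
        "ipv6": False,
    }


def app(environ, start_response):
    """Tiny WSGI app standing in for the real adapter application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def main():
    # waitress is a third-party package (pip install waitress); it is
    # imported here so the helpers above work without it installed.
    from waitress import serve

    serve(app, **build_serve_kwargs())
```

Calling `main()` starts the server and blocks until it is interrupted.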

marta-sd, ngoncharenko, and 3 other contributors

16 Apr 09:49

Latest

Changelog Details

feat: support arbitrary sbatch flags via sbatch_extra_flags by @gchlebus :: PR: #864
feat(extra-params): export extra params by @AWarno :: PR: #873
docs: skill cleanups and fixes by @piojanu :: PR: #878
docs: add auxiliary deployments example and documentation by @AdamRajfer :: PR: #875
feat: allow duplicate task names in nel by @laszkiewiczp :: PR: #874
fix: add missing task_idx arg to TestSbatchExtraFlags by @laszkiewiczp :: PR: #885
feat: syntactic sugar overrides for tasks by @anowaczynski-nvidia :: PR: #759
feat: add watch mode for continuous checkpoint evaluation by @marta-sd :: PR: #857
feat: expose invocation ID as NEL_INVOCATION_ID env var by @agronskiy :: PR: #894
feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
feat: mount results for deployment by @AdamRajfer :: PR: #899
fix: raise error when execution.env_vars is used in config by @marta-sd :: PR: #898
fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882
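As a small illustration of the `NEL_INVOCATION_ID` entry above (#894), the sketch below reads the variable from inside a job and uses it to tag an output name. Only the variable name comes from the changelog. The helper names, the `"unknown"` fallback, and the naming scheme are hypothetical.

```python
import os


def get_invocation_id(default: str = "unknown") -> str:
    """Return the invocation ID from the environment, or a fallback.

    The fallback value is an assumption for this example, covering runs
    where the variable is not set.
    """
    return os.environ.get("NEL_INVOCATION_ID", default)


def tagged_output_name(base: str) -> str:
    """Build an output name that includes the invocation ID (hypothetical scheme)."""
    return f"{base}-{get_invocation_id()}"
```

With the variable set to `abc123`, `tagged_output_name("results")` returns `results-abc123`.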

gchlebus, marta-sd, and 7 other contributors

19 Mar 08:32

Changelog Details

feat: deploy auxiliary endpoints by @wprazuch :: PR: #830
feat: add launching-evals and accessing-mlflow skills by @piojanu :: PR: #865
feat: rename to nel skills add and add marketplace entries by @piojanu :: PR: #868

piojanu and wprazuch

18 Mar 01:35

NVIDIA NeMo Evaluator 0.2.5

Changelog Details

feat: add --platform flag for BYOB container builds by @laszkiewiczp :: PR: #832
chore: Remove duplicated skill for byob, add it to readme and marketplace by @wprazuch :: PR: #845
fix: remove deprecated api_key field from ApiEndpoint by @gchlebus :: PR: #850

gchlebus, wprazuch, and laszkiewiczp

18 Mar 01:35

Changelog Details

docs(nemotron-3-super): reproducible configs by @prokotg :: PR: #840
docs(SKILL.md): add ARM64 and non-standard GPU compatibility note by @himorishige :: PR: #818
fix(deprecated-multiple-instances-flag): fix deprecated multiple instances by @AWarno :: PR: #838
fix(nel-assistant): correct --model-type to --model_type in SKILL.md by @himorishige :: PR: #813
feat(malformed-configs-validation): validation of malformed configs by @AWarno :: PR: #811
fix: fixes for user-reported bugs after 0.2 release by @marta-sd :: PR: #837
docs(post_cmd): add post_cmd documentation by @e-dobrowolska :: PR: #841
feat: add configurable health check timeout for local executor by @laszkiewiczp :: PR: #844
chore: Simplify launcher evaluation templates and skill guidance by @piojanu :: PR: #846
chore: Remove duplicated skill for byob, add it to readme and marketplace by @wprazuch :: PR: #845
chore: Update for 26.03 by @wprazuch :: PR: #852
chore: VLMEvalkit bump by @wprazuch :: PR: #853
fix: bypass unlisted-task safeguard for local .sqsh by @gchlebus :: PR: #849