Releases: NVIDIA-NeMo/Evaluator

NVIDIA NeMo Evaluator 0.2.6

16 Apr 09:49
cb5e2f8

Changelog Details
  • fix: write BYOB results to per-benchmark subdirectory to avoid data overwriting by @laszkiewiczp :: PR: #856
  • fix: use normalized name in BYOB FDF evaluation entry by @laszkiewiczp :: PR: #855
  • feat: BYOB add output_parser parameter to judge_score() by @laszkiewiczp :: PR: #859
  • fix: byob readme example by @laszkiewiczp :: PR: #854
  • fix: remove obsolete run_eval from __all__ by @marta-sd :: PR: #886
  • feat(per-sample-score): per sample score by @AWarno :: PR: #888
  • feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
  • fix: move logger creation in ProgressTrackingInterceptor to the top by @marta-sd :: PR: #900
  • fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882
  • fix: use poll() and disable IPv6 in waitress adapter server by @agronskiy :: PR: #905

NVIDIA NeMo Evaluator Launcher 0.2.5

16 Apr 09:49
cb5e2f8

Changelog Details
  • feat: support arbitrary sbatch flags via sbatch_extra_flags by @gchlebus :: PR: #864
  • feat(extra-params): export extra params by @AWarno :: PR: #873
  • docs: skill cleanups and fixes by @piojanu :: PR: #878
  • docs: add auxiliary deployments example and documentation by @AdamRajfer :: PR: #875
  • feat: allow duplicate task names in nel by @laszkiewiczp :: PR: #874
  • fix: add missing task_idx arg to TestSbatchExtraFlags by @laszkiewiczp :: PR: #885
  • feat: syntactic sugar overrides for tasks by @anowaczynski-nvidia :: PR: #759
  • feat: add watch mode for continuous checkpoint evaluation by @marta-sd :: PR: #857
  • feat: expose invocation ID as NEL_INVOCATION_ID env var by @agronskiy :: PR: #894
  • feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
  • feat: mount results for deployment by @AdamRajfer :: PR: #899
  • fix: raise error when execution.env_vars is used in config by @marta-sd :: PR: #898
  • fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882

NVIDIA NeMo Evaluator Launcher 0.2.4

19 Mar 08:32
26f45ea

Changelog Details
  • feat: deploy auxiliary endpoints by @wprazuch :: PR: #830
  • feat: add launching-evals and accessing-mlflow skills by @piojanu :: PR: #865
  • feat: rename to nel skills add and add marketplace entries by @piojanu :: PR: #868

NVIDIA NeMo Evaluator 0.2.5

18 Mar 01:35
a8c6072

Changelog Details
  • feat: add --platform flag for BYOB container builds by @laszkiewiczp :: PR: #832
  • chore: Remove duplicated skill for byob, add it to readme and marketplace by @wprazuch :: PR: #845
  • fix: remove deprecated api_key field from ApiEndpoint by @gchlebus :: PR: #850

NVIDIA NeMo Evaluator Launcher 0.2.3

18 Mar 01:35
a8c6072

Changelog Details
  • docs(nemotron-3-super): reproducible configs by @prokotg :: PR: #840
  • docs(SKILL.md): add ARM64 and non-standard GPU compatibility note by @himorishige :: PR: #818
  • fix(deprecated-multiple-instances-flag): fix deprecated multiple instances by @AWarno :: PR: #838
  • fix(nel-assistant): correct --model-type to --model_type in SKILL.md by @himorishige :: PR: #813
  • feat(malformed-configs-validation): validation of malformed configs by @AWarno :: PR: #811
  • fix: fixes for user-reported bugs after 0.2 release by @marta-sd :: PR: #837
  • docs(post_cmd): add post_cmd documentation by @e-dobrowolska :: PR: #841
  • feat: add configurable health check timeout for local executor by @laszkiewiczp :: PR: #844
  • chore: Simplify launcher evaluation templates and skill guidance by @piojanu :: PR: #846
  • chore: Remove duplicated skill for byob, add it to readme and marketplace by @wprazuch :: PR: #845
  • chore: Update for 26.03 by @wprazuch :: PR: #852
  • chore: VLMEvalkit bump by @wprazuch :: PR: #853
  • fix: bypass unlisted-task safeguard for local .sqsh by @gchlebus :: PR: #849

NVIDIA NeMo Evaluator 0.2.4

12 Mar 14:40
7e23b22

Changelog Details

NVIDIA NeMo Evaluator 0.2.3

11 Mar 01:37
98bafd1

Changelog Details

NVIDIA NeMo Evaluator Launcher 0.2.2

11 Mar 01:37
98bafd1

NVIDIA NeMo Evaluator 0.2.2

10 Mar 01:37
fbaa80a

Changelog Details

NVIDIA NeMo Evaluator 0.2.1

09 Mar 11:30
d8c75e2

Changelog Details
  • chore: improve test payloads for capabilities and logs for non-image media types by @marta-sd :: PR: #824