Actions: NVIDIA-NeMo/RL

Create PR to main with cherry-pick from release

748 workflow runs

feat: Remove do_not_average_loss (#1988)
Create PR to main with cherry-pick from release #748: Commit 84bede0 pushed by terrykong
15s main
chore: upgrade wandb to 0.25+ (#1979)
Create PR to main with cherry-pick from release #747: Commit c40dba3 pushed by terrykong
13s main
fix: add mask seq with high logp err to nemo gym config (#1980)
Create PR to main with cherry-pick from release #746: Commit 02febf1 pushed by terrykong
12s main
docs: update features.md to reflect v0.5 release and v0.6 roadmap (#1…
Create PR to main with cherry-pick from release #745: Commit bdc967c pushed by terrykong
13s main
fix: speedup minimize and minimize-check in config_cli (#1964)
Create PR to main with cherry-pick from release #744: Commit bb4825a pushed by terrykong
16s main
ci: Update release-docs workflow to use FW-CI-templates v0.72.0 (#1965)
Create PR to main with cherry-pick from release #743: Commit 0955329 pushed by chtruong814
14s main
feat: ProRLv2 - add seq-mask-tis truncated importance sampling type (…
Create PR to main with cherry-pick from release #742: Commit 2841fef pushed by terrykong
11s main
feat: Mask sequences with high logprob error (#1838)
Create PR to main with cherry-pick from release #741: Commit 14f9f38 pushed by terrykong
15s main
fix: async llm engine didnt have get_metrics() (#1943)
Create PR to main with cherry-pick from release #740: Commit 0e0edcf pushed by terrykong
13s main
feat: Support build custom flashinfer (#1886)
Create PR to main with cherry-pick from release #739: Commit 09f5ffe pushed by terrykong
13s main
feat: retry rollout if generation_logprobs contains NaN (#1885)
Create PR to main with cherry-pick from release #738: Commit 9cfe54b pushed by terrykong
14s main
docs: Document Gym + RL integration design (#1762)
Create PR to main with cherry-pick from release #737: Commit 869b5e5 pushed by terrykong
15s main
feat: refactor mcore train/forward utilities (#1654)
Create PR to main with cherry-pick from release #736: Commit 58f7c4c pushed by yuki-97
17s main
chore: bump mcore and mbridge (#1902)
Create PR to main with cherry-pick from release #735: Commit 8ef0de9 pushed by terrykong
16s main
fix: Update sglang source (#1926)
Create PR to main with cherry-pick from release #734: Commit 2d453b3 pushed by terrykong
14s main
fix: use seq_length instead of padded_seq_length for topk output padd…
Create PR to main with cherry-pick from release #733: Commit c51c0bf pushed by terrykong
13s main
fix: Mxfp8 training fix sequence padding (#1884)
Create PR to main with cherry-pick from release #732: Commit 336803f pushed by terrykong
14s main
feat: start nemo gym and other environments with cached venvs (#1927)
Create PR to main with cherry-pick from release #731: Commit 79b672b pushed by terrykong
14s main
fix: fix and re-enable rm env functional test (#1905)
Create PR to main with cherry-pick from release #730: Commit 462f504 pushed by yuki-97
13s main
fix: add missing functional test (#1883)
Create PR to main with cherry-pick from release #729: Commit d7b0e6a pushed by terrykong
11s main
fix: Fix DCP-to-HF conversion for model-wrapped checkpoints (#1881)
Create PR to main with cherry-pick from release #728: Commit 314c272 pushed by yuki-97
17s main
chore: Centralize OmegaConf resolver registration (#1882)
Create PR to main with cherry-pick from release #727: Commit 2196f40 pushed by yuki-97
16s main
fix: fix enable_seq_packing and apply_temperature_scaling in DTensor …
Create PR to main with cherry-pick from release #726: Commit ed718b4 pushed by terrykong
14s main
feat: improve dataset (#1893)
Create PR to main with cherry-pick from release #725: Commit 87a98fc pushed by yuki-97
12s main
feat: unify nemogym dataset (#1807)
Create PR to main with cherry-pick from release #724: Commit f93b56a pushed by yuki-97
15s main