Pull requests: THUDM/slime
Add support for per token reward/advantages with a custom_reward_post_process_path
#1389 opened Jan 12, 2026 by vpj
Fix: fix double-trim bug in entropy computation for last CP rank
#1377 opened Jan 10, 2026 by Beichen-Ma
Fix: Apply loss mask to KL in REINFORCE++ returns calculation
#1372 opened Jan 9, 2026 by kaysonyu
[release] bump to v0.2.2
Labels: release, run-ci-ckpt, run-ci-fsdp, run-ci-megatron
#1345 opened Jan 6, 2026 by zhuzilin
[Fix] Update deprecated sglang ep args in docs and scripts
#1344 opened Jan 6, 2026 by coding-famer
[Feature] Add rollout concurrency argument for full async training
#1310 opened Jan 3, 2026 by yitianlian
Feat(router): add oai interface support for router
#1203 opened Dec 24, 2025 by ChangyiYang
[FEATURE] Add tool call support for multi-turn SFT with delta-based loss masking
#1159 opened Dec 20, 2025 by Surya-Gunukula
tau-bench: offline stub user + tool parsing fallback
#1158 opened Dec 19, 2025 by Fengzdadi