Description
Hi Nemotron Team,
First of all, thank you for releasing the weights and the excellent technical report for the Nemotron family (especially the recent Nemotron-3-Nano-30B-A3B). The transparency regarding data mixture and the WSD learning rate schedule is highly appreciated by the open-source community.
I noticed in the README of this repository that the "Training Recipes" section is marked as "Coming Soon".
Request
Could you please provide an estimated timeline or roadmap for when these full, reproducible training recipes (including synthetic data generation scripts and curation pipelines) will be made available in the src/nemotron/recipes/ directory?
Specifically, I am interested in:
- The exact pre-training scripts used for the Nano-30B model.
- The configuration files for the Warmup-Stable-Decay (WSD) schedule.
- The data curation scripts used in the NeMo Curator pipeline for the 25T-token dataset.
Having access to these recipes would be invaluable for researchers looking to conduct continued pre-training and domain adaptation based on the Nemotron architecture.
Thank you for your time and for supporting open-source AI!