[Question] Release timeline for full, reproducible training recipes for Nemotron models #68

@hellogdc

Description

Hi Nemotron Team,

First of all, thank you for releasing the weights and the excellent technical report for the Nemotron family (especially the recent Nemotron-3-Nano-30B-A3B). The transparency regarding data mixture and the WSD learning rate schedule is highly appreciated by the open-source community.

I noticed in the README of this repository that the "Training Recipes" section is marked as "Coming Soon".

Request

Could you please provide an estimated timeline or roadmap for when these full, reproducible training recipes (including synthetic data generation scripts and curation pipelines) will be made available in the src/nemotron/recipes/ directory?

Specifically, I am interested in:

  • The exact pre-training scripts used for the Nano-30B model.
  • The configuration files for the Warmup-Stable-Decay (WSD) schedule.
  • The data curation scripts used in the NeMo Curator pipeline for the 25T tokens.
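For reference, the kind of WSD configuration I have in mind could be sketched roughly as follows. This is only an illustration of the general warmup, stable, decay shape: the function name `wsd_lr`, the default fractions, and the linear decay are my own assumptions and are not taken from the technical report or from this repository.

```python
def wsd_lr(step, total_steps, peak_lr, min_lr=0.0,
           warmup_frac=0.01, decay_frac=0.2):
    """Hypothetical Warmup-Stable-Decay learning rate for a given step.

    All default values here are placeholders, not Nemotron's settings.
    """
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    decay_start = total_steps - decay_steps

    # Warmup: ramp linearly from near zero up to the peak learning rate.
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps

    # Stable: hold the peak learning rate constant.
    if step < decay_start:
        return peak_lr

    # Decay: anneal from peak_lr to min_lr (linear here, as an assumption).
    progress = min(1.0, (step - decay_start) / max(1, decay_steps))
    return peak_lr + (min_lr - peak_lr) * progress


if __name__ == "__main__":
    # Print the learning rate at a few points of a 1000-step toy run.
    for s in (0, 99, 500, 800, 900, 1000):
        print(s, wsd_lr(s, total_steps=1000, peak_lr=1e-3, warmup_frac=0.1))
```

Knowing the actual warmup length, the point where decay starts, the decay shape, and the final learning rate is what would make continued pre-training from a released checkpoint reproducible.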

Having access to these recipes would be invaluable for researchers looking to conduct continued pre-training and domain adaptation based on the Nemotron architecture.

Thank you for your time and for supporting open-source AI!

Labels: enhancement (New feature or request)