packing + CP #1473

@olisicky

Hi. I would like to ask whether support for packing combined with context parallelism (CP) is already planned, as it is not possible yet.

Is your feature request related to a problem? Please describe.
Packing is one way to test or fill data for long-context training. It seems that long-context training with a limited number of nodes is not possible yet.
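To make clear what is meant by packing, here is a minimal sketch in plain Python. It assumes a greedy first-fit-decreasing strategy, which may differ from the packing the library actually implements. The names `pack_sequences` and `cu_seqlens` are hypothetical and used only for illustration.

```python
from typing import List


def pack_sequences(lengths: List[int], max_len: int) -> List[List[int]]:
    """Group sequence indices into bins whose total length fits in max_len.

    Illustrative greedy first-fit-decreasing packing; not the library's code.
    """
    # Visit the longest sequences first so short ones can fill leftover room.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    bins: List[List[int]] = []  # sequence indices per packed sample
    free: List[int] = []        # remaining token budget per packed sample
    for i in order:
        n = lengths[i]
        if n > max_len:
            raise ValueError(f"sequence {i} has {n} tokens, more than {max_len}")
        for b, room in enumerate(free):
            if n <= room:
                bins[b].append(i)
                free[b] -= n
                break
        else:
            # No existing bin has room, so open a new packed sample.
            bins.append([i])
            free.append(max_len - n)
    return bins


def cu_seqlens(lengths: List[int], packed: List[int]) -> List[int]:
    """Cumulative boundaries of the sequences inside one packed sample."""
    bounds = [0]
    for i in packed:
        bounds.append(bounds[-1] + lengths[i])
    return bounds
```

For example, `pack_sequences([5, 3, 8, 2, 4], max_len=10)` puts the five sequences into three packed samples. The cumulative boundaries are what an attention kernel would need in order to keep the packed sequences from attending to each other.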

Describe the solution you'd like
It would be amazing to combine the packing process defined here with CP to enable long-context training.
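As a rough illustration of why the combination needs dedicated support, the sketch below shows one way a packed sample could be sharded across CP ranks. It assumes the load-balanced scheme commonly used for causal attention, where every sequence is padded to a multiple of `2 * cp_size`, cut into `2 * cp_size` chunks, and rank `r` takes chunks `r` and `2 * cp_size - 1 - r`. This is an assumption about a possible design, not a description of the current implementation, and `cp_token_indices` is a hypothetical name.

```python
from typing import List


def cp_token_indices(seq_lens: List[int], cp_size: int, rank: int) -> List[int]:
    """Token positions in the padded packed buffer that belong to `rank`.

    Assumed scheme: each packed sequence is split on its own, so every rank
    holds a slice of every sequence instead of a slice of the whole buffer.
    """
    n_chunks = 2 * cp_size
    owned: List[int] = []
    offset = 0  # start of the current sequence in the padded buffer
    for n in seq_lens:
        # Pad each sequence up to a multiple of 2 * cp_size tokens.
        padded = -(-n // n_chunks) * n_chunks
        chunk = padded // n_chunks
        # Pair an early chunk with a late one to balance causal-attention work.
        for c in (rank, n_chunks - 1 - rank):
            start = offset + c * chunk
            owned.extend(range(start, start + chunk))
        offset += padded
    return owned
```

With `seq_lens=[8, 4]` and `cp_size=2`, rank 0 would hold positions `[0, 1, 6, 7, 8, 11]` and rank 1 would hold `[2, 3, 4, 5, 9, 10]`, so the split has to follow the per-sequence boundaries produced by packing.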

Describe alternatives you've considered
I tried combinations without CP but was not successful. The MegatronPretraining under the dataset can create long-context samples, but it does not seem to be supported with CP>1.

Additional context

Thank you very much! I did lots of experiments with NeMo2, and the separation into Automodel for pre-training and SFT is a great idea.

Ondřej
