We were using the data blend at https://github.com/NVIDIA-NeMo/Nemotron/blob/dev/src/nemotron/recipes/nano3/stage1_sft/config/data_prep/data_blend_raw.json to run full SFT on a CPT'd (continued pre-training) model.
Data prep completed and the .npy files were created successfully, but training crashes at the data-loading stage, possibly due to an out-of-memory (OOM) error.
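To help narrow down whether the .npy files alone could explain the OOM, here is a minimal sketch that sums their in-memory size without actually loading them. It uses NumPy's `np.load(..., mmap_mode="r")`, which maps the file instead of reading the data. The helper name `npy_footprint` and the directory path are hypothetical, and this says nothing about how the training data loader itself reads the files.

```python
import os
import numpy as np


def npy_footprint(directory):
    """Return (total_bytes, rows) for all .npy files directly inside `directory`.

    Each row is (file name, shape, dtype, nbytes). Files are memory-mapped,
    so the array data is not read into RAM.
    """
    rows = []
    total = 0
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".npy"):
            continue
        path = os.path.join(directory, name)
        arr = np.load(path, mmap_mode="r")  # maps the file, does not load the data
        rows.append((name, arr.shape, str(arr.dtype), arr.nbytes))
        total += arr.nbytes
        del arr  # release the mapping before moving on
    return total, rows


if __name__ == "__main__":
    # Hypothetical path: replace with the actual data prep output directory.
    total, rows = npy_footprint("/path/to/data_prep_output")
    for name, shape, dtype, nbytes in rows:
        print(f"{name}: shape={shape} dtype={dtype} size={nbytes / 2**30:.2f} GiB")
    print(f"total: {total / 2**30:.2f} GiB")
```

Comparing the reported total against the host RAM available per node (and per data-loader worker, if each worker loads its own copy) should indicate whether a full in-memory load is plausible as the cause.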