Hello! I'm currently building a custom model that is similar to a decoder-only architecture (GPT-2 style, not Llama style).
What is the easiest way to train my custom model using the TRI-ML/DCLM-1B dataset and training script?
I appreciate your help.