-
Notifications
You must be signed in to change notification settings - Fork 247
Closed
Labels
questionFurther information is requestedFurther information is requested
Description
The motivation behind these kind of "datasets" is very odd imo.
Why not just enforce a single dataset class - and call it a day! Anyone can write a simple script to download a dataset from HF and convert it to open-ai format, right? Also, there is almost zero usecase where you just take one dataset and train on it (at least to build high quality aligned models) its always a combination of a huge set of datasets, each with different formats etc. This is all data-munging work that a toolkit should not be entangled in.
Premature notions of "convenience" just end up being code debt.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
questionFurther information is requestedFurther information is requested