Do we need to select the validation samples carefully to guarantee their quality, or can we just randomly split LAION-30M into train and val sets at a 29:1 ratio?
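
For concreteness, the random option I have in mind would look roughly like the sketch below. The function name `random_split`, the fixed seed, and the index-based approach are my own assumptions for illustration, not anything taken from the repo.

```python
import random


def random_split(num_samples, train_parts=29, val_parts=1, seed=0):
    """Shuffle sample indices and split them train:val = train_parts:val_parts.

    Hypothetical helper for illustration only.
    """
    indices = list(range(num_samples))
    # A fixed seed keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    num_val = num_samples * val_parts // (train_parts + val_parts)
    # The first num_val shuffled indices become val, the rest train.
    return indices[num_val:], indices[:num_val]


# Small example: 300 samples at 29:1 gives 290 train and 10 val.
train_idx, val_idx = random_split(300)
print(len(train_idx), len(val_idx))
```

With roughly 30M samples, this ratio would set aside about 1M samples for validation.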