
Can you share the hyper-parameters you used in training? I want to do a simple replay. #6

GuangtaoLyu opened this issue Apr 11, 2024 · 1 comment

@GuangtaoLyu

Hello,
Can you share the hyper-parameters you used in training? I want to do a simple replay.
Thank you.

@exitudio (Owner)

Hi,
The default parameters are the ones we used for training, except for the batch size: we use a batch size of 512 because we train on 4 GPUs. Accordingly, we divide total-iter, lr-scheduler, and eval-iter by 4, in proportion to the 4x larger batch size. The learning rate stays the same. Here is the batch script we use:

# Experiment name and the pretrained VQ-VAE checkpoint to build on
name='HML3D_45_crsAtt1lyr_20breset'
vq_name='2023-07-19-04-17-17_12_VQVAE_20batchResetNRandom_8192_32'
export CUDA_VISIBLE_DEVICES=0,1,2,3
# 4 GPUs: multiply the batch size by 4 and divide the iteration counts by 4
MULTI_BATCH=4

# Resulting values: batch-size 128*4 = 512; total-iter 300000/4 = 75000;
# lr-scheduler 150000/4 = 37500; eval-iter 20000/4 = 5000.
# Note: ${dataset_name} is not defined in this snippet; it is presumably
# set elsewhere in the author's environment.
python3 train_t2m_trans.py  \
    --exp-name ${name} \
    --batch-size $((128*MULTI_BATCH)) \
    --vq-name ${vq_name} \
    --out-dir output/${dataset_name} \
    --total-iter $((300000/MULTI_BATCH)) \
    --lr-scheduler $((150000/MULTI_BATCH)) \
    --dataname t2m \
    --eval-iter $((20000/MULTI_BATCH))
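
For reference, setting MULTI_BATCH=1 in the script above recovers the single-GPU settings, which should match the repository defaults the reply refers to. A minimal sketch, not from the thread; the output/t2m value for --out-dir is an assumption:

# Hypothetical single-GPU equivalent, derived from the script above by
# substituting MULTI_BATCH=1; --out-dir is an assumed value.
python3 train_t2m_trans.py \
    --exp-name ${name} \
    --batch-size 128 \
    --vq-name ${vq_name} \
    --out-dir output/t2m \
    --total-iter 300000 \
    --lr-scheduler 150000 \
    --dataname t2m \
    --eval-iter 20000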
