Skip to content

FFHQ Training #40

@KomputerMaster64

Description

@KomputerMaster64

Thank you for sharing the implementation of the DDGAN model. I am trying to train the model on FFHQ 256x256 dataset. I used the NVLabs/NVAE repository for the dataset preparation. I have the file structure as follows:

image

To use another dataset similar to the CelebA-HQ 256x256, I modified the train function given in the line 190 of the train_ddgan.py file.

    elif args.dataset == 'ffhq_256':
        train_transform = transforms.Compose([
                transforms.Resize(args.image_size),
                transforms.RandomHorizontalFlip(),
                transforms.ToTensor(),
                transforms.Normalize((0.5,0.5,0.5), (0.5,0.5,0.5))
            ])
        dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)


My implementation for the DDGAN uses 4 NVIDIA GTX 1080ti GPUs with a total batch size of 32 for training the CelebA-HQ 256x256 dataset

(--batch_size 8 and --num_process_per_node 4)

I use the following command for training!python3 train_ddgan.py --dataset ffhq_256 --image_size 256 --exp ddgan_celebahq_exp1 --num_channels 3 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 2 --num_res_blocks 2 --batch_size 8 --num_epoch 800 --ngf 64 --embedding_type positional --use_ema --r1_gamma 2. --z_emb_dim 256 --lr_d 1e-4 --lr_g 2e-4 --lazy_reg 10 --num_process_per_node 4 --save_content



I am getting the following output message:

Node rank 0, local proc 0, global proc 0
Node rank 0, local proc 1, global proc 1
Node rank 0, local proc 2, global proc 2
Node rank 0, local proc 3, global proc 3
Process Process-4:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-1:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-3:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions