Implementing weight clipping #1
Wow, that is a much, much nicer implementation! Thanks so much! My code now runs 20-50x faster. I've added the fix to the repo. I haven't trained a good model for MNIST, because honestly I think MNIST is too simple to really show whether an image generation technique is good or not. I'm still running my code on a self-collected dataset of ImageNet-like images. We'll see how it goes.
Very interesting! If I remember correctly, the paper said WGANs should be able to avoid this kind of mode collapse, so this is definitely worth investigating. I'm going to pause my high-res experiment for a while and run some tests on MNIST, CelebA and CIFAR, if I can find the time. It might take me a day or two to get representative results.
Ah, those are some nice results! Did you find any culprit hyperparameter, or was the network just undertrained? I was curious about the report that a simple MLP architecture could lead to good results using WGAN, so I ran it on MNIST; take a look: I'm pretty impressed by the quality. Fully connected neural nets have a bad reputation nowadays, and training only took a few hours on my consumer-grade computer. But for some reason it seems dead set on occasionally producing totally black images; I'm not quite sure why. I'm probably going to try it on CIFAR next and see what happens.
Hmm, yeah, the results aren't bad, but they aren't significantly better than a normal GAN; AAE+GAN does look much nicer. I am beginning to wonder where WGAN has larger benefits. From what I understand (and I am not the greatest expert), it may be useful for stabilizing training in difficult domains, so maybe a test on ImageNet or similar with a larger DCGAN architecture would show its benefits. I was quite surprised it got my tiny MLP model to produce decent enough results, but as said before, MNIST is a very simple dataset. More testing tomorrow. The MLP model, as per the original paper, actually doesn't use batch norm, so that isn't it. I might look into it a little more tomorrow.
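For reference, a minimal sketch of what such an MLP generator/critic pair (no batch norm) could look like in TensorFlow 1.x. The layer widths, the use of tf.layers.dense, and the 'gen'/'crit' scope names are assumptions for illustration, not taken from the thread or the repo; the 'crit' scope is chosen only so the variable-name filter in the clipping snippet below would pick these weights up.

import tensorflow as tf

def mlp_generator(z, out_dim=784, hidden=512):
    # Plain fully connected generator: ReLU hidden layers, tanh output.
    with tf.variable_scope('gen'):
        h = tf.layers.dense(z, hidden, activation=tf.nn.relu)
        h = tf.layers.dense(h, hidden, activation=tf.nn.relu)
        h = tf.layers.dense(h, hidden, activation=tf.nn.relu)
        return tf.layers.dense(h, out_dim, activation=tf.nn.tanh)

def mlp_critic(x, hidden=512, reuse=False):
    # Critic outputs an unbounded scalar score (no sigmoid) and uses
    # no batch norm, in line with the MLP setup discussed above.
    with tf.variable_scope('crit', reuse=reuse):
        h = tf.layers.dense(x, hidden, activation=tf.nn.relu)
        h = tf.layers.dense(h, hidden, activation=tf.nn.relu)
        h = tf.layers.dense(h, hidden, activation=tf.nn.relu)
        return tf.layers.dense(h, 1)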
In TensorFlow I just do this for weight clipping:
# Collect the trainable variables that belong to the critic.
t_vars = tf.trainable_variables()
critic_vars = [var for var in t_vars if 'crit' in var.name]

# One clip op per critic variable; running them projects the weights
# back into [-0.1, 0.1].
self.clip_critic = []
for var in critic_vars:
    self.clip_critic.append(tf.assign(var, tf.clip_by_value(var, -0.1, 0.1)))
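For context, a rough sketch of how those ops might be run inside a WGAN training loop, assuming a tf.Session-based setup; critic_train_op, gen_train_op, n_critic, num_steps, and the feed dicts are hypothetical names, not from the repo.

for step in range(num_steps):
    # Update the critic several times per generator step.
    for _ in range(n_critic):
        sess.run(critic_train_op, feed_dict=critic_feed)
        # Project the critic weights back into the clipping range
        # immediately after each critic update.
        sess.run(self.clip_critic)
    # Then take one generator step.
    sess.run(gen_train_op, feed_dict=gen_feed)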
Here is my repo where I try to implement WGAN: https://github.com/PatrykChrabaszcz/WGan

Did you get any good results?
This is what I get for MNIST: