datasize < batch size? #178
Comments
@OscarDPan the idea is that you might not want to train if you don't have enough data to fill a single batch in the first place. We're using I know that this led to errors at some point, but maybe this check isn't needed anymore. Would you be willing to write a test with a dataset that violates the condition and see that training would still run without the if statement? |
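For illustration only, here is a minimal sketch of such a guard that logs a warning instead of skipping training silently. The function name `should_train`, the logger name, and the `>=` comparison are assumptions made for this example; they are not taken from `elephas/worker.py`.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("batch_guard")


def should_train(n_samples, batch_size):
    """Return True if the data can fill at least one full batch.

    Assumption: "enough data" means n_samples >= batch_size. The real
    check in the worker may use a different comparison.
    """
    if n_samples >= batch_size:
        return True
    # Make the skip visible instead of leaving the model silently untouched.
    logger.warning(
        "Skipping training: %d samples cannot fill a batch of size %d",
        n_samples,
        batch_size,
    )
    return False
```

With a guard like this, a unit test that passes an oversized batch size would at least leave a warning in the log explaining why the model weights did not change.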
Will do. Thanks. |
So I was contemplating whether I should remove the check, given that there is some historical reason for it ("this led to errors at some point") and that it makes sense for the batch size not to be greater than the number of samples. It took me a while to notice this if-statement: I ran a unit test with a huge batch size and the model wasn't updated at all. I wanted to ask you guys what the best approach would be:
|
I'm in favor of 3 for sure - it would be great to use a logger in several places in the code instead of |
@OscarDPan if there's no need for the if-clause, go for 1., else please go for 3. Btw, just to get this conceptually clear: the problem is not "huge batch sizes", but rather small, left-over batches at the end of your training set. I.e., think of a data set of length 10005 and a batch size of 100. This gives you a batch of 5 at the end of each epoch. Given that we distribute data, let's say across 10 nodes, there will 100% be nodes that don't get any data for training. I haven't tested whether we can handle this scenario properly, but this was an issue at some point (maybe gone with later Keras versions). tl;dr: check if we get errors when training with "no data". |
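The arithmetic in that example can be checked with a short sketch. It only mirrors the numbers in the comment (10005 samples, batch size 100, 10 nodes) and assumes the leftover samples are dealt out as evenly as possible. It does not reproduce how Spark or elephas actually partitions data, and `split_evenly` is a hypothetical helper.

```python
def split_evenly(n_items, n_workers):
    """Deal n_items out to n_workers as evenly as possible (hypothetical helper)."""
    base, extra = divmod(n_items, n_workers)
    # The first `extra` workers receive one additional item.
    return [base + (1 if i < extra else 0) for i in range(n_workers)]


n_samples, batch_size, n_workers = 10005, 100, 10

# Size of the final, partial batch in each epoch.
leftover = n_samples % batch_size

# How many samples of that partial batch each node would see.
per_worker = split_evenly(leftover, n_workers)

# Nodes that receive no data at all from the partial batch.
idle_workers = sum(1 for count in per_worker if count == 0)

print(leftover)      # 5
print(per_worker)    # [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(idle_workers)  # 5
```

Under these assumptions, half of the nodes get nothing from the last batch, which is the "no data" case worth covering in a test.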
Hi @maxpumperla, do you mind explaining why you placed this if-statement there in the first place? (cc @danielenricocahall)
https://github.com/maxpumperla/elephas/blob/master/elephas/worker.py#L107
https://github.com/maxpumperla/elephas/blob/master/elephas/worker.py#L116
Does it matter if my batch size is bigger than the available training data?