From experience, the multiprocessing overhead when fitting dispersion models seems to be a lot larger than the code currently assumes.
I.e., fitting the last 50 or so models still takes a long time, and as soon as multiprocessing is switched off for the last models, things become a lot faster. Maybe multiprocessing should only be used when more than 10x as many genes are left as there are processors. The relevant check is in the estimator code referenced below.
To provide some numbers: on 8 cores, the last iteration where multiprocessing is used (fitting around 9 or 10 genes) takes 16 s; the next iteration (no multiprocessing, so 7 to 8 genes) takes 2 s.
batchglm/batchglm/train/numpy/base_glm/estimator.py, line 463 in 31b905b
So something like:
if nproc > 1 and len(idx_update) > 10 * nproc:
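To make the proposal concrete, here is a minimal, self-contained sketch of the dispatch logic. It is not the batchglm code: `use_multiprocessing`, `fit_genes`, `fit_one_stub` and the `min_genes_per_proc` parameter are hypothetical names made up for illustration; only `nproc`, `idx_update` and the factor of 10 come from the suggestion above.

```python
from multiprocessing import Pool


def use_multiprocessing(nproc, n_remaining, min_genes_per_proc=10):
    """Return True only if enough genes remain to amortise the pool overhead.

    The default factor of 10 is the value proposed in this issue, not a
    measured optimum.
    """
    return nproc > 1 and n_remaining > min_genes_per_proc * nproc


def fit_genes(fit_one, idx_update, nproc, min_genes_per_proc=10):
    """Apply fit_one to every gene index, in parallel only for large batches.

    fit_one is a hypothetical per-gene fitting function; it must be defined
    at module level so that it can be pickled for the worker processes.
    """
    if use_multiprocessing(nproc, len(idx_update), min_genes_per_proc):
        with Pool(processes=nproc) as pool:
            return pool.map(fit_one, idx_update)
    # Few genes left: a plain loop avoids process start-up and pickling costs.
    return [fit_one(i) for i in idx_update]


def fit_one_stub(gene_idx):
    """Stand-in for a real per-gene dispersion fit (assumption)."""
    return gene_idx * 2


if __name__ == "__main__":
    # 9 genes on 8 processes stays below the 10x threshold, so this runs serially.
    print(fit_genes(fit_one_stub, list(range(9)), nproc=8))
```

With 8 processes, this only switches to the pool once more than 80 genes remain, so the 9 to 10 gene iteration from the timings above would take the serial path.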