Parallelization avenues #25
Open
Labels
performance: Doesn't make it more correct but faster or less memory hungry
question: Unclear how to proceed without further info / discussion
There are multiple layers of parallelization, some of which we need to rein in and others we should try to enable.

- Any conditional method using `ARFSampler` internally will be affected by
  - `ranger`'s default parallelization behavior, which might be undesirable (and we might just fix it to 1 thread?)
  - `arf`'s parallelization (which is done via `foreach`, but using `doFuture` is an option)
- Batched predictions in e.g. SAGE could benefit from parallelization, but that would need to be balanced carefully.
  - Splitting into `k` chunks and parallelizing over them might just defeat the purpose and create additional overhead on top
  - `batch_size` is probably learner-dependent anyway?
- Some operations are embarrassingly parallel, e.g. repeated operations over `iter_perm` permutations in PFI and friends, or generally repeated operations over resampling iterations.
  - Since `mlr3::resample()` is used for the initial reference models in most methods, we need to be careful about setting up a `future::plan()`, because I assume mlr3 will then use it for the `resample` bit while we might want to use it for a later step instead (or in addition).

Computing is hard.
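
To make the `future::plan()` concern a bit more concrete, here is a minimal sketch of one possible approach: set the plan only inside the step that should run in parallel and restore the previous plan on exit, so an earlier `mlr3::resample()` call never sees it. The helpers `perm_importance_once()` and `pfi_parallel()` are hypothetical and not part of this package, and a plain `lm()` fit stands in for an mlr3 learner. Whether the overhead of `multisession` workers pays off for small `iter_perm` would need benchmarking.

```r
library(future)
library(future.apply)

# Hypothetical helper: increase in MSE after permuting a single feature once.
perm_importance_once <- function(model, data, target, feature) {
  baseline <- mean((data[[target]] - predict(model, data))^2)
  permuted <- data
  permuted[[feature]] <- sample(permuted[[feature]])
  mean((permuted[[target]] - predict(model, permuted))^2) - baseline
}

# Hypothetical wrapper: parallelize over `iter_perm` repetitions only.
pfi_parallel <- function(model, data, target, feature,
                         iter_perm = 10L, workers = 2L) {
  # plan() returns the previously active plan, so we can restore it on exit.
  # Code that ran before this call (e.g. a resampling step) is unaffected.
  old_plan <- plan(multisession, workers = workers)
  on.exit(plan(old_plan), add = TRUE)

  scores <- future_lapply(
    seq_len(iter_perm),
    function(i) perm_importance_once(model, data, target, feature),
    future.seed = TRUE # parallel-safe RNG streams for sample()
  )
  unlist(scores)
}

# Stand-in for the initial reference model, fitted sequentially.
fit <- lm(mpg ~ wt + hp, data = mtcars)
scores <- pfi_parallel(fit, mtcars, "mpg", "wt", iter_perm = 5L)
```

The same pattern would apply in the other direction: if the resampling step should be the parallel one, the plan could be set around that call and reset to `sequential` afterwards.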