Parallelization avenues #25

There are multiple layers of parallelization, some of which we need to rein in and others we should try to enable.

- Any conditional method using `ARFSampler` internally will
    - be subject to `ranger`'s default parallelization behavior, which might be undesirable (should we just fix it to 1 thread?)
    - probably want to use `arf`'s own parallelization (which is done via `foreach`, though using `doFuture` as the backend is an option)

- Batched predictions in e.g. SAGE could benefit from parallelization, but that would need to be balanced carefully.
     - The point of batching is to avoid the excessive RAM usage that comes from predicting on _all_ coalitions' data at once
     - Splitting that data into `k` chunks and predicting on them in parallel might just defeat the purpose, since several chunks are then in memory at the same time, and adds overhead on top
     - A reasonable `batch_size` is probably learner-dependent anyway?
 

- Some operations are embarrassingly parallel, e.g. the repeated operations over `iter_perm` permutations in PFI and friends, or more generally anything repeated over resampling iterations.
   - Since we use `mlr3::resample()` to fit the initial reference models in most methods, we need to be careful about setting up a `future::plan()`: I assume mlr3 will then use it for the `resample` bit, while we might want to use it for a later step instead (or in addition).
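To make the batching trade-off concrete, here is a minimal base-R sketch of sequential batched prediction. `predict_batched()` is a hypothetical helper, not an existing function in this package, and `lm()` on `mtcars` only stands in for a real learner and the coalition data.

```r
# Hypothetical helper: predict in batches of at most `batch_size` rows,
# so that only one batch of `newdata` is passed to predict() at a time.
predict_batched <- function(model, newdata, batch_size) {
  n <- nrow(newdata)
  # Map each row to a batch: rows 1..batch_size -> batch 1, and so on
  batch_id <- ceiling(seq_len(n) / batch_size)
  preds <- lapply(split(seq_len(n), batch_id), function(idx) {
    predict(model, newdata[idx, , drop = FALSE])
  })
  unlist(preds, use.names = FALSE)
}

# Stand-in learner and data, purely for illustration
fit <- lm(mpg ~ wt + hp, data = mtcars)
p_batched <- predict_batched(fit, mtcars, batch_size = 10)
p_full <- unname(predict(fit, mtcars))

# Batching changes memory behavior, not the predictions themselves
stopifnot(isTRUE(all.equal(p_batched, p_full)))
```

If the `lapply()` were swapped for a parallel map with `w` workers, roughly `w` batches would be held in memory at once, so `batch_size` would probably have to shrink accordingly to keep the same RAM budget.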
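One possible way to keep the resampling step from claiming the workers is to switch plans between steps. The following is only a sketch under assumptions: the `lm()` fit stands in for the reference model, the `mlr3::resample()` call is left as a comment, and the permutation loop is a simplified stand-in for what PFI does, not the package's actual code.

```r
library(future)

# Stand-in for a trained reference model (assumption, for illustration only)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Step 1: run the resampling step sequentially.
# plan() returns the previous plan, which we keep to restore it later.
old_plan <- plan(sequential)
# mlr3::resample(task, learner, resampling) would go here

# Step 2: use parallel workers for the embarrassingly parallel part
plan(multisession, workers = 2)
iter_perm <- 4
futures <- lapply(seq_len(iter_perm), function(i) {
  future({
    permuted <- mtcars
    permuted$wt <- sample(permuted$wt)  # permute one feature
    mean((permuted$mpg - predict(fit, permuted))^2)  # MSE after permutation
  }, seed = TRUE)  # parallel-safe random numbers
})
perm_mse <- unlist(value(futures))

# Restore whatever plan the user had set before
plan(old_plan)
```

The downside of this approach is that it overrides the user's plan inside package code, so whether we want to do that at all (rather than documenting how users should set up their plan) is an open design question.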

Computing is hard.
