Description
Idea:
- With brickster and DBI, I can use dplyr to write SQL queries. That's great!
- With brickster and future, I could use purrr/furrr to parallelize loops on Databricks. Let's do this!
The future API is a well-designed API for distributing R work among workers. It's nice because it allows using furrr (the purrr equivalent, but with futures), it supports nice progress bars, and it is performant and robust. It has been around for years and is very well maintained.
https://future.futureverse.org/
Future separates "what to parallelize", which is defined by a package developer (e.g. "As a package developer, I want this expensive computation to run in parallel"), from "how to parallelize", which is chosen by the package user (e.g. "As a package user, I want that parallelization to use 4 cores on my laptop" or "to run this heavy thing on a Slurm cluster of computers").
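As a rough illustration of that split, here is a minimal sketch using the existing `multisession` backend. The function names `slow_square()` and `squares_in_parallel()` are hypothetical, made up for this example; only `plan()`, `multisession`, `sequential` and `future_map_dbl()` come from future/furrr.

```r
library(future)
library(furrr)

# "What to parallelize": decided by the package developer.
# slow_square() and squares_in_parallel() are hypothetical names.
slow_square <- function(x) {
  Sys.sleep(0.1)  # stand-in for an expensive computation
  x^2
}

squares_in_parallel <- function(xs) {
  # Each element is evaluated as a future on whatever backend is active.
  future_map_dbl(xs, slow_square)
}

# "How to parallelize": decided by the package user.
plan(multisession, workers = 2)  # two background R sessions on this machine
result <- squares_in_parallel(1:8)
plan(sequential)                 # switch back to ordinary sequential evaluation
```

The developer's code never mentions where the work runs. With a Databricks backend, the idea would be that only the `plan()` line changes.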
The future package also provides documentation on how to define a new "future backend".
I'd love to have a future backend that sends heavy calculations to a Databricks cluster.