Closed
Description:
If I set, for instance, the distributed.nanny.pre-spawn-environ configuration, it is not taken into account when the jobs/workers/nanny are spawned.
Minimal Complete Verifiable Example:
from distributed import Client, get_worker, print as print_dd
from dask_jobqueue import SLURMCluster
from dask import config as dd_config
from tqdm import tqdm

NUM_THREADS = 8
N_TRIES = 10  # not defined in the original snippet; value assumed for illustration


def get_env(num_threads: int = NUM_THREADS) -> dict[str, str]:
    # Thread-count variables that should reach the workers.
    return {var: f"{num_threads}" for var in ["OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS"]}


def get_exports(env: dict[str, str]) -> list[str]:
    # One shell "export" line per variable, for the job script prologue.
    return [f"export {var}={value}" for var, value in env.items()]


def op(ii: int) -> None:
    try:
        worker = get_worker()
        print(f"{ii = } - {worker.name = }")
        # Report the configuration value as seen from inside the worker.
        print_dd(f"{dd_config.get('distributed.nanny.pre-spawn-environ') = }")
    except Exception:
        # Not running inside a worker.
        print(f"{ii = }")


if __name__ == "__main__":
    dd_config.set({"distributed.nanny.pre-spawn-environ": get_env()})
    with SLURMCluster(
        queue="my_queue",
        cores=1,
        processes=1,
        job_cpu=NUM_THREADS,
        memory="4GB",
        log_directory="tmp",
        job_script_prologue=get_exports(get_env()),
    ) as cluster:
        cluster.scale(jobs=2)
        with Client(cluster) as client:
            futures = [client.submit(op, ii) for ii in range(N_TRIES)]
            res_d = [f.result() for f in tqdm(futures, desc=f"Distributed ({NUM_THREADS})", total=N_TRIES)]
And the output is the default:
{'MALLOC_TRIM_THRESHOLD_': 65536, 'OMP_NUM_THREADS': 1, 'MKL_NUM_THREADS': 1, 'OPENBLAS_NUM_THREADS': 1}
The job_script_prologue option does not seem to have an effect either. How are we supposed to pass these variables?
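A side note on diagnosing this: the example prints the dask configuration value inside the worker, which is not necessarily the same as what the worker process has in its environment. A minimal sketch for reading the environment directly is shown here; the helper name read_thread_env and the constant THREAD_VARS are hypothetical and not part of dask or dask_jobqueue.

```python
import os

# Variable names taken from the example in this report.
THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def read_thread_env(names=THREAD_VARS):
    """Return the values the current process actually sees (None if unset)."""
    return {name: os.environ.get(name) for name in names}


if __name__ == "__main__":
    # Local check only; on a cluster the same function could be sent to the workers.
    print(read_thread_env())
```

With the cluster from the example, client.run(read_thread_env) should return one such dictionary per worker, and print(cluster.job_script()) shows the generated submission script, which may help confirm whether the export lines from job_script_prologue are present.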
Environment:
- Dask version: 2024.8.2
- Python version: 3.11
- Operating System: Ubuntu 20.04
- Install method (conda, pip, source): conda