Hello,
I find that I am almost never able to use all the available workers on a cluster. Here I have 20 workers, but the xnes algorithm only ever uses 13. I also tried dxnes, which uses 14 out of 20. Is there anything in the algorithm that determines how many workers are used?
I tried this:
[oswald@stella01 git]$ git clone git@github.com:robertfeldt/BlackBoxOptim.jl.git
Cloning into 'BlackBoxOptim.jl'...
Warning: the ECDSA host key for 'github.com' differs from the key for the IP address '140.82.121.4'
Offending key for IP in /gpfs/users/oswald/.ssh/known_hosts:1
Matching host key in /gpfs/users/oswald/.ssh/known_hosts:9
remote: Enumerating objects: 7111, done.
remote: Counting objects: 100% (351/351), done.
remote: Compressing objects: 100% (183/183), done.
remote: Total 7111 (delta 187), reused 306 (delta 165), pack-reused 6760
Receiving objects: 100% (7111/7111), 2.01 MiB | 2.05 MiB/s, done.
Resolving deltas: 100% (5031/5031), done.
[oswald@stella01 git]$ vim BlackBoxOptim.jl/examples/rosenbrock_parallel.jl
[oswald@stella01 BlackBoxOptim.jl]$ cat examples/rosenbrock_parallel.jl
using Distributed
# Now add 20 procs that can exec in parallel (obviously it depends on your CPU
# what you actually gain from this though)
addprocs(20, exeflags = "--project=.")
@everywhere using Pkg
@everywhere Pkg.instantiate()
# Ensure BlackBoxOptim loaded on all workers
@everywhere using BlackBoxOptim
# define the function to optimize on all workers. Parallel eval only gives a gain
# if function to optimize is slow. For this example we introduce a fake sleep
# to make it slow since the function is actually very quick to eval...
@everywhere function slow_rosenbrock(x)
    sleep(1) # Fake a slower func to be optimized...
    println("evaluation on worker")
    return BlackBoxOptim.rosenbrock(x)
end
# First run without any parallel procs used in eval
#opt1 = bbsetup(slow_rosenbrock; Method=:xnes, SearchRange = (-5.0, 5.0),
# NumDimensions = 50, MaxFuncEvals = 5000)
#el1 = @elapsed res1 = bboptimize(opt1)
#t1 = round(el1, digits=3)
# When Workers= option is given, BlackBoxOptim enables parallel
# evaluation of fitness using the specified worker processes
opt2 = bbsetup(slow_rosenbrock; Method=:xnes, SearchRange = (-5.0, 5.0),
               NumDimensions = 50, MaxFuncEvals = 40, Workers = workers())
el2 = @elapsed res2 = bboptimize(opt2)
t2 = round(el2, digits=3)
#println("Time: serial = $(t1)s, parallel = $(t2)s")
#if t2 < t1
# println("Speedup is $(round(t1/t2, digits=1))x")
#else
# println("Slowdown is $(round(t2/t1, digits=1))x")
#end
[oswald@stella01 BlackBoxOptim.jl]$ cat dante.run
#!/bin/bash
#SBATCH --job-name=rosenbrock
#SBATCH --output=rosen.out
#SBATCH --error=rosen.err
#SBATCH --partition=ncpushort
#SBATCH --nodes=1
#SBATCH --cpus-per-task=20
#SBATCH --mem-per-cpu=1G # memory per cpu-core
julia --project=. -e 'using Pkg; Pkg.instantiate(); include("examples/rosenbrock_parallel.jl")'
[oswald@stella01 BlackBoxOptim.jl]$ sbatch dante.run
Submitted batch job 1030059
[oswald@stella01 BlackBoxOptim.jl]$ cat rosen.err
The latest version of Julia in the `release` channel is 1.10.2+0.x64.linux.gnu. You currently have `1.8.5+0.x64.linux.gnu` installed. Run:
juliaup update
in your terminal shell to install Julia 1.10.2+0.x64.linux.gnu and update the `release` channel to that version.
[oswald@stella01 BlackBoxOptim.jl]$ cat rosen.out
[oswald@stella01 BlackBoxOptim.jl]$ cat rosen.out
evaluation on worker
Starting optimization with optimizer XNESOpt{Float64, RandomBound{ContinuousRectSearchSpace}}
0.00 secs, 0 evals, 0 steps
sigma=1.0 |trace(ln_B)|=0.0
From worker 3: evaluation on worker
From worker 10: evaluation on worker
From worker 14: evaluation on worker
From worker 11: evaluation on worker
From worker 2: evaluation on worker
From worker 9: evaluation on worker
From worker 13: evaluation on worker
From worker 7: evaluation on worker
From worker 12: evaluation on worker
From worker 8: evaluation on worker
From worker 5: evaluation on worker
From worker 4: evaluation on worker
From worker 6: evaluation on worker
1.84 secs, 13 evals, 1 steps, fitness=576669.977892522
sigma=0.9999706009697786 |trace(ln_B)|=-5.854691731421724e-18
From worker 3: evaluation on worker
From worker 2: evaluation on worker
From worker 4: evaluation on worker
From worker 6: evaluation on worker
From worker 5: evaluation on worker
From worker 11: evaluation on worker
From worker 7: evaluation on worker
From worker 9: evaluation on worker
From worker 8: evaluation on worker
From worker 10: evaluation on worker
From worker 12: evaluation on worker
From worker 13: evaluation on worker
From worker 14: evaluation on worker
3.11 secs, 26 evals, 2 steps, fitness=576669.977892522
sigma=0.9996374110028836 |trace(ln_B)|=6.071532165918825e-18
From worker 2: evaluation on worker
From worker 5: evaluation on worker
From worker 3: evaluation on worker
From worker 4: evaluation on worker
From worker 6: evaluation on worker
From worker 7: evaluation on worker
From worker 8: evaluation on worker
From worker 9: evaluation on worker
From worker 10: evaluation on worker
From worker 11: evaluation on worker
From worker 12: evaluation on worker
From worker 13: evaluation on worker
From worker 14: evaluation on worker
4.32 secs, 39 evals, 3 steps, fitness=462136.298735016
sigma=0.9995845644566158 |trace(ln_B)|=1.5395670849294163e-17
From worker 2: evaluation on worker
From worker 3: evaluation on worker
From worker 4: evaluation on worker
From worker 6: evaluation on worker
From worker 5: evaluation on worker
From worker 7: evaluation on worker
From worker 11: evaluation on worker
From worker 8: evaluation on worker
From worker 9: evaluation on worker
From worker 10: evaluation on worker
From worker 12: evaluation on worker
From worker 13: evaluation on worker
From worker 14: evaluation on worker
Optimization stopped after 4 steps and 5.48 seconds
Termination reason: Max number of function evaluations (40) reached
Steps per second = 0.73
Function evals per second = 9.49
Improvements/step = NaN
Total function evaluations = 52
Best candidate found: [2.07323, -3.68329, 1.03099, -4.62028, 3.07159, 2.03989, -3.13307, -2.52309, 2.80827, 1.75317, 3.69806, 3.58054, -0.784089, -1.84108, 2.82848, 3.37218, -4.50288, -2.44541, 2.78737, 1.16678, -2.65538, 1.41732, 2.56764, -1.3611, -1.04387, 0.277611, -3.16395, -2.97312, 1.31683, -1.63275, 1.54157, 1.3299, 0.310397, 1.72371, -4.03617, 0.267421, -2.4304, -0.323388, -4.91322, 2.67858, -0.843714, -4.12674, 0.791436, 0.942429, -3.72488, 1.89244, -0.744755, 1.45, -1.49314, -2.10453]
Fitness: 381745.415040534
[oswald@stella01 BlackBoxOptim.jl]$ cat rosen.err
The latest version of Julia in the `release` channel is 1.10.2+0.x64.linux.gnu. You currently have `1.8.5+0.x64.linux.gnu` installed. Run:
juliaup update
in your terminal shell to install Julia 1.10.2+0.x64.linux.gnu and update the `release` channel to that version.
[oswald@stella01 BlackBoxOptim.jl]$
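The log above shows exactly 13 evaluations per step, which would be consistent with a population-size limit rather than a worker limit. Below is a minimal sketch of that arithmetic. It assumes, without confirmation from this thread, that xnes defaults to a population of `4 + 3*floor(log(d))` for `d` dimensions and that dxnes rounds this up to an even number. Both helper names are hypothetical and not part of BlackBoxOptim.

```julia
# Hypothetical helper: ASSUMED default NES population size for d dimensions.
assumed_default_lambda(d::Integer) = 4 + 3 * floor(Int, log(d))

# Hypothetical helper: round up to the next even number
# (ASSUMED to be what dxnes does with its population size).
round_up_even(n::Integer) = n + (n % 2)

d = 50  # NumDimensions used in the script above
println("xnes:  ", assumed_default_lambda(d))
println("dxnes: ", round_up_even(assumed_default_lambda(d)))
```

For `d = 50` this gives 13 and 14, matching the worker counts observed. If the assumption holds, each step can only keep as many workers busy as there are individuals in the population, so a larger population (for example via a `lambda = 20` option to `bbsetup`, where the option name is also an assumption) might be what is needed to occupy all 20 workers.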