Use multi-thread for cloud experiments and multi-process for local ones #29
base: main
run_all_experiments.py
Outdated
elif args.cloud_experiment_name:
  # Use multi-threads for cloud experiments, because each thread only needs to
  # wait for cloud build results or conduct simple I/O tasks.
  with ThreadPoolExecutor(max_workers=NUM_EXP) as executor:
Is there a way we can consolidate the two code paths more?
e.g. something like:
if ...:
  pool = ThreadPool
else:
  pool = Pool
Not sure what the difference is exactly between ThreadPoolExecutor and ThreadPool.
See https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.ThreadPool
I reckon the doc implies ThreadPoolExecutor is more modern and supports thread-level parallelism better?
I can definitely try to consolidate the two code parts to make them look better.
Made an attempt to clean up the related code in run_all_experiments.py a bit, will do the same in run_one_experiment.py.
The code should be more readable and have less repetition, but I did not find a perfect way to consolidate them further without over-engineering this.
Please let me know if there are other options.
Can we use https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor instead to make the code paths more consistent for the process case?
Yep, sure.
Not high priority, converting it to a draft for now and will come back to this later.
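The consolidation discussed in this thread could look roughly like the sketch below. It is not the code from this PR: `run_experiment`, `pick_executor`, `run_all`, and the value of `NUM_EXP` are hypothetical names and values chosen for illustration. It relies only on the fact that ThreadPoolExecutor and ProcessPoolExecutor share the same `concurrent.futures.Executor` interface, so a single code path can serve both cases.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor

NUM_EXP = 2  # Assumed value; the PR does not show how NUM_EXP is defined.


def run_experiment(benchmark: str) -> str:
    """Hypothetical stand-in for the per-benchmark work."""
    return f"done: {benchmark}"


def pick_executor(is_cloud: bool) -> type[Executor]:
    """Pick threads for cloud runs (mostly waiting) and processes for local runs."""
    return ThreadPoolExecutor if is_cloud else ProcessPoolExecutor


def run_all(benchmarks: list[str], is_cloud: bool) -> list[str]:
    """Run every benchmark through whichever executor class was selected."""
    executor_cls = pick_executor(is_cloud)
    with executor_cls(max_workers=NUM_EXP) as executor:
        # Executor.map returns results in the order of the inputs.
        return list(executor.map(run_experiment, benchmarks))


if __name__ == "__main__":
    print(run_all(["a", "b"], is_cloud=True))
```

With the process-based executor, the worker function and its arguments must be picklable, which is why `run_experiment` is defined at module level here.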
With cloud experiments enabled, the local program is left with no computationally heavy tasks. Most of the time, it only waits for cloud build results and writes them to a file.
Using multiple threads instead of multiple processes can achieve higher parallelism (e.g., by avoiding the overhead of process creation, management, and context switching) and requires fewer resources (e.g., lower memory usage).
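As a rough illustration of the reasoning above, the sketch below models "waiting for a cloud build" with `time.sleep`. The function names and the delay are assumptions made for this example and do not come from the repository. Because a sleeping or I/O-blocked thread releases the GIL, the waits overlap instead of running one after another.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def wait_for_cloud_build(name: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in: sleeping models waiting on a remote build."""
    time.sleep(delay)  # The thread yields here, so other threads can proceed.
    return f"{name}: finished"


def run_in_threads(names: list[str]) -> tuple[list[str], float]:
    """Wait on all builds concurrently and report the elapsed wall-clock time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(names)) as executor:
        results = list(executor.map(wait_for_cloud_build, names))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_in_threads(["a", "b", "c", "d"])
    print(results, f"{elapsed:.2f}s")
```

Four 0.2-second waits should complete in roughly the time of one wait rather than four, without the cost of starting four separate processes.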