Open
Description
📚 Documentation
Link
https://pytorch.org/torchx/latest/components/distributed.html
What does it currently say?
It is not clear whether the --cpu and --gpu arguments are overridden by the -j argument, although in my testing (launching a job, then running top, etc.) it seems that they are.
What should it say?
Both the docs and the --help output for dist.ddp could be clearer on this front. More generally, is there a torchx equivalent of torchrun --standalone --nnodes=1 --nproc_per_node=auto ...?
Why?
Clearly I wouldn't want --gpu=0 with -j 1x2, right? As such, the defaults listed in the docs and in the --help output are a little confusing.
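To make the mismatch concrete, here is a rough Python sketch of the kind of consistency check I have in mind. It is not TorchX code: parse_j and gpu_shortfall are hypothetical helpers, it assumes the plain {nnodes}x{nproc_per_node} form of -j, and it assumes each worker process wants its own GPU (which would not hold for CPU-only training).

```python
import re
from typing import NamedTuple


class JobShape(NamedTuple):
    nnodes: int
    nproc_per_node: int


def parse_j(spec: str) -> JobShape:
    """Parse a '<nnodes>x<nproc_per_node>' string such as '1x2' (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)x(\d+)", spec)
    if match is None:
        raise ValueError(f"expected '<nnodes>x<nproc_per_node>', got {spec!r}")
    return JobShape(int(match.group(1)), int(match.group(2)))


def gpu_shortfall(j: str, gpu: int) -> int:
    """Return how many per-node workers would lack a GPU.

    Assumption (mine, not TorchX's): one GPU per worker process.
    """
    shape = parse_j(j)
    return max(0, shape.nproc_per_node - gpu)


if __name__ == "__main__":
    # The combination from this issue: two workers on one node, default --gpu=0.
    print(gpu_shortfall("1x2", gpu=0))  # → 2
```

With -j 1x2 and the default --gpu=0, the check reports two workers without a requested GPU, which is the combination the current docs leave unexplained.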