[Bug] With MultiSyncDataCollector, tensors cannot be instantiated on CUDA in child processes. #2235
Open
Labels: bug
Describe the bug
Despite applying the appropriate guards (mp.set_start_method('spawn'), if __name__ == "__main__"), using MultiSyncDataCollector with the cuda device causes the program to freeze.
To Reproduce
Execution output:
Terminating the program gives this traceback:
Expected behavior
After printing "As you can see, spawning a single environment on the main process is absolutely unproblematic.", the program should proceed into the collector iterable and print "Hey Hey!!! :D" repeatedly.
System info
Describe the characteristic of your environment:
Additional context
The problem was encountered as part of an effort to spawn multiple environments on the GPU. Any pointers in this direction would be greatly appreciated.
Proof of issue with tensors
By adding a killswitch into env_fn in various positions, we can make the following observations:
Code (no tensor defined yet)
Result: Program crashes as expected when the child process hits the breakpoint.
Code: Insert CUDA tensor declaration in killswitch clause
Result: Program hangs indefinitely.
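For anyone who wants to check whether the hang can be triggered outside TorchRL, here is a minimal sketch of the same experiment using only the standard-library multiprocessing module with the spawn start method. It is not the original reproduction code: env_fn_probe and run_probe are hypothetical names, the 30-second timeout is an arbitrary assumption, and the probe falls back to the CPU when CUDA is unavailable. The parent waits for a message from the child and reports "hang" if none arrives in time.

```python
import multiprocessing as mp
import queue
import time


def env_fn_probe(result_queue, make_tensor):
    """Hypothetical stand-in for env_fn: optionally allocate a tensor, then report back."""
    if not make_tensor:
        result_queue.put("no tensor requested")
        return
    try:
        import torch  # imported inside the child so the parent never initialises CUDA
    except ImportError:
        result_queue.put("torch not installed")
        return
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tensor = torch.zeros(1, device=device)
    result_queue.put(f"tensor created on {tensor.device.type}")


def run_probe(make_tensor, timeout=30.0):
    """Run the probe in a spawned child; return its message, or 'hang' on timeout."""
    ctx = mp.get_context("spawn")
    result_queue = ctx.Queue()
    proc = ctx.Process(target=env_fn_probe, args=(result_queue, make_tensor))
    proc.start()
    deadline = time.monotonic() + timeout
    outcome = "hang"
    while time.monotonic() < deadline:
        try:
            outcome = result_queue.get(timeout=0.5)
            break
        except queue.Empty:
            if not proc.is_alive():
                # The child is gone; check once more for a late message.
                try:
                    outcome = result_queue.get(timeout=0.5)
                except queue.Empty:
                    outcome = f"child exited with code {proc.exitcode}"
                break
    if proc.is_alive():
        proc.terminate()  # best effort; a child stuck in the driver may ignore it
    proc.join(5)
    return outcome


if __name__ == "__main__":
    print("without tensor:", run_probe(make_tensor=False))
    print("with tensor:   ", run_probe(make_tensor=True))
```

If the second call reports a created CUDA tensor here but the collector still hangs, that would suggest the problem lies in how the collector sets up its workers rather than in CUDA tensor creation in spawned processes as such.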
PS
Since the error relates to tensors, would it be a good idea to rope in the PyTorch devs?
Checklist