Using PyTorch image produces an argparse bug #2819
Unanswered
VictorJouault
asked this question in
Help
Replies: 0 comments
Hi,
I am trying to run a job on SageMaker that requires both MXNet (I am using GluonTS, which depends on MXNet) and PyTorch. However, whichever deep learning container image I select (from here), the job fails, and I have not been able to fix it.
In particular, when using a PyTorch container (either
763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.10.0-gpu-py38-cu113-ubuntu20.04-sagemaker
or
763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.9.1-gpu-py38-cu111-ubuntu20.04
), the following message is printed at the beginning of the script:
This seems to be causing a problem with the argument parser when the experiment script is called. The same call worked with another image, but with these images it no longer works and produces the error message below.
However, I am also unable to use the MXNet image because of a Horovod bug.
Any idea is appreciated, thanks!