RuntimeError: Expected to mark a variable ready only once - with .backward() in validation_step #13195
Unanswered
kampelmuehler asked this question in DDP / multi-GPU / multi-node
Replies: 0 comments
Hi all,
I'm getting
RuntimeError: Expected to mark a variable ready only once
when running a model on more than one GPU (this is my first time using Lightning with DDP). In my validation_step I need to loop through the model multiple times and call .backward() to get gradients, and this is what causes the error. The code looks something like this:
for this purpose I also set
It's exactly at
pred.sum().backward()
where the error is triggered. I've tried
find_unused_parameters=False
and a static graph is impossible, so the options the error message gives are exhausted. As mentioned earlier, it works fine on a single GPU. Any hints on what causes this behavior and how it can be fixed?
I would also be fine with running validation on a single GPU, but I found that to be impossible within Lightning.
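For reference, the pattern described above (several forward passes inside a validation step, each followed by pred.sum().backward()) can be sketched in plain PyTorch on a single process. This is only an illustration, not the original code: the model, the function name validation_step_sketch, the n_passes parameter, and the use of torch.enable_grad() are all assumptions, and the sketch does not involve DDP or reproduce the error.

```python
import torch
from torch import nn

# Hypothetical stand-in for the model; the original architecture is not shown.
model = nn.Linear(4, 2)


def validation_step_sketch(batch, n_passes=3):
    """Run several forward passes and call .backward() after each one."""
    grads = []
    # Assumption: autograd is re-enabled explicitly, because validation
    # normally runs with gradient tracking turned off.
    with torch.enable_grad():
        for _ in range(n_passes):
            # Fresh leaf tensor per pass so that x.grad holds only this pass.
            x = batch.detach().clone().requires_grad_(True)
            pred = model(x)
            pred.sum().backward()  # the call reported to fail under DDP
            grads.append(x.grad.detach().clone())
    return grads


batch = torch.randn(8, 4)
grads = validation_step_sketch(batch)
```

Note that each .backward() call here also accumulates into the parameters' .grad fields, not only into x.grad, which may be relevant when the same parameters are managed by a DDP wrapper.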