Correct approach to calculate metrics in DDP setting #12602
Replies: 1 comment 7 replies
-
If your batch size is `N`, it will be `(2, N, ...)`.
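To make the shape in this reply concrete, here is a minimal sketch of the shape arithmetic only. It assumes the gather stacks one equally shaped tensor per process along a new leading axis, as the reply describes. The helper names `gathered_shape` and `flattened_shape` and the example sizes are hypothetical, and no distributed call is made.

```python
def gathered_shape(world_size, per_device_shape):
    """Shape after stacking one equally shaped tensor per process on a new leading axis."""
    return (world_size, *per_device_shape)


def flattened_shape(shape):
    """Merge the leading process axis into the batch axis."""
    world_size, batch, *rest = shape
    return (world_size * batch, *rest)


# Hypothetical sizes: 4 samples per GPU, C=3, H=W=32, 2 GPUs.
per_device = (4, 3, 32, 32)
stacked = gathered_shape(2, per_device)
merged = flattened_shape(stacked)

print(stacked)  # (2, 4, 3, 32, 32)
print(merged)   # (8, 3, 32, 32)
```

If a metric expects a plain batch dimension, the two leading axes would need to be merged as in `flattened_shape` before the metric is computed.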
-
In the case of DDP:

1. Should metrics be calculated in `validation_step`, or should they be calculated in `validation_step_end` after gathering the output tensors returned by `validation_step`?
2. If metrics are calculated in `validation_step`, would it be correct to take the mean of the corresponding metrics in `validation_step_end`, considering that the batch partitions for each device can be uneven?
3. Is it correct that `all_gather` on the output tensors inside `validation_step_end` adds an extra dimension before the batch dimension? For example, if my original batch tensor has the shape `N x C x H x W` and 2 GPUs are in use, will the tensor after `all_gather` have the shape `2 x M x C x H x W` (where `2M = N`)? What happens if the batch size (`N`) is an odd number?
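The concern about averaging per-device metrics over uneven partitions can be checked with plain arithmetic. The sketch below is framework-free and makes no Lightning or `torch.distributed` calls. The function names and the 3 + 2 split are hypothetical, chosen only to show that a mean of per-device means can differ from the mean over all samples.

```python
def mean_of_means(per_device_values):
    """Average each device's values first, then average those averages."""
    means = [sum(values) / len(values) for values in per_device_values]
    return sum(means) / len(means)


def pooled_mean(per_device_values):
    """Sum values and counts across devices, then divide once."""
    total = sum(sum(values) for values in per_device_values)
    count = sum(len(values) for values in per_device_values)
    return total / count


# Hypothetical per-sample scores: 5 samples split unevenly as 3 + 2.
shards = [[1.0, 1.0, 1.0], [0.0, 0.0]]

naive = mean_of_means(shards)  # (1.0 + 0.0) / 2
exact = pooled_mean(shards)    # 3.0 / 5

print(naive, exact)  # 0.5 0.6
```

The two results agree only when every device holds the same number of samples. Carrying a sum and a count per device and dividing once at the end avoids the discrepancy.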