Has anyone tried torch.cuda.amp?
It seems that ms_attention doesn't support fp16, even after I modified ms_deform_attn_forward_cuda.
Is there another way to enable AMP? Or is there any way to reduce GPU memory? I get a CUDA OOM error every time with bs=4.
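For reference, here is a minimal sketch of the usual workaround when a custom CUDA kernel only supports fp32: decorate the autograd Function with `torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)` so that autocast casts fp16 inputs back to fp32 before the kernel runs, instead of modifying the kernel itself. `MSDA` below is a placeholder for the compiled extension module, and the exact argument list is assumed from the Deformable-DETR-style signature, not taken from this repo verbatim:

```python
import torch
from torch.cuda.amp import custom_fwd, custom_bwd

class MSDeformAttnFunction(torch.autograd.Function):
    @staticmethod
    @custom_fwd(cast_inputs=torch.float32)  # autocast casts fp16 inputs to fp32 here
    def forward(ctx, value, spatial_shapes, level_start_index,
                sampling_locations, attention_weights, im2col_step):
        ctx.im2col_step = im2col_step
        # MSDA is assumed to be the compiled CUDA extension
        # (e.g. `import MultiScaleDeformableAttention as MSDA`).
        output = MSDA.ms_deform_attn_forward(
            value, spatial_shapes, level_start_index,
            sampling_locations, attention_weights, im2col_step)
        ctx.save_for_backward(value, spatial_shapes, level_start_index,
                              sampling_locations, attention_weights)
        return output

    @staticmethod
    @custom_bwd  # backward runs outside autocast, matching the fp32 forward
    def backward(ctx, grad_output):
        (value, spatial_shapes, level_start_index,
         sampling_locations, attention_weights) = ctx.saved_tensors
        grad_value, grad_sampling_loc, grad_attn_weight = \
            MSDA.ms_deform_attn_backward(
                value, spatial_shapes, level_start_index,
                sampling_locations, attention_weights,
                grad_output.contiguous(), ctx.im2col_step)
        # One gradient per forward input; non-tensor inputs get None.
        return grad_value, None, None, grad_sampling_loc, grad_attn_weight, None
```

With this wrapper, the rest of the model can run in mixed precision while the deformable-attention op stays in fp32, so no fp16 support is needed in the kernel.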
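On the OOM side, a generic option is gradient accumulation: run a smaller per-step batch and accumulate gradients to keep the same effective batch size. The sketch below combines it with autocast and GradScaler; all names (`model`, `criterion`, `loader`) are placeholders for illustration, not from this repo:

```python
import torch

model = torch.nn.Linear(256, 91).cuda()            # placeholder model
criterion = torch.nn.CrossEntropyLoss()            # placeholder loss
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

# Placeholder loader: batches of size 1 instead of 4 held in memory at once.
loader = [(torch.randn(1, 256).cuda(), torch.randint(0, 91, (1,)).cuda())
          for _ in range(8)]

accum_steps = 4                                    # effective batch size = 1 * 4
optimizer.zero_grad(set_to_none=True)
for i, (samples, targets) in enumerate(loader):
    with torch.cuda.amp.autocast():
        loss = criterion(model(samples), targets) / accum_steps
    scaler.scale(loss).backward()                  # accumulate scaled gradients
    if (i + 1) % accum_steps == 0:
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```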