RLHF training needs:
- Offload the model and optimizer to CPU while vLLM generates samples, so the freed GPU memory allows a larger inference batch size.
- Load the model and optimizer back onto the GPU during training, so each training step does not incur PCIe data exchange (see the sketch below).
This is the same idea as vLLM's sleep mode: vllm-project/vllm#11743
This feature is very important for RLHF, e.g. the hybrid engine mode in OpenRLHF: OpenRLHF/OpenRLHF@c10f02b
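
A minimal sketch of the requested offload/reload cycle in plain PyTorch (the helper names are hypothetical and not taken from any of the projects referenced above):

```python
import torch


def offload_to_cpu(model: torch.nn.Module, optimizer: torch.optim.Optimizer) -> None:
    """Move model parameters and optimizer state to CPU before generation."""
    model.to("cpu")
    for state in optimizer.state.values():
        for key, value in state.items():
            if torch.is_tensor(value):
                state[key] = value.to("cpu", non_blocking=True)
    # Release cached blocks so the inference engine can use the freed GPU memory.
    torch.cuda.empty_cache()


def load_to_gpu(model: torch.nn.Module, optimizer: torch.optim.Optimizer,
                device: str = "cuda") -> None:
    """Move model parameters and optimizer state back to the GPU for training."""
    model.to(device)
    for state in optimizer.state.values():
        for key, value in state.items():
            if torch.is_tensor(value):
                state[key] = value.to(device, non_blocking=True)
```

In an RLHF loop this would run as: `offload_to_cpu(...)` → generate rollouts with vLLM → `load_to_gpu(...)` → training step. vLLM's sleep mode (the PR referenced above) addresses the inference-engine side of the same problem by freeing its GPU memory between generation phases.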