
[REQUEST] Support Offload deepspeed engine in RLHF training #7013

Open
hijkzzz opened this issue Feb 7, 2025 · 3 comments
Labels
enhancement New feature or request

Comments

hijkzzz commented Feb 7, 2025

See vLLM sleep mode: vllm-project/vllm#11743
This feature is very important in RLHF, e.g. for the hybrid engine mode in OpenRLHF: OpenRLHF/OpenRLHF@c10f02b

@hijkzzz hijkzzz added the enhancement New feature or request label Feb 7, 2025
tjruwase (Contributor) commented Feb 7, 2025

@hijkzzz, can you be more specific about this request? DeepSpeed was probably the first framework to offer offloading in RLHF training:
https://github.com/deepspeedai/DeepSpeedExamples/blob/8075143d922e0a25c8217ed4f72ef7121cad423a/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/main.py#L237-L251

hijkzzz (Author) commented Feb 7, 2025

> @hijkzzz can you be more specific on this request. DeepSpeed is probably the first framework to offer offloading in RLHF training: https://github.com/deepspeedai/DeepSpeedExamples/blob/8075143d922e0a25c8217ed4f72ef7121cad423a/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/main.py#L237-L251

RLHF training needs to:
- offload the model and optimizer states to CPU while vLLM generates samples, so a larger inference batch size fits on the GPU;
- load the model and optimizer states back to GPU before training, to avoid PCIe data exchange during every training step;
i.e., the same behavior as vllm-project/vllm#11743
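The requested flow can be sketched with a toy engine that only tracks where its states live (all names here are hypothetical stand-ins, not a real framework API); real code would move actual tensors:

```python
# Toy illustration of the requested RLHF flow: offload model/optimizer
# states before generation, reload them before training.

class ToyEngine:
    """Stand-in for a ZeRO-style engine; tracks where each state lives."""

    def __init__(self):
        self.devices = {"params": "gpu", "grads": "gpu", "optimizer": "gpu"}

    def offload_states(self):
        # Free GPU memory so the inference engine can use a larger batch.
        self.devices = {k: "cpu" for k in self.devices}

    def reload_states(self):
        # Bring states back once, before training, instead of paging
        # them over PCIe during every optimizer step.
        self.devices = {k: "gpu" for k in self.devices}


def rlhf_iteration(engine, generate, train_step, prompts):
    engine.offload_states()
    samples = generate(prompts)   # rollout runs with most of the GPU free
    engine.reload_states()
    return train_step(samples)    # training runs with states on the GPU
```

During `generate` all states sit on CPU; by the time `train_step` runs they are back on GPU, which is exactly the two bullet points above.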

tjruwase (Contributor) commented Feb 7, 2025

@hijkzzz, it seems you are requesting on-demand and fine-grained offloading, is that correct? Would the following APIs work:
https://deepspeed.readthedocs.io/en/latest/zero3.html#offload-states
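The linked docs describe `offload_states()` / `reload_states()` on the ZeRO-3 engine. A hedged sketch of wrapping them around the generation phase (the context-manager helper is an assumption for illustration, not a DeepSpeed API; it works against any engine exposing those two methods):

```python
from contextlib import contextmanager


@contextmanager
def states_offloaded(engine):
    """Hypothetical helper: offload engine states on entry
    (ZeRO-3 engine.offload_states()), reload them on exit."""
    engine.offload_states()
    try:
        yield engine
    finally:
        # reload even if generation raises, so training can resume
        engine.reload_states()
```

Usage would then be `with states_offloaded(engine): samples = llm.generate(prompts)`, giving on-demand offloading scoped to the rollout.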

@tohtana, FYI
