[Roadmap] DeepSpeed Roadmap Q1 2025 #6946

Open · 1 of 2 tasks
loadams opened this issue Jan 13, 2025 · 5 comments

@loadams (Collaborator) commented Jan 13, 2025

This is a living document! For each item here, we intend to link the PR/issue for discussion.

This is DeepSpeed's first attempt at a public roadmap and will be updated with additional details.

loadams added the "roadmap" label Jan 13, 2025
loadams pinned this issue Jan 16, 2025
@zhaoyang-star commented

Will Multi-Token Prediction, as introduced in DeepSeek-V3, be added to the Q1 roadmap?

@shiyongde commented

We need FP8 training support for DeepSeek-MoE.

@hijeffwu commented

Please add plug-in support for different accelerators.

@loadams (Collaborator, Author) commented Mar 12, 2025

@hijeffwu - could you clarify more on what you're requesting? Different accelerators are already supported in DeepSpeed.

@hijeffwu commented

> @hijeffwu - could you clarify more on what you're requesting? Different accelerators are already supported in DeepSpeed.

My idea is as follows:

The current process for adding support for a new accelerator involves creating a new xxx_accelerator.py file in the accelerators directory and adding a product-specific directory under DeepSpeed/op_builder to adapt kernels for the different chips (see the sketch below). However, this architecture lacks a unified backend for the kernel code of the different chips.
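For concreteness, here is a minimal sketch of what that per-vendor plug-in looks like today. The base class DeepSpeedAccelerator lives in deepspeed/accelerator/abstract_accelerator.py; the device name "mychip", the backend name "mccl", and the op_builder path below are placeholder assumptions, and a real plug-in must implement the full abstract interface, not just these three methods.

```python
# my_accelerator.py - hypothetical plug-in under deepspeed/accelerator/.
# A minimal sketch only: "mychip" and "mccl" are placeholder names, and the
# real DeepSpeedAccelerator base class requires many more methods.
from deepspeed.accelerator.abstract_accelerator import DeepSpeedAccelerator


class MyChipAccelerator(DeepSpeedAccelerator):
    def __init__(self):
        self._name = "mychip"                      # torch device string, e.g. "mychip:0"
        self._communication_backend_name = "mccl"  # vendor collective backend (placeholder)

    def device_name(self, device_index=None):
        # Mirror the cuda/xpu convention: "mychip" or "mychip:<index>"
        if device_index is None:
            return self._name
        return "{}:{}".format(self._name, device_index)

    def communication_backend_name(self):
        return self._communication_backend_name

    def op_builder_dir(self):
        # Points DeepSpeed at the vendor-specific kernel builders, i.e. the
        # product-specific directory added under op_builder/ described above
        return "op_builder.mychip"
```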

Since the primary difference among AI chip vendors' support for DeepSpeed lies in their kernel implementations, would it be possible to use "deepspeed-kernels" as the unified kernel backend for DeepSpeed, while retaining only Python code in the main DeepSpeed repository? This approach would be analogous to the Megatron-LM + Apex + TransformerEngine split, making DeepSpeed more adaptable to diverse AI chip backends.

Key points in this proposal:

  1. Vendor Flexibility: Chip manufacturers could contribute optimized kernels to deepspeed-kernels without modifying core DeepSpeed code.
  2. Maintainability: Simplifies codebase management by isolating low-level optimizations.
  3. Cross-Platform Compatibility: Similar to how TransformerEngine abstracts NVIDIA-specific optimizations.

This architecture aligns with observed practice in adapting DeepSpeed to non-NVIDIA hardware; a rough dispatch sketch follows below.
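To make the proposal concrete, here is a rough sketch of the dispatch layer it implies, assuming a hypothetical vendor-namespaced deepspeed_kernels package (no such package layout exists today; those names are illustrative only). The fallback path uses the existing get_accelerator().create_op_builder() API, which resolves a builder class name such as "FusedAdamBuilder" to the in-tree JIT-compiled op.

```python
# Sketch of the proposed kernel dispatch, assuming a hypothetical
# vendor-namespaced "deepspeed_kernels" package (illustrative names only).
import importlib

from deepspeed.accelerator import get_accelerator


def load_kernel(builder_name: str, vendor: str):
    """Prefer a prebuilt kernel from the vendor's deepspeed-kernels backend;
    fall back to DeepSpeed's current in-tree JIT op_builder path."""
    try:
        # Hypothetical layout: deepspeed_kernels.nvidia, deepspeed_kernels.mychip, ...
        backend = importlib.import_module("deepspeed_kernels.{}".format(vendor))
        return getattr(backend, builder_name)
    except (ImportError, AttributeError):
        # Today's behavior: resolve and JIT-build the op from the in-tree sources
        return get_accelerator().create_op_builder(builder_name).load()


# e.g. fused_adam = load_kernel("FusedAdamBuilder", "nvidia")
```

Under this split, the main repository would keep only the Python-side dispatch above, while each vendor ships and maintains its own deepspeed_kernels backend independently.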
