# 🗺️ [26.04] Automodel Roadmap
This issue tracks the planned work items for Automodel. We'd love to hear from you — if you have feature requests, suggestions, or want to upvote an item, please comment below!
## Core Infrastructure

- **CLI Application for Job Execution**: Introduce a CLI tool to simplify launching and managing training/inference jobs, reducing boilerplate and configuration overhead.
- **Checkpointing Robustness & Speed**: Improve checkpoint save/restore reliability and performance: faster loads and saves, better tracking of key metrics in the test suite, reduced memory overhead for state adapters, and reduced storage overhead.
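To make the CLI item concrete, here is one possible shape for such a tool, sketched with `argparse`. The command name, subcommands, and flags are purely illustrative assumptions, not the actual Automodel CLI:

```python
import argparse

def build_parser():
    """Hypothetical job-launch CLI (illustrative only; the `automodel`
    program name, subcommands, and flags are assumptions)."""
    parser = argparse.ArgumentParser(prog="automodel")
    sub = parser.add_subparsers(dest="command", required=True)

    # `train` subcommand: launch a training job from a recipe file.
    train = sub.add_parser("train", help="launch a training job")
    train.add_argument("--config", required=True, help="path to a recipe/config file")
    train.add_argument("--nodes", type=int, default=1, help="number of nodes")

    # `infer` subcommand: launch an inference job.
    infer = sub.add_parser("infer", help="launch an inference job")
    infer.add_argument("--config", required=True, help="path to a recipe/config file")
    return parser
```

A tool along these lines would replace hand-written launch scripts with, e.g., `automodel train --config recipe.yaml --nodes 2`.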
## Model Zoo & Registry

- **Day-0 Model Zoo**: Continue supporting day-0 model releases with optimizations.
- **Model Capability Registry (with Testing & Docs Generation)**: Build a registry that maps each model to its supported capabilities (e.g., parallelism strategies, precision modes, sequence lengths) and automatically generates test coverage and documentation from it.
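A minimal sketch of what such a registry could look like. The schema, model names, and helper functions below are assumptions for illustration, not the planned Automodel design:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelCapabilities:
    # Field names are illustrative, not the actual registry schema.
    parallelism: tuple  # supported strategies, e.g. ("TP", "PP", "EP")
    precisions: tuple   # supported precision modes, e.g. ("bf16", "fp8")
    max_seq_len: int    # maximum supported sequence length

# Hypothetical entries: one dense and one MoE model.
REGISTRY = {
    "example-dense-8b": ModelCapabilities(("TP", "PP"), ("bf16", "fp8"), 32768),
    "example-moe-30b": ModelCapabilities(("TP", "PP", "EP"), ("bf16",), 16384),
}

def capability_matrix():
    """Render the registry as a markdown table (docs generation)."""
    lines = ["| Model | Parallelism | Precisions | Max seq len |",
             "|---|---|---|---|"]
    for name, caps in sorted(REGISTRY.items()):
        lines.append(f"| {name} | {', '.join(caps.parallelism)} "
                     f"| {', '.join(caps.precisions)} | {caps.max_seq_len} |")
    return "\n".join(lines)

def test_cases():
    """Enumerate (model, parallelism, precision) combos to drive tests."""
    return [(name, p, prec)
            for name, caps in sorted(REGISTRY.items())
            for p in caps.parallelism
            for prec in caps.precisions]
```

The point of a single source of truth is that the docs table and the parametrized test matrix can never drift out of sync with each other.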
## MoE (Mixture of Experts)

- **Lower Precision MoE (FP8 + NVFP4)**: Enable MoE training and inference at FP8 and NVFP4 precision for improved throughput and reduced memory footprint.
- **Hybrid Expert Parallelism (Hybrid-EP)**: Support hybrid expert parallelism strategies that combine EP with TP/PP for more flexible and efficient MoE scaling.
- **CUDAGraph + MoE**: Enable CUDAGraph capture for MoE models to reduce kernel launch overhead and improve end-to-end throughput.
- **PeFT MoE with TransformerEngine Experts**: Integrate TransformerEngine as the expert implementation backend for MoE.
## Vision-Language Models (VLM)

- **VLM Refactor**: Refactor the VLM architecture for cleaner abstractions, better modularity, and easier extension to new vision-language model families.
- **Packed Sequence Support for VLM**: Add packed/variable-length sequence support for VLM training to improve GPU utilization when inputs have heterogeneous lengths.
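To illustrate the packing idea behind that last item, here is a framework-agnostic sketch (not the Automodel implementation): variable-length sequences are concatenated into a single flat buffer with cumulative boundary offsets (the `cu_seqlens` convention used by variable-length attention kernels), so no compute is wasted on padding tokens:

```python
def pack_sequences(seqs, max_tokens):
    """Greedily pack variable-length sequences into flat buffers.

    Returns a list of (tokens, cu_seqlens) pairs. `cu_seqlens` holds the
    cumulative sequence boundaries inside the flattened token buffer.
    Note: a single sequence longer than max_tokens still gets its own
    (oversized) pack; a real loader would truncate or split it.
    """
    packs = []
    tokens, cu = [], [0]
    for seq in seqs:
        # Start a new pack if this sequence would overflow the budget.
        if tokens and len(tokens) + len(seq) > max_tokens:
            packs.append((tokens, cu))
            tokens, cu = [], [0]
        tokens = tokens + list(seq)
        cu = cu + [len(tokens)]
    if tokens:
        packs.append((tokens, cu))
    return packs
```

For example, packing sequences of lengths 3, 2, and 4 under a 6-token budget yields two packs: one holding both short sequences (boundaries `[0, 3, 5]`) and one holding the long sequence alone.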
## 💬 We Want Your Input!
Have a feature request or use case that isn't covered above? Please comment on this issue and let us know:
- What you'd like to see
- Why it matters for your workflow
- Any context (model family, scale, hardware, etc.)
We'll prioritize based on community feedback and engineering feasibility.