[26.04] AutoModel Roadmap #1438

@akoumpa

🗺️ [26.04] Automodel Roadmap

This issue tracks the planned work items for Automodel. We'd love to hear from you: if you have a feature request or suggestion, or want to upvote an item, please comment below!


Core Infrastructure

  • CLI Application for Job Execution
    Introduce a CLI tool to simplify launching and managing training/inference jobs, reducing boilerplate and configuration overhead.

  • Checkpointing Robustness & Speed
    Improve checkpoint save/restore reliability and performance. Goals include faster loads and saves, better tracking of key metrics in the test suite, lower memory overhead for state adapters, and lower storage overhead.


Model Zoo & Registry

  • Day-0 Model Zoo
    Continue supporting day-0 model releases with optimizations.

  • Model Capability Registry (with Testing & Docs Generation)
    Build a registry that maps each model to its supported capabilities (e.g., parallelism strategies, precision modes, sequence lengths) and automatically generates test coverage and documentation from it.
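
One way to picture the capability registry described above is a small table of per-model records from which both test cases and a support table are derived. The sketch below is illustrative only: the class, the function names, the model names, and every capability value are hypothetical placeholders, not Automodel's actual API or support matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCapabilities:
    """Hypothetical capability record for one model."""
    parallelism: tuple = ()
    precisions: tuple = ()
    max_seq_len: int = 0


# Hypothetical registry; the model names and values are placeholders.
REGISTRY = {
    "example-dense-7b": ModelCapabilities(("fsdp2", "tp"), ("bf16", "fp8"), 8192),
    "example-moe-30b": ModelCapabilities(("fsdp2", "ep"), ("bf16",), 4096),
}


def build_test_matrix(registry):
    """Expand the registry into (model, parallelism, precision) test cases."""
    return [
        (name, par, prec)
        for name, caps in sorted(registry.items())
        for par in caps.parallelism
        for prec in caps.precisions
    ]


def docs_table(registry):
    """Render the registry as a plain-text support table."""
    lines = ["model | parallelism | precision | max_seq_len"]
    for name, caps in sorted(registry.items()):
        lines.append(
            f"{name} | {', '.join(caps.parallelism)} | "
            f"{', '.join(caps.precisions)} | {caps.max_seq_len}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    for case in build_test_matrix(REGISTRY):
        print(case)
    print(docs_table(REGISTRY))
```

Keeping a single source of truth means a newly declared capability shows up in both the test matrix and the generated docs without a second edit.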


MoE (Mixture of Experts)

  • Lower Precision MoE (FP8 + NVFP4)
    Enable MoE training and inference at FP8 and NVFP4 precision for improved throughput and reduced memory footprint.

  • Hybrid Expert Parallelism (Hybrid-EP)
    Support hybrid expert parallelism strategies that combine expert parallelism (EP) with tensor and pipeline parallelism (TP/PP) for more flexible and efficient MoE scaling.

  • CUDAGraph + MoE
    Enable CUDAGraph capture for MoE models to reduce kernel launch overhead and improve end-to-end throughput.

  • PEFT MoE with TransformerEngine Experts
    Integrate TransformerEngine as the expert implementation backend for MoE, including for PEFT workflows.
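
To give a rough sense of the memory savings the lower-precision item is aiming for, the sketch below estimates expert weight storage for one MoE layer at three precisions. It is a back-of-the-envelope calculation under stated assumptions, not a measurement: NVFP4 is assumed to store 4-bit values plus one 8-bit scale per block of 16 elements, each expert is assumed to be a gated MLP with three weight matrices, and the layer dimensions are made up for illustration.

```python
# Bits stored per weight element. NVFP4 is assumed to be 4-bit values plus one
# 8-bit scale shared by each block of 16 elements; per-tensor scales are ignored.
BITS_PER_ELEMENT = {
    "bf16": 16.0,
    "fp8": 8.0,
    "nvfp4": 4.0 + 8.0 / 16.0,
}


def expert_weight_bytes(num_experts, hidden, ffn_hidden, precision, num_matrices=3):
    """Estimate weight bytes for the experts of one MoE layer.

    Assumes each expert has `num_matrices` weight matrices of shape
    (hidden, ffn_hidden); biases and router weights are ignored.
    """
    elements = num_experts * num_matrices * hidden * ffn_hidden
    return elements * BITS_PER_ELEMENT[precision] / 8.0


if __name__ == "__main__":
    # Hypothetical layer: 64 experts, hidden size 2048, expert FFN size 1024.
    for precision in ("bf16", "fp8", "nvfp4"):
        mib = expert_weight_bytes(64, 2048, 1024, precision) / 2**20
        print(f"{precision}: {mib:.0f} MiB")
```

For this hypothetical layer the estimate is 768 MiB at BF16, 384 MiB at FP8, and 216 MiB at NVFP4. Activations, optimizer state, and any higher-precision master weights are not counted.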


Vision-Language Models (VLM)

  • VLM Refactor
    Refactor the VLM architecture for cleaner abstractions, better modularity, and easier extension to new vision-language model families.

  • Packed Sequence Support for VLM
    Add packed/variable-length sequence support for VLM training to improve GPU utilization when inputs have heterogeneous lengths.
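
The utilization gain from packing can be sketched with a simple first-fit-decreasing packer. This is a minimal illustration, not the planned implementation: the function names are hypothetical, sequences are plain token lists (in a VLM they would also include image tokens), and attention masking between packed sequences is not shown. The cumulative-boundary list follows the convention commonly used by variable-length attention kernels.

```python
def pack_sequences(sequences, max_len):
    """Greedily pack token sequences into rows of at most `max_len` tokens.

    Returns a list of (tokens, cu_seqlens) pairs, where cu_seqlens holds the
    cumulative boundaries of the original sequences inside the packed row.
    """
    packs = []
    # Longest first, so large sequences claim rows before small ones fill gaps.
    for seq in sorted(sequences, key=len, reverse=True):
        if len(seq) > max_len:
            raise ValueError("sequence longer than max_len")
        for tokens, cu_seqlens in packs:
            if len(tokens) + len(seq) <= max_len:
                tokens.extend(seq)
                cu_seqlens.append(len(tokens))
                break
        else:
            # No existing row has room, so start a new one.
            packs.append((list(seq), [0, len(seq)]))
    return packs


def utilization(sequences, rows, max_len):
    """Fraction of token slots holding real tokens rather than padding."""
    return sum(len(s) for s in sequences) / (rows * max_len)


if __name__ == "__main__":
    seqs = [[1] * 5, [2] * 3, [3] * 2, [4] * 7, [5] * 1]
    packs = pack_sequences(seqs, max_len=8)
    print("padded utilization:", utilization(seqs, len(seqs), 8))
    print("packed utilization:", utilization(seqs, len(packs), 8))
    print("boundaries:", [cu for _, cu in packs])
```

In this toy batch, padding each of the five sequences to length 8 fills 45% of the token slots, while packing them into three rows fills 75%.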


💬 We Want Your Input!

Have a feature request or use case that isn't covered above? Please comment on this issue and let us know:

  1. What you'd like to see
  2. Why it matters for your workflow
  3. Any context (model family, scale, hardware, etc.)

We'll prioritize based on community feedback and engineering feasibility.
