
feat: Add GLM 5 implementation#1372

Open
hemildesai wants to merge 3 commits into main from hemil/glm-5

Conversation

@hemildesai
Contributor

Summary

  • Add nemo_automodel/components/models/glm_moe_dsa/ for the GLM-MoE-DSA architecture (zai-org/GLM-5)
  • Reuses DeepseekV32MLA (MLA + DSA Indexer) from deepseek_v32/layers.py and follows the glm4_moe_lite pattern for Block/Model/ForCausalLM
  • Extends Glm4MoeStateDictAdapter with indexer-specific non-quantized key handling
  • Bumps transformers dependency to >=5.2.0 for GlmMoeDsaConfig support
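The indexer-specific non-quantized key handling mentioned above could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the key fragment, function names, and split logic are assumptions about how a state-dict adapter might route DSA indexer weights past quantization.

```python
# Hypothetical sketch of non-quantized key handling for DSA indexer
# weights. The ".self_attn.indexer." fragment and all names below are
# illustrative assumptions, not taken from nemo_automodel.

INDEXER_MARKER = ".self_attn.indexer."  # assumed checkpoint key fragment


def is_non_quantized_key(key: str) -> bool:
    """Return True for DSA indexer parameters, which stay unquantized."""
    return INDEXER_MARKER in key


def split_state_dict(state_dict: dict) -> tuple[dict, dict]:
    """Split a checkpoint into quantizable and pass-through groups."""
    quantized, passthrough = {}, {}
    for key, value in state_dict.items():
        target = passthrough if is_non_quantized_key(key) else quantized
        target[key] = value
    return quantized, passthrough
```

A real adapter would fold a check like this into its key-conversion loop so indexer tensors skip the dequantization path that MoE expert weights go through.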

Test plan

  • All imports verified (GlmMoeDsaForCausalLM, GlmMoeDsaStateDictAdapter)
  • Ruff lint and format pass
  • Existing unit tests pass (2964 passed, 16 pre-existing failures unrelated to this PR)

🤖 Generated with Claude Code

@copy-pr-bot

copy-pr-bot bot commented Feb 24, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners. Pull request vetters can view their responsibilities in the linked documentation; contributors can find more details about this message there as well.

hemildesai and others added 2 commits February 26, 2026 18:35
Add nemo_automodel/components/models/glm_moe_dsa/ with support for
the GLM-MoE-DSA architecture (zai-org/GLM-5). Reuses DeepseekV32MLA
for MLA+DSA indexer attention and follows the glm4_moe_lite pattern
for Block/Model/ForCausalLM structure. Bumps transformers to >=5.2.0
for GlmMoeDsaConfig support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
…pter

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>