Skip to content

fix: enable model analyzer to resolve DeepSeek-V4 and Qwen3.5 models#1130

Open
kwohting wants to merge 2 commits into
alibaba:mainfrom
kwohting:fix/model-analyzer-new-models
Open

fix: enable model analyzer to resolve DeepSeek-V4 and Qwen3.5 models#1130
kwohting wants to merge 2 commits into
alibaba:mainfrom
kwohting:fix/model-analyzer-new-models

Conversation

@kwohting

Copy link
Copy Markdown
  • Add support_model_types param to register_model() for co-located model_type registration
  • Register DeepseekV4ForCausalLM under deepseek3 (same DeepSeekV2 class)
  • Register both ForCausalLM and ForConditionalGeneration variants for Qwen3.5
  • Add fuzzy architecture matching (suffix swap) as fallback in get_ft_model_type_by_config
  • Add model_type field fallback for unrecognized architectures
  • Fix dtype-to-bytes mapping in hf_model_helper (wrong keys FP32/INT8 → correct F32/I8)
  • Handle nested text_config for hidden_size in multimodal model configs

@kwohting kwohting requested a review from LLLLKKKK as a code owner June 22, 2026 10:29
@CLAassistant

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


guoding.lgd seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

…3.5 models

- Fix dtype-to-bytes mapping (wrong keys FP32/INT8 → correct HF keys F32/I8)
- Add _resolve_ft_model_type with three-level fallback:
  1. ModelDict direct lookup (existing behavior)
  2. Suffix swap: ForCausalLM ↔ ForConditionalGeneration (endswith-based)
  3. model_type field mapping (deepseek_v4→deepseek3, qwen3_5_moe→qwen35_moe, etc.)
- Handle nested text_config for hidden_size in multimodal model configs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kwohting kwohting force-pushed the fix/model-analyzer-new-models branch from 101f380 to 55c4dc3 Compare June 22, 2026 10:34
@LLLLKKKK

Copy link
Copy Markdown
Collaborator

AI Code Review - PR #1130

Status: LGTM

Summary: P0/0 · P1/0 · P2/2 · P3/0

lgtm ready to ci

Non-blocking Suggestions

P2

  • 新增三级 model type 解析逻辑缺少单元测试 @ rtp_llm/tools/api/hf_model_helper.py:179
    • 建议:为 _resolve_ft_model_type 添加单元测试,覆盖:(1) ModelDict 直接命中;(2) fuzzy arch 命中(ForCausalLM↔ForConditionalGeneration);(3) model_type map 命中(deepseek_v4→deepseek3、qwen3_5→qwen35_dense);(4) 三级全 miss 返回 None。同时为 _DTYPE_BYTES 验证 F32→4 bytes、I8→1 byte、未知类型→2 bytes。
  • 跨模块访问私有 classmethod _resolve_ft_model_type @ rtp_llm/tools/api/model_basic_info_analyzer.py:31
    • 建议:将 _resolve_ft_model_type 重命名为 resolve_ft_model_type(去掉下划线前缀),或提供一个公开的 classmethod/staticmethod 包装。

Checklist Violations (4 fail / 56 total)

General Principles Checklist

  • [6.1] Architecture — 分层边界:新概念在正确层级,不泄漏内部 → issue 跨模块访问私有 classmethod _resolve_ft_model_type
    _model_basic_info_analyzer.py 调用 HfStyleModelInfo.resolve_ft_model_type(下划线前缀方法),跨模块访问内部方法。低影响但违反封装约定。
  • [6.1] Tests — 新逻辑有聚焦单测 + 相关集成/smoke 测试 → issue 新增三级 model type 解析逻辑缺少单元测试
    _新增 _resolve_ft_model_type(三级 fallback)、_fuzzy_match_architecture(后缀交换)、DTYPE_BYTES(14 种 dtype 映射)均无测试。rtp_llm/tools/api/ 目录下无已有测试文件。
  • [6.1] Tests — 边界 case 覆盖(空、单元素、最大值) → issue 新增三级 model type 解析逻辑缺少单元测试
    __fuzzy_match_architecture 的空 architectures 列表分支、DTYPE_BYTES.get 的 unknown dtype 默认值分支均无测试验证。

Python Static-First Checklist

  • [P.A] 静态结构与类型纪律 — 禁止 hasattr 做控制流分支 → checklist-only
    model_basic_info_analyzer.py:118-124 使用 hasattr(config, 'hidden_size') 和 hasattr(config, 'text_config') 做控制流。但这是 HF transformers PretrainedConfig 标准用法,Config 字段因模型而异无法预知,属于合理例外。

Strengths

  • _DTYPE_BYTES 修复了旧代码中 FP32/INT8 等 dtype 名称与 safetensors 标准不一致、FP32 被计为 2 字节等多个计算错误,用字典替代硬编码条件,覆盖了 F8/F64/BOOL 等新 dtype
  • 三级 fallback 设计(ModelDict → fuzzy arch → model_type map)保持了对已有模型的兼容性,仅在前级 miss 时触发后级,不会影响已有解析路径
  • text_config fallback 同时处理 dict 和 PretrainedConfig 对象两种形态,对多模态模型(如 VL 系列)的 hidden_size 解析更健壮

…l_type public

- Rename _resolve_ft_model_type → resolve_ft_model_type (public API)
- Add 18 unit tests covering:
  - Direct ModelDict match
  - Fuzzy arch match (ForCausalLM ↔ ForConditionalGeneration)
  - model_type map fallback (deepseek_v4, qwen3_5, etc.)
  - All-miss returns None
  - Edge cases (empty architectures, no suffix match, NextN not fuzzy-matched)
  - _DTYPE_BYTES correctness (F32→4, BF16→2, I8→1, unknown→2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@LLLLKKKK

Copy link
Copy Markdown
Collaborator

AI Code Review - PR #1130

Status: BLOCKING

Summary: P0/0 · P1/1 · P2/2 · P3/1

Blocking Issues

P1

  • 新增测试文件未注册到 Bazel BUILD,CI 不会自动运行 @ rtp_llm/test/hf_model_helper_test.py:1
    • 建议:在 rtp_llm/test/BUILD 中添加 py_test target。该测试通过 importlib 绕开 .so 依赖且 mock 了 huggingface_hub,不需要 GPU,可设置 exec_properties = {} 或 tags = ['no_gpu']。需要将 rtp_llm/model_factory_register.py 和 rtp_llm/tools/api/hf_model_helper.py 加入 data/deps。

Non-blocking Suggestions

P2

  • _MODEL_TYPE_MAP 维护性风险:新增模型需同步更新两处映射 @ rtp_llm/tools/api/hf_model_helper.py:152
    • 建议:添加简短注释说明此映射的维护要求,或在 resolve_ft_model_type docstring 中说明 fallback 策略。长期可考虑将 model_type 映射纳入 register_model() 注册流程。
  • 测试使用相对路径加载模块,Bazel sandbox 下会失败 @ rtp_llm/test/hf_model_helper_test.py:8
    • 建议:使用 os.path.join(os.path.dirname(file), '..', 'model_factory_register.py') 或通过 Bazel runfiles API 定位文件路径,避免 CWD 依赖。

P3

  • 未知 dtype 默认字节数从 1 变为 2,行为变更未说明 @ rtp_llm/tools/api/hf_model_helper.py:128
    • 建议:确认 HuggingFace safetensors 实际使用的 dtype key 集合是否已被 _DTYPE_BYTES 完全覆盖。如有遗漏(如 'FP16'、'INT8' 等非标准 key),补充到字典中以避免走默认值。

Checklist Violations (3 fail / 56 total)

General Principles Checklist

  • [6.1] Tests — 新逻辑有聚焦单测 + 相关集成/smoke 测试 → issue 新增测试文件未注册到 Bazel BUILD,CI 不会自动运行
    _新增 hf_model_helper_test.py 覆盖 resolve_ft_model_type、_fuzzy_match_architecture、DTYPE_BYTES,测试质量好。但未注册到 rtp_llm/test/BUILD,CI 不会执行。

Python Static-First Checklist

  • [P.A] 静态结构与类型纪律 — 禁止 hasattr 做控制流分支 → checklist-only
    model_basic_info_analyzer.py:118-124 使用 hasattr 检查 hidden_size 和 text_config。config 来自 transformers.AutoConfig.from_pretrained,属性集取决于具体模型 class,hasattr 是此场景标准做法,不升级为 issue。
  • [P.G] 测试规范 — 数据驱动测试用 pytest.mark.parametrize → checklist-only
    hf_model_helper_test.py 使用 unittest.TestCase 风格,多个 test_model_type_fallback* 和 test_fuzzy_match_* 方法可合并为 parametrize 形式。但 unittest 风格在本项目中常见且测试已覆盖充分,不升级。_

Strengths

  • _DTYPE_BYTES 修复了旧代码中 FP32 被错误按 2 字节、INT8 被错误按 2 字节计算的 bug,dtype 覆盖全面且使用 HF safetensors 标准命名
  • resolve_ft_model_type 三级 fallback(精确匹配 → suffix 模糊匹配 → model_type 映射)设计合理,向前兼容新模型变体(DeepSeek V4、Qwen 3.5),正确排除 NextN 后缀避免 MTP draft 模型误匹配
  • 测试覆盖全面,包含正向匹配、模糊匹配双向、model_type fallback、全部未命中、空 architectures、无 architectures key 等边界 case
  • text_config fallback 同时处理 dict 和 PretrainedConfig 对象两种形态,正确支持多模态模型的 hidden_size 解析

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants