Add Qwen 3.5 TB5 TP proof suite #42

Open
jjang-ai wants to merge 1 commit into main from qwen35-tb5-tp-proof-clean

@jjang-ai

Contributor

Summary

  • add a standalone Qwen 3.5 TP proof runner for local kernel smoke and model/cache decode rows
  • add a conservative Qwen-family sharding plan for attention plus dense/shared projection modules
  • add a TB5/RDMA proof script and docs covering peer smoke, MLX collective smoke, model load/decode, prefix cache, disk L2, and TurboQuant KV evidence

Verification

  • DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer QWEN35_TP_ARTIFACT_DIR=/tmp/vmlx-qwen35-tb5-tp-clean-smoke-20260610-233638 scripts/vmlx-qwen35-tb5-tp-proof.sh
  • artifact: /tmp/vmlx-qwen35-tb5-tp-clean-smoke-20260610-233638/SUMMARY.json
  • result: PARTIAL_NO_MODEL, metal_status=ok, collective_status=ok
  • earlier source gate (run on a dirty worktree): DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer swift test --filter TensorDataPlanePolicyTests passed 6/6
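
For reviewers who want to check the artifact programmatically, here is a minimal sketch of reading the summary file. It assumes SUMMARY.json is a flat JSON object whose keys are `result`, `metal_status`, and `collective_status`, as the result line above suggests; the actual schema may differ, and `read_proof_summary` is a hypothetical helper, not part of this PR.

```python
import json
from pathlib import Path


def read_proof_summary(path):
    """Interpret a proof-run summary file.

    Assumption: the file is a flat JSON object with the keys
    'result', 'metal_status' and 'collective_status'.
    """
    data = json.loads(Path(path).read_text())
    # Both local smoke stages must report "ok".
    smoke_ok = all(
        data.get(key) == "ok" for key in ("metal_status", "collective_status")
    )
    # PARTIAL_NO_MODEL means the model/cache rows did not run.
    model_rows_skipped = data.get("result") == "PARTIAL_NO_MODEL"
    return {"smoke_ok": smoke_ok, "model_rows_skipped": model_rows_skipped}
```

For the run reported above, this would flag the smoke stages as passing while marking the model rows as skipped, which matches the PARTIAL_NO_MODEL result.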

Boundaries

  • no exact local Qwen 3.5 model bundle was found, so model load/decode/cache proof remains blocked until QWEN35_TP_MODEL is set
  • the local smoke run is single-rank and does not prove real TB5 RDMA; a real TB5 row still requires two Thunderbolt peers, JACCL/RDMA availability, and matching model bundles
  • Qwen 3.5 GatedDelta/SSM and routed SwitchGLU experts are replicated pending companion-cache and expert-sharding parity proof
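
The shard-versus-replicate split described here can be illustrated with a small sketch. This is not the plan implemented in the PR: the module-path patterns below are hypothetical stand-ins for the attention, dense/shared projection, GatedDelta/SSM, and routed SwitchGLU modules, and `plan_for` is an illustrative name.

```python
from fnmatch import fnmatch

# Hypothetical module-path patterns; the real names in the plan may differ.
SHARDED = (
    "*.self_attn.*_proj",      # attention projections
    "*.mlp.*_proj",            # dense MLP projections
    "*.shared_expert.*_proj",  # shared-expert projections
)
REPLICATED = (
    "*.linear_attn.*",  # stand-in for GatedDelta/SSM blocks
    "*.switch_mlp.*",   # stand-in for routed SwitchGLU experts
)


def plan_for(module_path):
    """Return 'shard' or 'replicate' for one module path."""
    # Replication rules are checked first, so unproven modules stay whole.
    if any(fnmatch(module_path, pattern) for pattern in REPLICATED):
        return "replicate"
    if any(fnmatch(module_path, pattern) for pattern in SHARDED):
        return "shard"
    # Conservative default: anything unrecognized is replicated.
    return "replicate"
```

Checking the replication patterns first and defaulting to replication keeps the sketch conservative: a module is only sharded when it is explicitly listed and not covered by a replication rule.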
