Add Qwen 3.5 TB5 TP proof suite by jjang-ai · Pull Request #42 · osaurus-ai/vmlx-swift

jjang-ai · 2026-06-11T06:39:09Z

Summary

add a standalone Qwen 3.5 TP proof runner for local kernel smoke and model/cache decode rows
add a conservative Qwen-family sharding plan for attention plus dense/shared projection modules
add a TB5/RDMA proof script and docs covering peer smoke, MLX collective smoke, model load/decode, prefix cache, disk L2, and TurboQuant KV evidence

DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer QWEN35_TP_ARTIFACT_DIR=/tmp/vmlx-qwen35-tb5-tp-clean-smoke-20260610-233638 scripts/vmlx-qwen35-tb5-tp-proof.sh
artifact: /tmp/vmlx-qwen35-tb5-tp-clean-smoke-20260610-233638/SUMMARY.json
result: PARTIAL_NO_MODEL, metal_status=ok, collective_status=ok
earlier source gate (run on a dirty worktree): DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer swift test --filter TensorDataPlanePolicyTests passed 6/6
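
For reviewers who want to gate on the artifact rather than read it by eye, the check could be sketched roughly as below. This is a minimal sketch under a stated assumption: it presumes SUMMARY.json exposes `result`, `metal_status`, and `collective_status` as top-level string fields, which the result line above suggests but this PR does not document. The helper names `summary_field` and `smoke_rows_ok` are hypothetical and are not part of the proof script.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the SUMMARY.json schema is assumed, not confirmed by the PR.

# Print the value of a top-level string field from a flat JSON file.
# Uses sed only, so it does not depend on jq being installed.
summary_field() {
  local file="$1" key="$2"
  sed -n "s/.*\"${key}\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$file" | head -n 1
}

# Succeed only when both the Metal kernel smoke and the collective smoke report "ok".
smoke_rows_ok() {
  local file="$1"
  [ "$(summary_field "$file" metal_status)" = "ok" ] &&
    [ "$(summary_field "$file" collective_status)" = "ok" ]
}
```

A call such as `smoke_rows_ok "$QWEN35_TP_ARTIFACT_DIR/SUMMARY.json" && summary_field "$QWEN35_TP_ARTIFACT_DIR/SUMMARY.json" result` would then print the overall result only when both smoke rows passed. The sed pattern handles flat string fields only; nested objects would need a real JSON parser.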

no exact local Qwen 3.5 model bundle was found, so model load/decode/cache proof remains blocked until QWEN35_TP_MODEL is set
local smoke is single-rank and does not prove real TB5 RDMA; a real TB5 row still requires two Thunderbolt peers, JACCL/RDMA availability, and matching model bundles
Qwen 3.5 GatedDelta/SSM and routed SwitchGLU experts stay replicated on every rank rather than sharded, pending companion-cache and expert-sharding parity proof
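
The first two limitations amount to a preflight decision about how much a given run can prove. The sketch below illustrates that decision only; it is not the logic of scripts/vmlx-qwen35-tb5-tp-proof.sh. It assumes the model bundle is a directory, and `proof_scope`, `LOCAL_SINGLE_RANK`, and `TB5_CANDIDATE` are hypothetical names introduced here. Only `PARTIAL_NO_MODEL` comes from the reported result. It also does not check JACCL/RDMA availability or whether the bundles match across peers.

```shell
#!/usr/bin/env bash
# Hypothetical preflight; labels other than PARTIAL_NO_MODEL are invented for illustration.

# Arguments: model bundle path (may be empty), number of reachable Thunderbolt peers.
proof_scope() {
  local model="${1:-}" peers="${2:-0}"
  if [ -z "$model" ] || [ ! -d "$model" ]; then
    # No usable bundle: only kernel and collective smoke rows can run.
    echo "PARTIAL_NO_MODEL"
  elif [ "$peers" -lt 2 ]; then
    # Bundle present but fewer than two peers: single-rank rows only.
    echo "LOCAL_SINGLE_RANK"
  else
    # Bundle and two peers present: a real TB5 row can at least be attempted.
    echo "TB5_CANDIDATE"
  fi
}
```

For example, `proof_scope "${QWEN35_TP_MODEL:-}" 1` prints `PARTIAL_NO_MODEL` when the variable is unset or does not point at an existing directory, which matches the state described in this PR.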

Add Qwen35 TB5 TP proof suite

6e48011