feat(qwen35): derive scalars from weights, assert vs GGUF metadata #359

Merged: davide221 merged 1 commit into Luce-Org:main from dusterbloom:split/04-gguf-scalar-assert on Jun 10, 2026

@dusterbloom

Collaborator

Re-carved from #274 (commit 5819648), with the check DRY'd into a shared helper and the unit test the original lacked added.

After loading weights, the qwen35 target loader and the dflash draft loader derive head_dim/n_head/n_head_kv from the actual weight-tensor shapes and assert against the GGUF-declared hparams; on mismatch → set_last_error + return false at load time, making the "stale scalar at graph-build time" bug class structurally impossible. Load-time only, no runtime cost; well-formed GGUFs pass through unchanged.

DRY: pure verify_derived_scalars() in server/src/common/derived_scalars.h, unit-tested (13 cases). The qwen35 target Q-projection packs Q‖gate (ne[1] = n_head·n_embd_head_k·2, per the loader's own contract); the draft loader uses the standard n_head·head_dim. gemma4 has a similar inline block on a different (unpark) path, but it is a silent override rather than an assert; left as-is, noted in the header.
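A minimal sketch of the kind of pure check described above, for readers who want the arithmetic spelled out. This is not the code in derived_scalars.h: the struct names, the function name check_scalars_sketch, and the q_pack_factor parameter are all hypothetical. It also assumes head_dim is already known from some per-head tensor (for example a Q-norm weight); the PR text does not say which tensor the real helper reads it from.

```cpp
#include <cstdint>
#include <string>

// Hypothetical inputs a loader might gather from tensor shapes.
struct ShapeInputs {
    int64_t wq_out_rows;    // ne[1] of the Q projection
    int64_t wk_out_rows;    // ne[1] of the K projection
    int64_t head_dim;       // ASSUMPTION: read from a per-head tensor, not from metadata
    int64_t q_pack_factor;  // 2 when Q and gate are packed (qwen35 target), else 1
};

// Scalars as declared in the GGUF metadata.
struct DeclaredScalars {
    int64_t head_dim;
    int64_t n_head;
    int64_t n_head_kv;
};

// Pure check, no IO: derives n_head / n_head_kv from the shapes and compares
// them with the declared values. Returns true on agreement; on any mismatch
// returns false and writes a description into err.
inline bool check_scalars_sketch(const ShapeInputs& s,
                                 const DeclaredScalars& d,
                                 std::string& err) {
    if (s.head_dim <= 0 || s.q_pack_factor <= 0) {
        err = "head_dim and q_pack_factor must be positive";
        return false;
    }
    const int64_t q_rows_per_head = s.head_dim * s.q_pack_factor;
    if (s.wq_out_rows % q_rows_per_head != 0 || s.wk_out_rows % s.head_dim != 0) {
        err = "weight rows are not a multiple of head_dim";
        return false;
    }
    const int64_t n_head    = s.wq_out_rows / q_rows_per_head;
    const int64_t n_head_kv = s.wk_out_rows / s.head_dim;

    if (s.head_dim != d.head_dim) {
        err = "head_dim: derived " + std::to_string(s.head_dim) +
              ", declared " + std::to_string(d.head_dim);
        return false;
    }
    if (n_head != d.n_head) {
        err = "n_head: derived " + std::to_string(n_head) +
              ", declared " + std::to_string(d.n_head);
        return false;
    }
    if (n_head_kv != d.n_head_kv) {
        err = "n_head_kv: derived " + std::to_string(n_head_kv) +
              ", declared " + std::to_string(d.n_head_kv);
        return false;
    }
    err.clear();
    return true;
}
```

In this sketch a loader would pass q_pack_factor = 2 for the qwen35 target and 1 for the draft loader, then forward err to set_last_error and return false when the check fails. Keeping the function free of IO is what makes it straightforward to unit-test in isolation.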

Validation: helper unit-tested (13 cases); both modified loader TUs compile clean against the new call sites. Real-GGUF end-to-end load not yet exercised — this is a defensive load-time check that only fires on a genuine weight-shape↔metadata mismatch.

5 files, +266.

Load-time guard: after loading wq/wk, derive head_dim/n_head/n_head_kv from
tensor shapes and assert against GGUF-declared values; set_last_error+return
false on mismatch. Makes the stale-scalar-at-graph-build bug class impossible.

DRY: extracted verify_derived_scalars() pure helper into
server/src/common/derived_scalars.h (no IO, header-only); wired at both new
sites (draft loader layer 0, qwen35 target first full-attn layer). gemma4
inline block is a silent override not an assert; left as-is with comment.

Unit test: server/test/test_derived_scalars.cpp — 13 assertions, 0 failures.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor

Review completed

@davide221 davide221 merged commit 3e12dc9 into Luce-Org:main Jun 10, 2026
7 checks passed