Qwen3.5 MoE: MLP experts silently skipped during abliteration (attention-only, no warning) #339

@rocker-zhang

When I run Heretic v1.3.0 on a Qwen3.5-122B-A10B model, the abliterable-components summary lists only attn.o_proj (48 modules). The MoE expert MLPs are never ablated, and nothing indicates that they were skipped.

The cause is in get_layer_modules (src/heretic/model.py:383-385). It iterates layer.mlp.experts, but in transformers 5.6 the Qwen3.5 MoE experts are a single fused nn.Parameter (Qwen3_5MoeExperts.down_proj), not an iterable of per-expert modules. Iterating over it (for expert in layer.mlp.experts) raises a TypeError, which suppress(Exception) swallows, so zero MLP modules are matched and the run proceeds attention-only.
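A minimal sketch of this failure mode, using plain-Python stand-ins rather than the real transformers or Heretic classes. The names FusedExperts, Mlp, Layer, and collect_mlp_modules are hypothetical and exist only for illustration; the point is just that iterating a non-iterable object inside suppress(Exception) yields an empty result without any error.

```python
from contextlib import suppress


class FusedExperts:
    """Hypothetical stand-in for a fused experts container.

    It holds one fused attribute for all experts and defines neither
    __iter__ nor __getitem__, so it cannot be iterated.
    """

    def __init__(self, num_experts, hidden):
        # Stand-in for a fused weight with a leading num_experts dimension.
        self.down_proj = [[0.0] * hidden for _ in range(num_experts)]


class Mlp:
    def __init__(self, experts):
        self.experts = experts


class Layer:
    def __init__(self, experts):
        self.mlp = Mlp(experts)


def collect_mlp_modules(layer):
    """Simplified imitation of the pattern described above (not Heretic's code)."""
    matched = []
    with suppress(Exception):
        # Raises TypeError for a non-iterable container; suppress() hides it.
        for expert in layer.mlp.experts:
            matched.append(expert.down_proj)
    return matched


layer = Layer(FusedExperts(num_experts=4, hidden=8))
matched = collect_mlp_modules(layer)
print(len(matched))  # 0: the TypeError was swallowed, so nothing was matched
```

Narrowing the suppressed exception type, or checking the container's type before iterating, would surface the mismatch instead of hiding it.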

I saw from #206/#207 that #187 gives basic (attention-only) Qwen3.5 support and that fused-expert ablation should avoid direct weight modification, with ARA (#211) in progress. However, #202 notes that MLP ablation is sometimes needed to bring refusals down, and a user on these models currently gets no signal that the MLP path was skipped.

Would a diagnostic-only warning (no weight modification) be welcome in the meantime? When a layer exposes an experts container but zero MLP modules are matched, Heretic would print a one-time "ablation will be attention-only" warning. Or will ARA make this moot soon enough that it is not worth adding? I can send a PR for the warning if useful.
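To make the proposal concrete, here is a rough sketch of what such a check could look like. It is not a patch against Heretic: the function name warn_if_mlp_skipped, the module-level flag, and the use of the standard warnings module are all assumptions for illustration, and the layers are SimpleNamespace stand-ins rather than real model layers.

```python
import warnings
from types import SimpleNamespace

_warned = False  # module-level flag so the warning is emitted only once


def warn_if_mlp_skipped(layer, matched_mlp_modules):
    """Return True if the layer has an experts container but no MLP modules matched.

    Hypothetical helper: emits a single warning the first time this happens.
    """
    global _warned
    mlp = getattr(layer, "mlp", None)
    has_experts = getattr(mlp, "experts", None) is not None
    skipped = has_experts and len(matched_mlp_modules) == 0
    if skipped and not _warned:
        warnings.warn(
            "Layer exposes mlp.experts but no MLP modules were matched; "
            "ablation will be attention-only.",
            stacklevel=2,
        )
        _warned = True
    return skipped


# Stand-in layers: one with an experts container, one without.
moe_layer = SimpleNamespace(mlp=SimpleNamespace(experts=object()))
dense_layer = SimpleNamespace(mlp=SimpleNamespace())

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Three MoE layers with no matched MLP modules, then one dense layer.
    results = [warn_if_mlp_skipped(moe_layer, []) for _ in range(3)]
    results.append(warn_if_mlp_skipped(dense_layer, []))

print(results, len(caught))
```

The check only reads attributes and the match count, so it stays diagnostic-only and does not touch any weights.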
