fix: Enable RTR rewrite autoconf for QNN 6D Reshape models (Swin, Nougat) #336

Merged: ssss141414 merged 5 commits into main from fix/qnn-6d-reshape-autoconf on Apr 20, 2026

Conversation

@ssss141414

Contributor

Problem

Several models fail on QNN EP (NPU) because Swin's window partition creates 6D Reshape operations (e.g. [1, 8, 7, 8, 7, 192]), exceeding QNN's max tensor rank of 5D.

The codebase already has a ReshapeTransposeReshapeOverlyHighDimPattern rewrite that merges consecutive dimensions to reduce 6D→≤5D, but it was never auto-triggered due to two bugs in the autoconf pipeline.
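To make the dimension-merging idea concrete, here is a minimal sketch in plain Python. It is not the code of ReshapeTransposeReshapeOverlyHighDimPattern; the helper name `merge_adjacent_axes` is hypothetical, and the permutation `(0, 1, 3, 2, 4, 5)` is assumed to be the Transpose that follows the 6D Reshape in a Swin-style window partition. The idea: input axes that stay adjacent and in order through the Transpose can be collapsed into one axis without changing the data layout.

```python
from math import prod


def merge_adjacent_axes(shape, perm):
    """Collapse input axes that remain adjacent and in order after `perm`.

    Returns (merged_shape, merged_perm) describing an equivalent
    lower-rank Reshape -> Transpose pair. Hypothetical helper for illustration.
    """
    # Split perm into runs of consecutive input axes,
    # e.g. (0, 1, 3, 2, 4, 5) -> [[0, 1], [3], [2], [4, 5]].
    runs = [[perm[0]]]
    for axis in perm[1:]:
        if axis == runs[-1][-1] + 1:
            runs[-1].append(axis)
        else:
            runs.append([axis])

    # Runs sorted by their first input axis give the merged input shape.
    by_input = sorted(runs, key=lambda run: run[0])
    merged_shape = [prod(shape[a] for a in run) for run in by_input]

    # The merged permutation lists, in output order, which merged axis each run is.
    merged_perm = [by_input.index(run) for run in runs]
    return merged_shape, merged_perm


# 6D shape from the PR description; the permutation is an assumption.
merged_shape, merged_perm = merge_adjacent_axes([1, 8, 7, 8, 7, 192], [0, 1, 3, 2, 4, 5])
print(merged_shape, merged_perm)
```

Under these assumptions the 6D Reshape collapses to rank 4, which fits under the rank limit described above. The real rewrite may choose different axes to merge.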

Affected Models

  • microsoft/swin-large-patch4-window7-224 (image-classification)
  • facebook/nougat-base (image-to-text, vision-encoder-decoder)
  • naver-clova-ix/donut-base (image-to-text, vision-encoder-decoder)
  • PekingU/rtdetr_r50vd_coco_o365 (object-detection) — already worked, no 6D Reshapes

Root Cause

Bug 1: pattern_id mismatch

ReshapeTransposeReshapeOverlyHighDimPattern and LowDimPattern both inherit pattern_id from a shared PatternSchema (name="ReshapeTransposeReshapePattern"), producing SUBGRAPH/ReshapeTransposeReshapePattern instead of the class-specific IDs expected by information rules (e.g. SUBGRAPH/ReshapeTransposeReshapeOverlyHighDimPattern).
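The mismatch can be reproduced with a small stand-alone sketch. The classes `PatternSchema` and `BasePattern` below are simplified, hypothetical stand-ins rather than the project's real classes, and the `SUBGRAPH/<name>` format is taken from the IDs quoted above.

```python
class PatternSchema:
    """Hypothetical stand-in for the shared schema object."""

    def __init__(self, name: str) -> None:
        self.name = name


# Both pattern classes point at the same schema, as described above.
RTR_SCHEMA = PatternSchema(name="ReshapeTransposeReshapePattern")


class BasePattern:
    """Hypothetical base class: pattern_id is derived from the schema name."""

    schema: PatternSchema

    @property
    def pattern_id(self) -> str:
        return f"SUBGRAPH/{self.schema.name}"


class ReshapeTransposeReshapeOverlyHighDimPattern(BasePattern):
    schema = RTR_SCHEMA


class LowDimPattern(BasePattern):  # shortened name, as written in the PR text
    schema = RTR_SCHEMA


# The ID an information rule would be keyed on, per the description above.
expected_by_rule = "SUBGRAPH/ReshapeTransposeReshapeOverlyHighDimPattern"
actual = ReshapeTransposeReshapeOverlyHighDimPattern().pattern_id
rule_matches = actual == expected_by_rule
print(actual, rule_matches)
```

Because both subclasses report the schema's name, a rule keyed on the class-specific ID never fires.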

Bug 2: kebab/snake case key mismatch

The information rule emits "highdimRTR-lowdimRTR": true (kebab-case), but RewritePipe.build_config() looks up kwargs.get("highdimRTR_lowdimRTR") (snake_case via cap.python_name).
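A minimal sketch of why the lookup misses, assuming the options reach the builder as keyword arguments. The function `build_config` here is a simplified, hypothetical stand-in for `RewritePipe.build_config()`; only the two key spellings come from the PR.

```python
# Key as emitted by the information rule (kebab-case).
RULE_OPTIONS = {"highdimRTR-lowdimRTR": True}

# Key the builder looks up (snake_case python_name).
PYTHON_NAME = "highdimRTR_lowdimRTR"


def build_config(**kwargs):
    """Hypothetical stand-in: reads the option by its python_name only."""
    return {PYTHON_NAME: kwargs.get(PYTHON_NAME)}


# The kebab-case key is passed through, but the snake_case lookup finds nothing.
config = build_config(**RULE_OPTIONS)
print(config)
```

The option is present in the input yet silently dropped, so the rewrite is never enabled.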

Fix

  1. Override pattern_id in both OverlyHighDimPattern and LowDimPattern to return their class-specific IDs (uses the base class's designed extension point).

  2. Normalize keys in AnalysisResult.get_optimization_config() with key.replace("-", "_") — no-op for existing snake_case keys.
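The two fixes can be sketched together as follows. This is an illustration under assumptions, not the merged diff: `BasePattern` and `normalize_option_keys` are hypothetical names, and returning a literal string from the override is just one way to produce a class-specific ID.

```python
class BasePattern:
    """Hypothetical base class whose pattern_id defaults to the shared schema name."""

    schema_name = "ReshapeTransposeReshapePattern"

    @property
    def pattern_id(self) -> str:
        return f"SUBGRAPH/{self.schema_name}"


class ReshapeTransposeReshapeOverlyHighDimPattern(BasePattern):
    # Fix 1: override pattern_id so the class reports its own ID.
    @property
    def pattern_id(self) -> str:
        return "SUBGRAPH/ReshapeTransposeReshapeOverlyHighDimPattern"


def normalize_option_keys(options: dict) -> dict:
    """Fix 2: map kebab-case keys to snake_case; snake_case keys pass through unchanged."""
    return {key.replace("-", "_"): value for key, value in options.items()}


high_dim_id = ReshapeTransposeReshapeOverlyHighDimPattern().pattern_id
normalized = normalize_option_keys({"highdimRTR-lowdimRTR": True, "already_snake": 1})
print(high_dim_id, normalized)
```

Normalizing at the point where options are collected means a kebab-case key in any rule config is tolerated, at the cost of hiding the inconsistency in the config itself.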

Testing

Unit Tests

  • 3821 passed, 73 skipped (hardware-gated), 0 failures

Perf Verification (winml perf on Snapdragon X Elite NPU)

| Model | Task | Before | After | Latency |
| --- | --- | --- | --- | --- |
| PekingU/rtdetr_r50vd_coco_o365 | object-detection | ✅ NPU | ✅ NPU | 303ms |
| microsoft/swin-large-patch4-window7-224 | image-classification | ComposeGraph failed | NPU | 141ms |
| facebook/nougat-base | image-to-text | ComposeGraph failed | NPU | 459ms |
| naver-clova-ix/donut-base | image-to-text | ComposeGraph failed | ⚠️ FinalizeGraphs 6020 | Input 2560×1920 exceeds QNN DDR spill limits |

Fix two bugs preventing the ReshapeTransposeReshapeOverlyHighDimPattern
rewrite from being auto-triggered during the optimize-analyze loop:

1. pattern_id mismatch: ReshapeTransposeReshapeOverlyHighDimPattern and
   LowDimPattern both inherited pattern_id from shared schema, producing
   'SUBGRAPH/ReshapeTransposeReshapePattern' instead of the class-specific
   IDs expected by information rules. Added property overrides.

2. kebab/snake case mismatch: information rule emits 'highdimRTR-lowdimRTR'
   (kebab-case) but RewritePipe.build_config looks up python_name
   'highdimRTR_lowdimRTR' (snake_case). Added key.replace('-', '_')
   normalization in AnalysisResult.get_optimization_config().

Models affected: Swin, Donut, Nougat (vision-encoder-decoder family)
Root cause: Swin window partition creates 6D Reshape ops (e.g. [1,8,7,8,7,192])
which exceed QNN EP max tensor rank of 5D.

@ssss141414 requested a review from a team as a code owner on April 14, 2026 05:00
Comment thread src/winml/modelkit/pattern/transpose_patterns.py Outdated
)
assert len(htp_patterns) == 1, f"Expected 1 HTP pattern, got {len(htp_patterns)}"

# Check pattern IDs (note: multiple Pattern classes can share same pattern_id)

Contributor


What does "multiple Pattern classes can share same pattern_id" mean? I am unfamiliar with this.

Contributor Author


From Copilot:

Multiple Pattern classes can share the same pattern_id when they are semantically equivalent variants of the same subgraph. For example, Gelu1Pattern, Gelu2Pattern, Gelu3Pattern, and Gelu4Pattern all return "SUBGRAPH/GeluPattern".

Copilot AI requested a review from xieofxie April 14, 2026 05:14

if action_item.optimization_options:
    optim_options.update(action_item.optimization_options)
# Normalize kebab-case keys to snake_case (python_name)

Collaborator


Consider updating the original information config instead? Then we could get rid of this transform.

Contributor Author


Should we fix it in both the config and here? Normalizing here guards against a wrongly cased config...

@ssss141414 ssss141414 force-pushed the fix/qnn-6d-reshape-autoconf branch from 7dbef95 to 0e90915 Compare April 14, 2026 09:00
@ssss141414 ssss141414 merged commit 6e82bb0 into main Apr 20, 2026
9 checks passed
@ssss141414 ssss141414 deleted the fix/qnn-6d-reshape-autoconf branch April 20, 2026 02:34
4 participants