bug(export): _normalize_exported_model overwrites export.onnx with duplicate node metadata_props after ORT node fusion #696

Description

@DingmaomaoBJTU

Summary

_normalize_exported_model (introduced in #681) runs ORT graph optimization on an already-tagged ONNX model. ORT's node fusion duplicates winml.hierarchy.* keys in the fused node's metadata_props, and the resulting invalid model is then copied back over the original export.onnx. The subsequent main optimize step fails with ModelValidationError: duplicate keys in metadata_props.

Context

PR #681 added a post-export normalization step (_normalize_exported_model) inside export_pytorch() to run shape inference on freshly exported models. However, by the time normalization runs, HTPExporter._embed_tags_in_onnx has already injected winml.hierarchy.tag and winml.hierarchy.depth into every ONNX node's metadata_props. When ORT's graph optimizer fuses multiple nodes (e.g. Gelu fusion, MatMul+Add fusion), it merges the source nodes' metadata_props into the fused node, creating duplicate keys. The ONNX spec forbids duplicate keys in metadata_props (model-level and node-level), so the next call to onnx.checker.check_model raises an error.
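
The merge behavior described above can be illustrated without ONNX or ORT. The sketch below is a simplified stand-in, not ORT's actual fusion code: metadata_props is modeled as a list of (key, value) pairs, `fuse_metadata` is a hypothetical helper that concatenates the source nodes' entries the way this issue observes ORT doing, and the tag values are made up.

```python
from collections import Counter

# Stand-in for a node's metadata_props: an ordered list of (key, value) pairs.
# ONNX stores these as repeated key/value entries, so the same key can
# physically appear more than once even though the checker rejects it.
MetadataProps = list[tuple[str, str]]


def fuse_metadata(*sources: MetadataProps) -> MetadataProps:
    """Hypothetical model of the observed fusion behavior: concatenate all
    source nodes' entries into the fused node without deduplicating."""
    fused: MetadataProps = []
    for props in sources:
        fused.extend(props)
    return fused


def duplicate_keys(props: MetadataProps) -> list[str]:
    """Return keys that occur more than once, in first-seen order."""
    counts = Counter(key for key, _ in props)
    return [key for key, n in counts.items() if n > 1]


# Two tagged nodes that a MatMul+Add fusion would merge (values are made up).
matmul = [("winml.hierarchy.tag", "/encoder/layer.0/dense"),
          ("winml.hierarchy.depth", "3")]
add = [("winml.hierarchy.tag", "/encoder/layer.0/dense"),
       ("winml.hierarchy.depth", "3")]

fused = fuse_metadata(matmul, add)
print(duplicate_keys(fused))
```

Each source node is valid on its own; only the fused node ends up with both keys repeated, which is the condition the checker rejects.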

Observed on BAAI/bge-large-en-v1.5 (BERT, feature-extraction), which triggers both Gelu and MatMul+Add fusions.

Current State

Failing call chain:

perf.py:1369  perf()
perf.py:290   benchmark.run()
perf.py:354   _load_model()
auto.py:424   WinMLAutoModel.from_pretrained() → build_hf_model()
hf.py:271     run_optimize_analyze_loop(model_path=export_path, ...)
common.py:83  optimize_onnx(model=export_path, ...)
api.py:234    _load_model(export_path)
api.py:68     → raise ModelValidationError(
                  "Failed to load ONNX model",
                  "Validation error: Your model has duplicate keys in metadata_props.")

Root-cause sequence:

  1. exporter.py:594-595: _embed_tags_in_onnx adds winml.hierarchy.tag and winml.hierarchy.depth to every graph node, then saves to export.onnx.
  2. pytorch.py:148: _normalize_exported_model(output_path) calls optimize_onnx(model=output_path, output=tmp_path).
  3. optim/api.py:234: _load_model(output_path) passes (onnx.checker finds no duplicates yet).
  4. optim/pipes/graph.py:533,571,587: ORTGraphPipe.process saves the model, creates an ort.InferenceSession (triggering ORT graph optimizations including Gelu/MatMul+Add fusions), then reloads with validate=False. ORT copies all source-node metadata_props into the fused node → duplicate winml.hierarchy.tag keys on fused nodes. The duplicates are invisible here because validation is skipped.
  5. optim/api.py:276: _hack_inject_quant_preprocess_metadata calls onnx.helper.set_model_props, which only cleans model-level metadata_props; node-level duplicates survive.
  6. optim/api.py:284: save_onnx(optimized_model, tmp_path) writes the broken model to tmp_path.
  7. pytorch.py:149: copy_onnx_model(tmp_path, output_path) overwrites export.onnx with the broken model.
  8. Main build: run_optimize_analyze_loop tries to load export.onnx with validate=True → crash.
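
To confirm at which step the duplicates first appear, a small diagnostic that scans node-level metadata_props can be run on the intermediate files. The following is a minimal sketch under stated assumptions: `Entry`, `Node`, `Graph`, and `Model` are hypothetical stand-ins that only mirror the attribute names of the ONNX protobufs so the example runs without onnx installed, and `find_duplicate_node_props` is an illustrative helper, not part of the codebase. It inspects top-level graph nodes only and does not descend into subgraphs.

```python
from collections import Counter
from dataclasses import dataclass, field


# Stand-ins mirroring ModelProto.graph.node[*].metadata_props[*].key.
@dataclass
class Entry:
    key: str
    value: str


@dataclass
class Node:
    name: str
    op_type: str
    metadata_props: list = field(default_factory=list)


@dataclass
class Graph:
    node: list = field(default_factory=list)


@dataclass
class Model:
    graph: Graph
    metadata_props: list = field(default_factory=list)


def find_duplicate_node_props(model) -> dict:
    """Map node name -> sorted duplicate keys (top-level graph nodes only)."""
    report = {}
    for node in model.graph.node:
        counts = Counter(entry.key for entry in node.metadata_props)
        dupes = sorted(key for key, n in counts.items() if n > 1)
        if dupes:
            report[node.name] = dupes
    return report


TAG = "winml.hierarchy.tag"
model = Model(
    graph=Graph(node=[
        Node("gelu_fused", "Gelu", [Entry(TAG, "/a"), Entry(TAG, "/a")]),
        Node("softmax", "Softmax", [Entry(TAG, "/b")]),
    ]),
    # Model-level props are clean here, which is why a model-level cleanup
    # such as the one in step 5 leaves the node-level problem untouched.
    metadata_props=[Entry("producer", "example")],
)
print(find_duplicate_node_props(model))
```

Because the helper only reads `graph.node`, `name`, and `metadata_props[*].key`, it should also accept a model loaded with onnx.load, assuming an onnx version whose NodeProto carries metadata_props.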

Desired State

_normalize_exported_model either:

  • Does not feed hierarchy-tagged nodes through ORT node fusion, or
  • Ensures the normalized model written back to export.onnx has no duplicate node-level metadata_props

so that the main optimize step can always load export.onnx without a checker error.

Acceptance Criteria

  • Building BAAI/bge-large-en-v1.5 (feature-extraction, QNN EP) succeeds end-to-end without ModelValidationError
  • _normalize_exported_model still runs shape inference / graph normalization on the raw ONNX export
  • winml.hierarchy.tag / winml.hierarchy.depth node metadata is present and correct in the final artifact
  • No regression on models that don't trigger node fusion
  • Relevant pytest cases pass: tests/unit/export/test_pytorch_export.py, tests/integration/ (or equivalent scope)

Technical Notes

Two candidate fixes; either works:

Option A — Move tag injection after normalization
In HTPExporter.export() (src/winml/modelkit/export/htp/exporter.py), swap the order so _normalize_exported_model (in export_pytorch) runs on the untagged ONNX, and only then inject hierarchy tags. This is the cleanest approach: the ORT optimizer never sees winml.* node props.
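
The ordering in Option A can be sketched as a three-stage pipeline. This is an illustration only: `export_with_tags` and its three callable parameters are hypothetical names standing in for the raw export, _normalize_exported_model, and _embed_tags_in_onnx, and the real call structure in HTPExporter.export() may differ.

```python
from typing import Callable


def export_with_tags(
    export_raw: Callable[[str], None],
    normalize: Callable[[str], None],
    embed_tags: Callable[[str], None],
    output_path: str,
) -> None:
    """Option A ordering: normalize the untagged export, then tag it."""
    export_raw(output_path)  # write the raw, untagged ONNX export
    normalize(output_path)   # ORT fusion runs while no winml.* props exist
    embed_tags(output_path)  # hierarchy tags are injected last


# Record the call order with trivial stand-in stages.
calls = []
export_with_tags(
    lambda path: calls.append("export"),
    lambda path: calls.append("normalize"),
    lambda path: calls.append("embed_tags"),
    "export.onnx",
)
print(calls)
```

One point this ordering would need to confirm is that tag injection can still map hierarchy information onto nodes that fusion has merged or renamed; this issue does not describe how _embed_tags_in_onnx matches nodes.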

Option B — Strip/deduplicate node metadata_props before copying back
In _normalize_exported_model (src/winml/modelkit/export/pytorch.py), after optimize_onnx writes to tmp_path, load tmp_path and remove duplicate node-level winml.* keys before calling copy_onnx_model. More surgical but adds complexity.
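
A minimal sketch of the deduplication step in Option B follows. `Entry` and `Node` are hypothetical stand-ins for the ONNX protobuf types, and `dedupe_node_props` is an illustrative helper. The keep-first policy is an assumption: if the fused source nodes carried different tag values, a real fix would have to decide which one wins.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    key: str
    value: str


@dataclass
class Node:
    name: str
    metadata_props: list = field(default_factory=list)


def dedupe_node_props(nodes, prefix: str = "winml.") -> int:
    """Remove repeated keys starting with `prefix`, keeping each key's first
    entry. Mutates nodes in place and returns the number of entries removed."""
    removed = 0
    for node in nodes:
        seen = set()
        drop = []
        for index, entry in enumerate(node.metadata_props):
            if not entry.key.startswith(prefix):
                continue  # leave non-winml props untouched
            if entry.key in seen:
                drop.append(index)
            else:
                seen.add(entry.key)
        # Delete from the back so the remaining indices stay valid.
        for index in reversed(drop):
            del node.metadata_props[index]
        removed += len(drop)
    return removed


TAG, DEPTH = "winml.hierarchy.tag", "winml.hierarchy.depth"
fused = Node("fused", [
    Entry(TAG, "/a"), Entry(DEPTH, "2"),
    Entry(TAG, "/a"), Entry(DEPTH, "2"),
    Entry("other", "x"),
])
removed = dedupe_node_props([fused])
print(removed, [(e.key, e.value) for e in fused.metadata_props])
```

The helper uses only iteration and index deletion, which protobuf repeated fields also support, so passing `model.graph.node` from a loaded ONNX model should work the same way before the file is saved and copied back.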

Note: simply switching to validate=False in the main optimize_onnx load would hide the symptom, not fix the corruption.

Related Files

  • src/winml/modelkit/export/pytorch.py:105-158: _normalize_exported_model, introduced in #681 (feat(export): normalize exported ONNX in-place via optimize_onnx)
  • src/winml/modelkit/export/htp/exporter.py:568-600: _embed_tags_in_onnx + _embed_graph_metadata (step 7 of export)
  • src/winml/modelkit/export/htp/exporter.py:295-297: call site for both step-7 functions
  • src/winml/modelkit/optim/pipes/graph.py:491-590: ORTGraphPipe.process (ORT node fusion, reloads with validate=False)
  • src/winml/modelkit/optim/api.py:163-173: _hack_inject_quant_preprocess_metadata (only fixes model-level props)
  • src/winml/modelkit/onnx/persistence.py:28-68: load_onnx (validate=True by default)
  • tests/unit/export/test_pytorch_export.py: existing normalization tests to extend

References

  • Introduced by: #681 (feat(export): normalize exported ONNX in-place via optimize_onnx)
  • ONNX spec: metadata_props keys must be unique at both model level and node level
  • ORT behavior: node fusion copies all source metadata_props to the fused node (observed with Gelu and MatMul+Add fusions on BERT)

Labels

bug (Something isn't working), graph-optimizer (Graph optimizer module)
