Sync onnxrt main to ROCm 7.1 #178

TedThemistokleous · 2025-09-24T19:25:00Z

Description

Pull in the fp4 changes from mainline and update the 1.23-related build items.

Motivation and Context

Picks up additional items, such as fp4 tensor types, that tie into our delivery of fp4 in MIGraphX.

- DynamicQuantizeMatMul - handle case where B zero point input is provided but not constant. (microsoft#25544)
- Refactor plugin EP support (microsoft#25541)
- Remove the python installation steps from win-qnn-arm64-ci-pipeline.yml (microsoft#25552)
- [EP ABI] Node_GetAttrByName returns ORT_NOT_FOUND with non-existing attr name (microsoft#25565)
- Fix C/C++ documentation generation (microsoft#25569)
- [build] fix multi-config for VCPKG (microsoft#25585)

…1.23.0 release branch (microsoft#25606)

### Description

Cherry-pick microsoft#25566 for ORT 1.23

This PR cherry-picks some pipeline changes from the main branch to the 1.23.0 release branch.

- **[build] disable CodeQL for NPM Packaging Pipeline (microsoft#25614)**
- **Refactor Java Test Pipeline (microsoft#25608)**
- **[build] upgrade Node.js for NPM packaging pipeline (microsoft#25568)**

And a WebGPU change:

- **[webgpu] Apply Flash Attention if sliding window exceeds KV cache length (microsoft#25594)**

)

### Description

Move the step that moves weights to memory to the end of Graph::Resolve(). Modify Inject so it copies data into TensorProto according to the C API docs.

### Motivation and Context

TypeAndShape inference runs as a part of `Resolve()`, and at that point it is unable to inspect and load the initializers that point to OrtValues. We chose to move the TensorProto to OrtValue conversion to the end of `Resolve()`.

References: microsoft#25579

Co-authored-by: Dmitri Smirnov

…#25659)

Cherry-pick MiGraphX EP fixes from upstream for rel-1.23.0

This PR cherry-picks three critical fixes for the MiGraphX Execution Provider:

1. Fix compilation after cherry-picking from win-onnxruntime (microsoft#25516)
   - Adds ORT_UNUSED_PARAMETER(num_devices) to fix unused parameter warning
   - Corrects struct usage in CreateIExecutionProvider method
2. Fix CreateExecutionProviderFactory with correct struct and change vendor_id (microsoft#25625)
   - Updates vendor_id from 0x1002 to 0x9999 to allow DML EP to be default
   - Ensures proper device ordering in provider_policy_context.cc
3. Update OrtEpFactory in MiGraphX EP (microsoft#25567)
   - Adds complete OrtEpFactory infrastructure for auto EP selection
   - Implements all required factory methods with noexcept specifiers
   - Sets ort_version_supported to ORT_API_VERSION
   - Enables MiGraphX/AMDGPU EP integration with hardware device detection

These fixes ensure MiGraphX EP builds correctly and integrates properly with the ORT execution provider selection framework in the 1.23.0 release.

Cherry-picked commits:

- 87f1499
- 14ca6df
- 131cf40

---------

Co-authored-by: Artur Wojcik
Co-authored-by: Owen Zhang
Co-authored-by: ozhang

…5, 25652 (microsoft#25701)

### Description

Cherry-pick the following PRs into the `rel-1.23.0` branch:

- microsoft#25391
- microsoft#25611
- microsoft#25656
- microsoft#25346
- microsoft#25374
- microsoft#25664
- microsoft#25675
- microsoft#25652

### Motivation and Context

---------

Co-authored-by: Yulong Wang
Co-authored-by: Ishwar Raut
Co-authored-by: Maximilian Müller
Co-authored-by: Gaurav Garg
Co-authored-by: Scott McKay
Co-authored-by: Chi Lo
Co-authored-by: Abhishek Jindal
Co-authored-by: Dmitri Smirnov

### Description

Cherry-pick the following PRs into `rel-1.23.0`:

- microsoft#25629
- microsoft#25583

### Motivation and Context

---------

Co-authored-by: Chunye Wang@AMD
Co-authored-by: mingyue
Co-authored-by: Artur Wojcik
Co-authored-by: urpetkov-amd
Co-authored-by: Ted Themistokleous
Co-authored-by: Scott McKay

### Description

Cherry-picks microsoft#25725 into the `rel-1.23.0` branch.

### Motivation and Context

Co-authored-by: Ankit Maheshkar
Co-authored-by: jatinwadhwa921

### Description

Cherry-pick the following PRs into the `rel-1.23.0` branch:

- microsoft#25592
- microsoft#25622
- microsoft#25688
- microsoft#25729
- microsoft#25743
- microsoft#25769
- microsoft#25745
- microsoft#25761
- microsoft#25751
- microsoft#25716
- microsoft#25228
- microsoft#25768
- microsoft#25788
- microsoft#25747
- microsoft#25800
- microsoft#25818
- microsoft#25762
- microsoft#25749
- microsoft#25831

### Motivation and Context

---------

Co-authored-by: quic-tirupath
Co-authored-by: quic-calvnguy
Co-authored-by: qti-kromero
Co-authored-by: Jeff Kilpatrick
Co-authored-by: Scott McKay
Co-authored-by: David Fan
Co-authored-by: kuanyul-qti
Co-authored-by: Dmitri Smirnov
Co-authored-by: Chi Lo
Co-authored-by: Edward Chen
Co-authored-by: Chunye Wang@AMD
Co-authored-by: minfhong-qti
Co-authored-by: Vishal Agarwal
Co-authored-by: Maximilian Müller
Co-authored-by: Changming Sun
Co-authored-by: adrastogi
Co-authored-by: Aditya Rastogi
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

- **Relax WeightBiasQuantization constraint for larger QDQ node group (microsoft#25673)**
- **Add cuda graph implementation for NV TRT RTX EP (microsoft#25787)**
- **python GPU IO Bindings for NVIDIA (microsoft#25776)**
- **Fixes for DynamicQuantizeMatMul and Attention3D tests (microsoft#25814)**
- **Fix a long standing bug on file memory mapping on windows. (microsoft#25833)**
- **Add API for precompiled model compatibility check using just the compat info (microsoft#25841)**
- **Enable ABSL_FLAGS flag registration for onnxruntime_perf_test for mobile build (microsoft#25849)**
- **Add default constructor to Ort::Status. (microsoft#25860)**
- microsoft#25871
- microsoft#25878
- microsoft#25884
- microsoft#25886
- microsoft#25866

### Description

Cherry-pick the following PRs:

- microsoft#25943
- microsoft#25937
- microsoft#25917
- microsoft#25909
- microsoft#25898
- microsoft#25897
- microsoft#25888
- microsoft#25881
- microsoft#25830
- microsoft#25619
- microsoft#25575
- microsoft#25572
- microsoft#25558
- microsoft#25530
- microsoft#25474
- microsoft#25455
- microsoft#25110

Also two dependent PRs for qMoE cpu:

- microsoft#25877
- microsoft#25822

---------

Co-authored-by: xiaomsft
Co-authored-by: Xiaoyan Hu
Co-authored-by: Akshay Sonawane
Co-authored-by: Kunal Vaishnavi
Co-authored-by: Pradeep Sakhamoori
Co-authored-by: mingyue
Co-authored-by: Maximilian Müller
Co-authored-by: Adrian Lizarraga
Co-authored-by: Dmitri Smirnov
Co-authored-by: Emmanuel
Co-authored-by: Emmanuel Assumang
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: praneshgo
Co-authored-by: Hariharan Seshadri
Co-authored-by: Jing Fang
Co-authored-by: Ishwar Raut

This PR cherry-picks the following PRs to the rel-1.23.0 branch:

* microsoft#25938
* microsoft#25957
* microsoft#25960
* microsoft#25968
* microsoft#25971

---------

Co-authored-by: Dmitri Smirnov
Co-authored-by: Adrian Lizarraga
Co-authored-by: Hariharan Seshadri

This PR cherry-picks the following PRs to the release branch:

- microsoft#25988
- microsoft#25991

---------

Co-authored-by: Dmitri Smirnov
Co-authored-by: umangb-09

This PR cherry-picks several commits from the main branch to the rel-1.23.0 release branch as part of the release process.

### Changes included:

* **Major Refactoring of Azure DevOps Pipelines (microsoft#26008)**
  * Commit: `2e6d7ccfdff55aaf7b0799d7e28b041e607dce2b`
* **Disables failing test to unblock Python DML Pipeline (microsoft#26043)**
  * Commit: `64c8f40d01bf14b3cf7ac4cf8606ad9e0e56feb0`
* **Pin cmake version in macOS github Actions (microsoft#25998)**
  * Commit: `148f13cc6b44cae156226cd4e0dcfc154691c5b4`
* **Bump actions/setup-python from 5 to 6 (microsoft#25979)**
  * Commit: `97a8d332595c974ad24be133df216565493ffb95`
* **Remove CACHE_URL settings from Github Actions (microsoft#25989)**
  * Commit: `e2a0999ba4b224ab90ef7a8768dd4941fcc19b17`
* **Bump actions/checkout from 4 to 5 (microsoft#25771)**
  * Commit: `f19215db21f8e1a8fc93090748e455f41076f456`
* **Bump ruff from 0.12.8 to 0.12.9 (microsoft#25772)**
  * Commit: `78df404871fa2f3fbbb7f1902f9623787ba8dc86`
* **Bump ruff from 0.12.4 to 0.12.8 (microsoft#25713)**
  * Commit: `7204746e709005d2c7294e7a24d63a2df4a1aee8`
* **Update macOS target version from 13.3 to 13.4 (microsoft#25616)**
  * Commit: `65bd82564cd31e0acf9139cdd826d08193212c6e`

---------

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Prathik Rao

- **Major Refactoring of Azure DevOps Pipelines (microsoft#26008)**
- **Convert QNN x64 pipeline to GH (microsoft#26047)**
- **Upgrade com.diffplug.spotless to 7.2.1 in Java build (microsoft#26051)**
- **Fix Mac Catalyst build options. (microsoft#25970)**
- **Update Podfile.template: update macOS target version from 13.3 to 13.4 (microsoft#25699)**

… (microsoft#26087)

Reduce Python and Nuget GPU package size (microsoft#26002)

[CUDA] Add build flag onnxruntime_USE_FPA_INTB_GEMM (microsoft#25802)

TedThemistokleous · 2025-09-24T19:28:23Z

This should then be the rebase point for #176, as we tie in the fp4 types.

### Description

1. Fixes Python Wheel Installation Path: In the Linux smoke test (py-package-smoking-test-linux.yml), the pip install command was corrected to use --find-links . to locate the wheel in the correct directory. This resolves an issue where the installation script was looking in the wrong location.
2. Expands python package test pipeline's macOS Test Matrix: A new parameterized template (py-package-smoking-test-macos.yml) is introduced to test macOS wheels. The main pipeline (py-package-test-pipeline.yml) now uses this template to create a comprehensive test matrix, covering Python versions 3.10, 3.11, 3.12, and 3.13 across macOS versions 13, 14, and 15.
3. Enable more tests in Nuget Test Pipeline. The pipeline is for testing packaged ONNX Runtime nuget packages. In the Windows NuGet test template (test_win.yml), a step has been added to download and place the custom_op_library.dll in the appropriate test directory. This ensures the custom op tests can find their required dependencies. The SKIPNONPACKAGETESTS flag has been removed to ensure all such tests are run.

### Motivation and Context

Improve packaging tests.
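For a sense of scale, the macOS matrix described in item 2 can be enumerated with a small sketch. This is illustrative only and is not the pipeline template itself: the `build_matrix` helper and the job-name format are hypothetical, and only the version lists come from the description.

```python
from itertools import product

# Version lists taken from the description; everything else is hypothetical.
PYTHON_VERSIONS = ["3.10", "3.11", "3.12", "3.13"]
MACOS_VERSIONS = ["13", "14", "15"]


def build_matrix(python_versions, macos_versions):
    """Return one job entry per (macOS version, Python version) pair."""
    return [
        {"macos": m, "python": p, "name": f"macos{m}_py{p}"}
        for m, p in product(macos_versions, python_versions)
    ]


matrix = build_matrix(PYTHON_VERSIONS, MACOS_VERSIONS)
print(len(matrix))  # 4 Python versions x 3 macOS versions
for job in matrix:
    print(job["name"])
```

Under these assumptions the matrix expands to 12 jobs, which is the number of wheel/OS combinations the template would need to cover.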

TedThemistokleous · 2025-10-01T18:09:32Z

Will close this out and delete the branch. Going to use the official 1.23.0 tag upon release. MSFT has the latest changes captured by Artur and me in that tag.

fs-eire and others added 18 commits July 26, 2025 21:36

Cherry-pick microsoft#25548 into 1.23 release branch. (microsoft#25549)

316ac6a

Cherry-pick "[EP ABI] Support for TENSOR type attribute" PR into ORT …

0e3e1a5


Cherry-pick release:1.23.0 PRs to rel-1.23.0 (microsoft#26003)

a922003


Pin CMake version in ReactNative_CI_iOS job (microsoft#26086)

d1cfdf0

Cherry-pick: Reduce Python and Nuget GPU package size (microsoft#26002)…

2a034d5


TedThemistokleous self-assigned this Sep 24, 2025

TedThemistokleous requested review from CharlieL7 and causten September 24, 2025 19:25

TedThemistokleous added the Roadmap Item within release roadmap label Sep 24, 2025

TedThemistokleous force-pushed the sync_onnxrt_main branch from a4381f1 to be835ef Compare October 1, 2025 18:08

TedThemistokleous closed this Oct 1, 2025

TedThemistokleous deleted the sync_onnxrt_main branch October 1, 2025 18:09
