docs: add ONNXRuntime QNN Execution Provider zh docs #1203
ONNX Runtime's **[QNN Execution Provider](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html)** enables **NPU-accelerated inference of ONNX models** on Qualcomm SoC platforms.
It uses the **[Qualcomm® AI Runtime (QAIRT SDK)](./qairt-sdk#qairt)** to build an **ONNX model** into a **QNN compute graph**, which is then executed through an **accelerator backend library**.
The QNN Execution Provider works on **Linux**, **Android**, and **Windows** devices powered by **Qualcomm SoCs**.

## Supported Devices

- [**Radxa Dragon Q6A**](/dragon/q6a/) (Linux)
- [**Radxa Fogwise AIRbox Q900**](/fogwise/airbox-q900) (Linux)

## Installation

:::tip
Two installation methods are available: **install via pip** or **build from source**.

Whichever you choose, first download the QAIRT SDK by following [**QAIRT SDK Installation**](./qairt-install).
:::
### Create a Python Virtual Environment

<NewCodeBlock tip="Device" type="device">

```bash
sudo apt install python3-venv
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip
```

</NewCodeBlock>
### Install via pip

Radxa provides a prebuilt `onnxruntime-qnn` wheel for Linux:

<NewCodeBlock tip="Device" type="device">

```bash
pip3 install https://github.com/ZIFENG278/onnxruntime/releases/download/v1.23.2/onnxruntime_qnn-1.23.2-cp312-cp312-linux_aarch64.whl
```

</NewCodeBlock>
### Build from Source

#### Clone the onnxruntime Repository

<NewCodeBlock tip="Device" type="device">

```bash
git clone --depth 1 -b v1.23.2 https://github.com/microsoft/onnxruntime.git
```

</NewCodeBlock>
#### Modify CMakeLists.txt

Because upstream onnxruntime does not directly support a Linux build of the QNN EP, building an onnxruntime-qnn wheel for Linux requires a manual change to line 840 of `cmake/CMakeLists.txt`:

Change `set(QNN_ARCH_ABI aarch64-android)` on line 840 to `set(QNN_ARCH_ABI aarch64-oe-linux-gcc11.2)`.

:::tip
No modification is needed when building for **Android** or **Windows**.
:::
<NewCodeBlock tip="Device" type="device">

```bash
cd onnxruntime
vim cmake/CMakeLists.txt
```

</NewCodeBlock>
```diff
diff --git a/cmake/CMakeLists.txt b/cmake/CMakeLists.txt
index 0b37ade..f4621e5 100644
--- a/cmake/CMakeLists.txt
+++ b/cmake/CMakeLists.txt
@@ -837,7 +837,7 @@ if (onnxruntime_USE_QNN OR onnxruntime_USE_QNN_INTERFACE)
     if (${GEN_PLATFORM} STREQUAL "x86_64")
       set(QNN_ARCH_ABI x86_64-linux-clang)
     else()
-      set(QNN_ARCH_ABI aarch64-android)
+      set(QNN_ARCH_ABI aarch64-oe-linux-gcc11.2)
     endif()
   endif()
 endif()
```
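For scripted setups, the same one-line edit can be applied without opening an editor. This is a convenience sketch, not part of the official build flow; the exact line text must match the v1.23.2 sources, so the helper fails loudly if it does not:

```python
OLD = "set(QNN_ARCH_ABI aarch64-android)"
NEW = "set(QNN_ARCH_ABI aarch64-oe-linux-gcc11.2)"

def patch_qnn_arch_abi(text: str) -> str:
    """Swap the Android QNN ABI directory for the Linux (OE/gcc11.2) one."""
    if OLD not in text:
        raise ValueError("expected line not found; check the onnxruntime version")
    return text.replace(OLD, NEW, 1)

# On the device, inside the onnxruntime checkout:
#   from pathlib import Path
#   p = Path("cmake/CMakeLists.txt")
#   p.write_text(patch_qnn_arch_abi(p.read_text()))
```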
#### Build the Project

:::tip
Replace `[QNN_SDK_PATH]` with the actual path to your QAIRT SDK installation directory (for example, the `qairt/2.37.1.250807` directory used later in this guide).
:::

<NewCodeBlock tip="Device" type="device">

```bash
pip3 install -r requirements.txt
./build.sh --use_qnn --qnn_home [QNN_SDK_PATH] --build_shared_lib --build_wheel --config Release --parallel --skip_tests --build_dir build/Linux
```

</NewCodeBlock>
After the build completes, the target wheel is generated under `build/Linux/Release/dist`:

<NewCodeBlock tip="Device" type="device">

```bash
pip3 install ./build/Linux/Release/dist/onnxruntime_qnn-1.23.2-cp312-cp312-linux_aarch64.whl
```

</NewCodeBlock>
## Verify the QNN Execution Provider

:::tip
Before verifying the QNN Execution Provider, follow [**Enable the NPU on the Board**](./fastrpc_setup) and [**NPU Quick Verification**](./quick-example) to confirm the NPU works correctly, then test the QNN Execution Provider.
:::
### Export Environment Variables

<Tabs>
<TabItem value="QCS6490">

<NewCodeBlock tip="Device" type="device">

```bash
export PRODUCT_SOC=6490 DSP_ARCH=68
```

</NewCodeBlock>

</TabItem>

<TabItem value="QCS9075">

<NewCodeBlock tip="Device" type="device">

```bash
export PRODUCT_SOC=9075 DSP_ARCH=73
```

</NewCodeBlock>

</TabItem>

</Tabs>
<NewCodeBlock tip="Device" type="device">

```bash
cd qairt/2.37.1.250807
source bin/envsetup.sh
export ADSP_LIBRARY_PATH=$QNN_SDK_ROOT/lib/hexagon-v${DSP_ARCH}/unsigned
```

</NewCodeBlock>
### Download an INT8-Quantized ONNX Model

Download a w8a8-quantized model in ONNX Runtime format from [**Qualcomm AI Hub**](https://aihub.qualcomm.com/iot/models/resnet50?chipsets=qualcomm-qcs6490):

<div style={{ textAlign: "center" }}>
  <img
    src="/img/dragon/q6a/qaihub_onnxruntime_model.webp"
    style={{ width: "85%" }}
  />
</div>
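The downloaded ResNet50 expects a uint8 tensor of shape `(1, 3, 224, 224)`. The exact preprocessing (resize, normalization, quantization parameters) should follow the model card on AI Hub; this sketch only shows the HWC-to-NCHW layout conversion for a uint8 image array:

```python
import numpy as np

def to_nchw_uint8(image_hwc: np.ndarray) -> np.ndarray:
    # (224, 224, 3) uint8 HWC image -> (1, 3, 224, 224) uint8 NCHW batch.
    assert image_hwc.shape == (224, 224, 3) and image_hwc.dtype == np.uint8
    return np.expand_dims(image_hwc.transpose(2, 0, 1), axis=0)

# Example with a synthetic image (replace with a decoded JPEG/PNG in practice):
dummy = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
print(to_nchw_uint8(dummy).shape)  # (1, 3, 224, 224)
```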
### Test the QNN Execution Provider

The following Python code creates an ONNX Runtime session with the QNN EP and runs the w8a8-quantized ONNX model on the NPU. The code is adapted from [Running a quantized model on Windows ARM64](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html#running-a-quantized-model-on-windows-arm64-onnxruntime-qnn-version--1180).

<NewCodeBlock tip="Device" type="device">

```bash
vim run_qdq_model.py
```

</NewCodeBlock>

:::tip
Adjust `backend_path` to match your actual QAIRT SDK path.

Adjust the model path passed to `InferenceSession` to match the path of the ONNX model you downloaded.
:::
```python
# run_qdq_model.py

import numpy as np
import onnxruntime

options = onnxruntime.SessionOptions()

# (Optional) Enable configuration that raises an exception if the model can't be
# run entirely on the QNN HTP backend.
options.add_session_config_entry("session.disable_cpu_ep_fallback", "1")

# Create an ONNX Runtime session.
# TODO: Provide the path to your ONNX model.
session = onnxruntime.InferenceSession(
    "job_jpy6ye005_optimized_onnx/model.onnx",
    sess_options=options,
    providers=["QNNExecutionProvider"],
    provider_options=[{"backend_path": "libQnnHtp.so"}],  # Path to the HTP backend library in the QNN SDK
)

# Run the model with your input.
# TODO: Use numpy to load your actual input from a file or generate random input.
input0 = np.ones((1, 3, 224, 224), dtype=np.uint8)
result = session.run(None, {"image_tensor": input0})

# Print output.
print(result)
```

> **Reviewer:** The example code uses a hardcoded model path.
>
> **Author:** Added a tip. Done.
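The script prints the raw `(1, 1000)` uint8 quantized scores. A small helper sketch to turn them into top-k class indices; mapping the indices to ImageNet labels is left to the model card:

```python
import numpy as np

def topk_classes(scores: np.ndarray, k: int = 5):
    # scores: the (1, 1000) uint8 array returned by session.run(...)[0].
    flat = scores.reshape(-1)
    order = np.argsort(flat)[::-1][:k]  # indices of the k highest scores
    return [(int(i), int(flat[i])) for i in order]

# Example with synthetic scores:
fake = np.zeros((1, 1000), dtype=np.uint8)
fake[0, 42] = 255
print(topk_classes(fake, k=1))  # [(42, 255)]
```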
<NewCodeBlock tip="Device" type="device">

```bash
python3 run_qdq_model.py
```

</NewCodeBlock>
```bash
(.venv) rock@radxa-dragon-q6a:~/ssd/qualcomm/onnxruntime/build/Linux/Release$ python3 run_qdq_model.py
2025-12-22 06:31:37.527811909 [W:onnxruntime:Default, device_discovery.cc:164 DiscoverDevicesForPlatform] GPU device discovery failed: device_discovery.cc:89 ReadFileContents Failed to open file: "/sys/class/drm/card0/device/vendor"
/prj/qct/webtech_scratch20/mlg_user_admin/qaisw_source_repo/rel/qairt-2.37.1/point_release/SNPE_SRC/avante-tools/prebuilt/dsp/hexagon-sdk-5.4.0/ipc/fastrpc/rpcmem/src/rpcmem_android.c:38:dummy call to rpcmem_init, rpcmem APIs will be used from libxdsprpc

 ====== DDR bandwidth summary ======
 spill_bytes=0
 fill_bytes=0
 write_total_bytes=65536
 read_total_bytes=25976832

[array([[47, 55, 49, 46, 56, 55, 45, 47, 44, 49, 50, 46, 44, 46, 44, 47,
        46, 46, 45, 45, 47, 49, 55, 48, 47, 49, 48, 49, 52, 54, 46, 61,
        51, 48, 57, 41, 47, 41, 62, 50, 50, 44, 49, 48, 48, 50, 51, 52,
        46, 43, 47, 51, 51, 52, 52, 54, 44, 40, 44, 65, 60, 49, 52, 57,
        55, 50, 55, 47, 55, 57, 47, 62, 44, 62, 44, 51, 50, 53, 57, 57,
        47, 51, 46, 39, 45, 44, 45, 46, 49, 52, 46, 42, 50, 48, 54, 42,
        50, 36, 43, 47, 48, 44, 43, 54, 46, 41, 46, 63, 52, 46, 51, 75,
        58, 58, 51, 49, 50, 64, 41, 44, 49, 43, 45, 47, 48, 50, 62, 50,
        52, 52, 49, 44, 54, 41, 43, 46, 40, 42, 40, 41, 45, 46, 41, 46,
        46, 47, 45, 49, 46, 50, 43, 51, 50, 52, 46, 47, 45, 43, 48, 46,
        42, 48, 48, 52, 47, 47, 47, 47, 44, 45, 44, 47, 46, 49, 39, 45,
        43, 45, 53, 44, 45, 47, 43, 44, 46, 48, 44, 51, 45, 48, 50, 46,
        41, 44, 46, 52, 45, 38, 42, 44, 44, 41, 42, 51, 53, 37, 43, 48,
        48, 44, 40, 43, 43, 44, 43, 46, 50, 45, 42, 46, 50, 48, 48, 50,
        49, 42, 41, 41, 47, 45, 43, 46, 48, 47, 44, 46, 48, 45, 45, 48,
        42, 45, 44, 42, 46, 46, 48, 45, 44, 43, 50, 49, 48, 45, 52, 36,
        42, 47, 47, 46, 49, 42, 50, 43, 48, 47, 48, 43, 44, 48, 51, 47,
        48, 43, 47, 45, 50, 55, 47, 50, 50, 53, 48, 57, 51, 58, 46, 46,
        53, 48, 45, 48, 44, 50, 47, 43, 47, 48, 47, 53, 47, 54, 44, 53,
        47, 45, 56, 58, 57, 46, 57, 51, 56, 55, 58, 58, 52, 55, 59, 53,
        50, 42, 40, 46, 51, 44, 56, 51, 52, 42, 44, 50, 49, 48, 43, 45,
        42, 45, 47, 42, 46, 46, 42, 39, 39, 47, 41, 45, 45, 46, 48, 47,
        44, 47, 49, 46, 52, 45, 50, 50, 45, 52, 52, 49, 52, 47, 45, 50,
        44, 44, 44, 45, 41, 45, 45, 44, 50, 50, 48, 41, 49, 45, 46, 46,
        46, 47, 41, 45, 44, 52, 48, 43, 50, 45, 47, 50, 48, 52, 54, 64,
        50, 62, 61, 48, 45, 52, 45, 45, 44, 64, 42, 47, 48, 60, 47, 43,
        67, 54, 63, 63, 52, 60, 54, 55, 51, 50, 53, 55, 46, 61, 51, 45,
        58, 53, 49, 57, 45, 57, 64, 53, 56, 60, 59, 52, 47, 51, 59, 55,
        49, 46, 42, 60, 46, 51, 40, 54, 54, 61, 44, 56, 44, 55, 58, 55,
        60, 60, 48, 44, 53, 58, 68, 50, 43, 63, 46, 54, 40, 52, 54, 60,
        55, 62, 57, 49, 44, 58, 59, 62, 64, 46, 55, 57, 53, 49, 55, 46,
        48, 54, 59, 68, 49, 56, 51, 61, 61, 52, 57, 61, 60, 39, 50, 44,
        63, 64, 48, 57, 52, 57, 51, 52, 44, 46, 49, 56, 51, 43, 53, 60,
        57, 55, 71, 62, 43, 47, 58, 52, 45, 41, 53, 59, 48, 56, 64, 57,
        51, 54, 61, 41, 45, 59, 54, 59, 58, 54, 43, 44, 52, 56, 59, 55,
        52, 48, 57, 60, 43, 45, 51, 57, 52, 46, 61, 48, 60, 48, 64, 42,
        45, 57, 53, 59, 48, 48, 46, 62, 58, 60, 43, 61, 50, 49, 53, 55,
        55, 64, 57, 43, 62, 51, 54, 56, 63, 53, 62, 39, 70, 61, 61, 64,
        55, 45, 54, 51, 44, 56, 51, 55, 63, 54, 58, 67, 55, 46, 61, 63,
        40, 41, 73, 50, 51, 66, 51, 57, 58, 61, 39, 59, 52, 49, 53, 43,
        45, 62, 55, 64, 77, 44, 52, 55, 48, 51, 69, 54, 53, 55, 47, 46,
        44, 51, 52, 51, 45, 39, 55, 49, 60, 45, 65, 53, 51, 41, 45, 46,
        52, 60, 65, 42, 48, 65, 58, 59, 59, 60, 54, 58, 61, 60, 59, 48,
        57, 48, 38, 54, 60, 50, 45, 60, 66, 49, 49, 62, 60, 52, 54, 49,
        54, 41, 40, 53, 57, 53, 60, 68, 56, 57, 66, 47, 54, 41, 47, 59,
        69, 43, 63, 52, 49, 60, 52, 51, 53, 50, 46, 62, 55, 56, 44, 49,
        62, 59, 52, 51, 56, 53, 50, 53, 56, 59, 52, 58, 52, 64, 47, 49,
        52, 57, 60, 54, 48, 39, 51, 58, 60, 66, 40, 61, 57, 50, 49, 65,
        48, 66, 56, 53, 66, 60, 54, 48, 66, 56, 58, 46, 49, 53, 57, 63,
        63, 57, 50, 52, 36, 60, 48, 51, 57, 52, 48, 50, 58, 49, 56, 54,
        51, 46, 46, 44, 62, 48, 56, 47, 49, 54, 54, 49, 59, 61, 47, 48,
        43, 47, 72, 57, 42, 49, 53, 57, 49, 47, 70, 57, 61, 43, 49, 54,
        51, 47, 58, 48, 59, 62, 52, 56, 54, 54, 48, 48, 58, 70, 65, 45,
        56, 55, 55, 61, 69, 44, 68, 64, 40, 55, 51, 53, 50, 57, 62, 53,
        46, 36, 45, 51, 50, 51, 46, 45, 57, 55, 37, 61, 53, 52, 53, 57,
        55, 60, 51, 64, 49, 56, 56, 48, 53, 60, 45, 47, 59, 58, 51, 47,
        60, 53, 61, 57, 52, 61, 64, 57, 53, 62, 60, 58, 57, 50, 54, 48,
        39, 47, 41, 41, 65, 47, 52, 57, 59, 50, 47, 47, 49, 47, 45, 52,
        49, 56, 50, 47, 49, 46, 51, 48, 49, 53, 52, 39, 49, 42, 50, 50,
        46, 55, 48, 46, 62, 58, 59, 59, 51, 51, 59, 45, 52, 54, 51, 49,
        51, 48, 47, 48, 48, 48, 66, 60, 66, 57, 45, 61, 44, 40, 52, 48,
        46, 43, 52, 43, 56, 52, 50, 52, 48, 61, 52, 52, 53, 45, 52, 45,
        42, 40, 43, 44, 40, 41, 51, 61]], dtype=uint8)]
/prj/qct/webtech_scratch20/mlg_user_admin/qaisw_source_repo/rel/qairt-2.37.1/point_release/SNPE_SRC/avante-tools/prebuilt/dsp/hexagon-sdk-5.4.0/ipc/fastrpc/rpcmem/src/rpcmem_android.c:42:dummy call to rpcmem_deinit, rpcmem APIs will be used from libxdsprpc
```
## Detailed Documentation

For detailed usage of the QNN Execution Provider, see:

- [**QNN Execution Provider**](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html#qnn-execution-provider)
The PR also adjusts sidebar positions and adds the new page entry in each affected doc tree:

```diff
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 10
+sidebar_position: 11
 ---

 # Llama3.2-1B 大模型
```

```diff
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 9
+sidebar_position: 99
 ---

 # Demos 示例
```

```diff
@@ -0,0 +1,9 @@
+---
+sidebar_position: 10
+---
+
+# QNN Execution Provider
+
+import QNNONNXRTEXECUTIONPROVIDER from '../../../../common/ai/_qnn_onnxrt_execution_provider.mdx';
+
+<QNNONNXRTEXECUTIONPROVIDER />
```

```diff
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 9
+sidebar_position: 99
 ---

 # Demos 示例
```

```diff
@@ -0,0 +1,9 @@
+---
+sidebar_position: 10
+---
+
+# QNN Execution Provider
+
+import QNNONNXRTEXECUTIONPROVIDER from '../../../common/ai/_qnn_onnxrt_execution_provider.mdx';
+
+<QNNONNXRTEXECUTIONPROVIDER />
```
> **Reviewer:** The `build.sh` command contains `--qnn_home [QNN_SDK_PATH]` but doesn't explain what value should replace `[QNN_SDK_PATH]`. Consider adding a note explaining this should be replaced with the path to the QAIRT SDK installation directory.
>
> **Author:** Done.
>
> **Author:** Added a tip. Done.