|
| 1 | +The ONNX Runtime **[QNN Execution Provider](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html)** enables **NPU hardware-accelerated inference of ONNX models** on Qualcomm SoC platforms. |
| 2 | +It uses the **[Qualcomm® AI Runtime (QAIRT SDK)](./qairt-sdk#qairt)** to build an **ONNX model** into a **QNN compute graph**, then executes that graph through an **accelerator backend library**. |
| 3 | +The ONNX Runtime **QNN Execution Provider** can be used on **Linux**, **Android**, and **Windows** devices powered by a **Qualcomm SoC**. |
| 4 | + |
| 5 | +## Supported Devices |
| 6 | + |
| 7 | +- [**Radxa Dragon Q6A**](/dragon/q6a/) (Linux) |
| 8 | + |
| 9 | +- [**Radxa Fogwise AIRbox Q900**](/fogwise/airbox-q900) (Linux) |
| 10 | + |
| 11 | +## Installation |
| 12 | + |
| 13 | +:::tip |
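| 14 | +There are two installation methods: **pip installation** and **building from source**. |
| 15 | + |
| 16 | +Whichever method you choose, first download the QAIRT SDK by following [**QAIRT SDK Installation**](./qairt-install). |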
| 17 | +::: |
| 18 | + |
| 19 | +### Create a Python virtual environment |
| 20 | + |
| 21 | +<NewCodeBlock tip="Device" type="device"> |
| 22 | + |
| 23 | +```bash |
| 24 | +sudo apt install python3-venv |
| 25 | +python3 -m venv .venv |
| 26 | +source .venv/bin/activate |
| 27 | +pip3 install --upgrade pip |
| 28 | +``` |
| 29 | + |
| 30 | +</NewCodeBlock> |
| 31 | + |
| 32 | +### Install with pip |
| 33 | + |
| 34 | +Radxa provides a prebuilt Linux `onnxruntime-qnn` wheel (`.whl`) file. The wheel below is tagged for CPython 3.12 on aarch64 Linux. |
| 35 | + |
| 36 | +<NewCodeBlock tip="Device" type="device"> |
| 37 | + |
| 38 | +```bash |
| 39 | +pip3 install https://github.com/ZIFENG278/onnxruntime/releases/download/v1.23.2/onnxruntime_qnn-1.23.2-cp312-cp312-linux_aarch64.whl |
| 40 | +``` |
| 41 | + |
| 42 | +</NewCodeBlock> |
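The wheel filename encodes the interpreter and platform it was built for: `cp312` (CPython 3.12) and `linux_aarch64`. If `pip3 install` rejects the wheel as unsupported, a mismatch between those tags and the Python in your virtual environment is a likely cause. The following is a minimal sketch for checking this up front. `wheel_tags` and `matches_interpreter` are hypothetical helper names, the parsing assumes the standard `name-version-python-abi-platform.whl` layout with no build tag, and the comparison is deliberately simplified (it ignores compatibility tags such as `manylinux`).

```python
import platform
import sys

WHEEL = "onnxruntime_qnn-1.23.2-cp312-cp312-linux_aarch64.whl"


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag).

    Assumes the layout name-version-python-abi-platform.whl (no build tag).
    """
    stem = filename[: -len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def matches_interpreter(filename, version_info, machine, system):
    """Return True if the wheel's tags match the given interpreter and platform."""
    python_tag, _abi_tag, platform_tag = wheel_tags(filename)
    expected_python = f"cp{version_info[0]}{version_info[1]}"
    expected_platform = f"{system.lower()}_{machine}"
    return python_tag == expected_python and platform_tag == expected_platform


if __name__ == "__main__":
    ok = matches_interpreter(
        WHEEL, sys.version_info, platform.machine(), platform.system()
    )
    print("wheel matches this interpreter:", ok)
```

If this prints `False`, building from source as described in the next section produces a wheel for the interpreter you actually have.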
| 43 | + |
| 44 | +### Build from source |
| 45 | + |
| 46 | +#### Clone the onnxruntime repository |
| 47 | + |
| 48 | +<NewCodeBlock tip="Device" type="device"> |
| 49 | + |
| 50 | +```bash |
| 51 | +git clone --depth 1 -b v1.23.2 https://github.com/microsoft/onnxruntime.git |
| 52 | +``` |
| 53 | + |
| 54 | +</NewCodeBlock> |
| 55 | + |
| 56 | +#### Modify CMakeLists.txt |
| 57 | + |
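| 58 | +The onnxruntime QNN build does not target aarch64 Linux out of the box: on non-x86_64 hosts it defaults to the `aarch64-android` ABI. To build an onnxruntime-qnn wheel for Linux, manually edit line 840 of `cmake/CMakeLists.txt`. |
| 59 | + |
| 60 | +Change line 840 from `set(QNN_ARCH_ABI aarch64-android)` to `set(QNN_ARCH_ABI aarch64-oe-linux-gcc11.2)`. |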
| 61 | + |
| 62 | +:::tip |
| 63 | +No change is needed when building for **Android** or **Windows**. |
| 64 | +::: |
| 65 | + |
| 66 | +<NewCodeBlock tip="Device" type="device"> |
| 67 | + |
| 68 | +```bash |
| 69 | +cd onnxruntime |
| 70 | +vim cmake/CMakeLists.txt |
| 71 | +``` |
| 72 | + |
| 73 | +</NewCodeBlock> |
| 74 | +```diff |
| 75 | +diff --git a/cmake/CMakeLists.txt b/cmake/CMakeLists.txt |
| 76 | +index 0b37ade..f4621e5 100644 |
| 77 | +--- a/cmake/CMakeLists.txt |
| 78 | ++++ b/cmake/CMakeLists.txt |
| 79 | +@@ -837,7 +837,7 @@ if (onnxruntime_USE_QNN OR onnxruntime_USE_QNN_INTERFACE) |
| 80 | + if (${GEN_PLATFORM} STREQUAL "x86_64") |
| 81 | + set(QNN_ARCH_ABI x86_64-linux-clang) |
| 82 | + else() |
| 83 | +- set(QNN_ARCH_ABI aarch64-android) |
| 84 | ++ set(QNN_ARCH_ABI aarch64-oe-linux-gcc11.2) |
| 85 | + endif() |
| 86 | + endif() |
| 87 | + endif() |
| 88 | +``` |
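As an alternative to editing the file by hand in `vim`, the same one-line change can be scripted. The sketch below is a convenience under stated assumptions, not part of the upstream build: `patch_qnn_abi` is a hypothetical helper that does a plain string replacement, and it deliberately raises an error if the `aarch64-android` line is not found exactly once (for example, on a different onnxruntime version, or if the file was already patched).

```python
from pathlib import Path

OLD = "set(QNN_ARCH_ABI aarch64-android)"
NEW = "set(QNN_ARCH_ABI aarch64-oe-linux-gcc11.2)"


def patch_qnn_abi(text):
    """Return CMake text with the aarch64 QNN ABI switched from Android to Linux."""
    count = text.count(OLD)
    if count != 1:
        raise ValueError(f"expected exactly one '{OLD}' line, found {count}")
    return text.replace(OLD, NEW)


if __name__ == "__main__":
    # Run from the root of the onnxruntime checkout.
    cmake_file = Path("cmake/CMakeLists.txt")
    if cmake_file.is_file():
        cmake_file.write_text(patch_qnn_abi(cmake_file.read_text()))
        print(f"patched {cmake_file}")
    else:
        print(f"{cmake_file} not found; run this from the onnxruntime directory")
```

Running `git diff cmake/CMakeLists.txt` afterwards should show the same single-line change as the diff above.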
| 89 | +
|
| 90 | +#### Build the project |
| 91 | +
|
| 92 | +<NewCodeBlock tip="Device" type="device"> |
| 93 | +
|
| 94 | +```bash |
| 95 | +pip3 install -r requirements.txt |
| 96 | +./build.sh --use_qnn --qnn_home [QNN_SDK_PATH] --build_shared_lib --build_wheel --config Release --parallel --skip_tests --build_dir build/Linux |
| 97 | +``` |
| 98 | +
|
| 99 | +</NewCodeBlock> |
| 100 | +
|
| 101 | +After the build completes, the target wheel is generated under `build/Linux/Release/dist`. |
| 102 | +
|
| 103 | +<NewCodeBlock tip="Device" type="device"> |
| 104 | +
|
| 105 | +```bash |
| 106 | +pip3 install ./build/Linux/Release/dist/onnxruntime_qnn-1.23.2-cp312-cp312-linux_aarch64.whl |
| 107 | +``` |
| 108 | +
|
| 109 | +</NewCodeBlock> |
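After either installation path, a quick sanity check is to confirm that the installed package actually lists the QNN EP. The sketch below uses `onnxruntime.get_available_providers()`; `has_qnn_ep` is a hypothetical helper name, and the import is guarded so the script reports a missing package instead of crashing. Note that this only shows the provider was compiled into the package; it does not prove the NPU is reachable.

```python
def has_qnn_ep(providers):
    """Return True if the QNN Execution Provider is in the provider list."""
    return "QNNExecutionProvider" in providers


if __name__ == "__main__":
    try:
        import onnxruntime
    except ImportError:
        print("onnxruntime is not installed in this environment")
    else:
        providers = onnxruntime.get_available_providers()
        print("available providers:", providers)
        print("QNN EP available:", has_qnn_ep(providers))
```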
| 110 | +
|
| 111 | +## Verify the QNN Execution Provider |
| 112 | +
|
| 113 | +:::tip |
| 114 | +Before verifying the QNN Execution Provider, follow [**Enable the NPU on the board**](./fastrpc_setup) and [**NPU quick validation**](./quick-example) to confirm that the NPU works properly. Test the QNN Execution Provider only after that. |
| 115 | +::: |
| 116 | +
|
| 117 | +### Set environment variables |
| 118 | +
|
| 119 | +<Tabs> |
| 120 | + <TabItem value="QCS6490"> |
| 121 | +
|
| 122 | + <NewCodeBlock tip="Device" type="device"> |
| 123 | + ```bash |
| 124 | + export PRODUCT_SOC=6490 DSP_ARCH=68 |
| 125 | + ``` |
| 126 | + </NewCodeBlock> |
| 127 | +
|
| 128 | +
|
| 129 | + </TabItem> |
| 130 | +
|
| 131 | + <TabItem value="QCS9075"> |
| 132 | +
|
| 133 | + <NewCodeBlock tip="Device" type="device"> |
| 134 | + ```bash |
| 135 | + export PRODUCT_SOC=9075 DSP_ARCH=73 |
| 136 | + ``` |
| 137 | + </NewCodeBlock> |
| 138 | +
|
| 139 | +</TabItem> |
| 140 | +
|
| 141 | +</Tabs> |
| 142 | +
|
| 143 | +<NewCodeBlock tip="Device" type="device"> |
| 144 | +
|
| 145 | +```bash |
| 146 | +cd qairt/2.37.1.250807 |
| 147 | +source bin/envsetup.sh |
| 148 | +export ADSP_LIBRARY_PATH=$QNN_SDK_ROOT/lib/hexagon-v${DSP_ARCH}/unsigned |
| 149 | +``` |
| 150 | +
|
| 151 | +</NewCodeBlock> |
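The tabs and the `ADSP_LIBRARY_PATH` export follow one pattern: the SoC determines the Hexagon DSP architecture, which in turn selects the `hexagon-v<arch>/unsigned` directory under the SDK root. The sketch below restates that mapping in Python, which can serve as a pre-flight check before launching an inference script. `SOC_TO_DSP_ARCH` and `adsp_library_path` are hypothetical names, the table contains only the two SoCs listed in the tabs, and `QNN_SDK_ROOT` is assumed to have been set by `envsetup.sh`.

```python
import os

# SoC -> Hexagon DSP architecture, as listed in the tabs above.
SOC_TO_DSP_ARCH = {"6490": "68", "9075": "73"}


def adsp_library_path(sdk_root, product_soc):
    """Build the ADSP_LIBRARY_PATH value for a given SDK root and SoC."""
    dsp_arch = SOC_TO_DSP_ARCH[str(product_soc)]
    return f"{sdk_root}/lib/hexagon-v{dsp_arch}/unsigned"


if __name__ == "__main__":
    sdk_root = os.environ.get("QNN_SDK_ROOT")
    soc = os.environ.get("PRODUCT_SOC")
    if sdk_root is None or soc is None:
        print("QNN_SDK_ROOT or PRODUCT_SOC is not set; run the exports first")
    elif soc not in SOC_TO_DSP_ARCH:
        print(f"PRODUCT_SOC={soc} is not in the table used by this sketch")
    else:
        expected = adsp_library_path(sdk_root, soc)
        print("expected ADSP_LIBRARY_PATH:", expected)
        print("directory exists:", os.path.isdir(expected))
        print("matches environment:", os.environ.get("ADSP_LIBRARY_PATH") == expected)
```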
| 152 | +
|
| 153 | +### Download an INT8-quantized ONNX model |
| 154 | +
|
| 155 | +Download a w8a8-quantized model in ONNX Runtime format from [**Qualcomm AI Hub**](https://aihub.qualcomm.com/iot/models/resnet50?chipsets=qualcomm-qcs6490). |
| 156 | +
|
| 157 | +<div style={{ textAlign: "center" }}> |
| 158 | + <img |
| 159 | + src="/img/dragon/q6a/qaihub_onnxruntime_model.webp" |
| 160 | + style={{ width: "85%" }} |
| 161 | + /> |
| 162 | +</div> |
| 163 | +
|
| 164 | +### Test the QNN Execution Provider |
| 165 | +
|
| 166 | +The following Python code creates an ONNX Runtime session with the QNN EP and runs a w8a8-quantized ONNX model on the NPU. The code is adapted from [Running a quantized model on Windows ARM64](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html#running-a-quantized-model-on-windows-arm64-onnxruntime-qnn-version--1180). |
| 167 | +
|
| 168 | +<NewCodeBlock tip="Device" type="device"> |
| 169 | +
|
| 170 | +```bash |
| 171 | +vim run_qdq_model.py |
| 172 | +``` |
| 173 | +
|
| 174 | +</NewCodeBlock> |
| 175 | +
|
| 176 | +:::tip |
| 177 | +Adjust `backend_path` to match your actual QAIRT SDK path. |
| 178 | +::: |
| 179 | +
|
| 180 | +```python |
| 181 | +# run_qdq_model.py |
| 182 | + |
| 183 | +import onnxruntime |
| 184 | +import numpy as np |
| 185 | + |
| 186 | +options = onnxruntime.SessionOptions() |
| 187 | + |
| 188 | +# (Optional) Enable configuration that raises an exception if the model can't be |
| 189 | +# run entirely on the QNN HTP backend. |
| 190 | +options.add_session_config_entry("session.disable_cpu_ep_fallback", "1") |
| 191 | + |
| 192 | +# Create an ONNX Runtime session. |
| 193 | +# TODO: Provide the path to your ONNX model |
| 194 | +session = onnxruntime.InferenceSession("job_jpy6ye005_optimized_onnx/model.onnx", |
| 195 | + sess_options=options, |
| 196 | + providers=["QNNExecutionProvider"], |
| 197 | + provider_options=[{"backend_path": "libQnnHtp.so"}]) # Path to the HTP backend library (libQnnHtp.so) in the QAIRT SDK |
| 198 | + |
| 199 | +# Run the model with your input. |
| 200 | +# TODO: Use numpy to load your actual input from a file or generate random input. |
| 201 | +input0 = np.ones((1,3,224,224), dtype=np.uint8) |
| 202 | +result = session.run(None, {"image_tensor": input0}) |
| 203 | +
|
| 204 | +# Print output. |
| 205 | +print(result) |
| 206 | +``` |
| 207 | +
|
| 208 | +<NewCodeBlock tip="Device" type="device"> |
| 209 | +
|
| 210 | +```bash |
| 211 | +python3 run_qdq_model.py |
| 212 | +``` |
| 213 | +
|
| 214 | +</NewCodeBlock> |
| 215 | +
|
| 216 | +```bash |
| 217 | +(.venv) rock@radxa-dragon-q6a:~/ssd/qualcomm/onnxruntime/build/Linux/Release$ python3 run_qdq_model.py |
| 218 | +2025-12-22 06:31:37.527811909 [W:onnxruntime:Default, device_discovery.cc:164 DiscoverDevicesForPlatform] GPU device discovery failed: device_discovery.cc:89 ReadFileContents Failed to open file: "/sys/class/drm/card0/device/vendor" |
| 219 | +/prj/qct/webtech_scratch20/mlg_user_admin/qaisw_source_repo/rel/qairt-2.37.1/point_release/SNPE_SRC/avante-tools/prebuilt/dsp/hexagon-sdk-5.4.0/ipc/fastrpc/rpcmem/src/rpcmem_android.c:38:dummy call to rpcmem_init, rpcmem APIs will be used from libxdsprpc |
| 220 | +
|
| 221 | +====== DDR bandwidth summary ====== |
| 222 | +spill_bytes=0 |
| 223 | +fill_bytes=0 |
| 224 | +write_total_bytes=65536 |
| 225 | +read_total_bytes=25976832 |
| 226 | +
|
| 227 | +[array([[47, 55, 49, 46, 56, 55, 45, 47, 44, 49, 50, 46, 44, 46, 44, 47, |
| 228 | + 46, 46, 45, 45, 47, 49, 55, 48, 47, 49, 48, 49, 52, 54, 46, 61, |
| 229 | + 51, 48, 57, 41, 47, 41, 62, 50, 50, 44, 49, 48, 48, 50, 51, 52, |
| 230 | + 46, 43, 47, 51, 51, 52, 52, 54, 44, 40, 44, 65, 60, 49, 52, 57, |
| 231 | + 55, 50, 55, 47, 55, 57, 47, 62, 44, 62, 44, 51, 50, 53, 57, 57, |
| 232 | + 47, 51, 46, 39, 45, 44, 45, 46, 49, 52, 46, 42, 50, 48, 54, 42, |
| 233 | + 50, 36, 43, 47, 48, 44, 43, 54, 46, 41, 46, 63, 52, 46, 51, 75, |
| 234 | + 58, 58, 51, 49, 50, 64, 41, 44, 49, 43, 45, 47, 48, 50, 62, 50, |
| 235 | + 52, 52, 49, 44, 54, 41, 43, 46, 40, 42, 40, 41, 45, 46, 41, 46, |
| 236 | + 46, 47, 45, 49, 46, 50, 43, 51, 50, 52, 46, 47, 45, 43, 48, 46, |
| 237 | + 42, 48, 48, 52, 47, 47, 47, 47, 44, 45, 44, 47, 46, 49, 39, 45, |
| 238 | + 43, 45, 53, 44, 45, 47, 43, 44, 46, 48, 44, 51, 45, 48, 50, 46, |
| 239 | + 41, 44, 46, 52, 45, 38, 42, 44, 44, 41, 42, 51, 53, 37, 43, 48, |
| 240 | + 48, 44, 40, 43, 43, 44, 43, 46, 50, 45, 42, 46, 50, 48, 48, 50, |
| 241 | + 49, 42, 41, 41, 47, 45, 43, 46, 48, 47, 44, 46, 48, 45, 45, 48, |
| 242 | + 42, 45, 44, 42, 46, 46, 48, 45, 44, 43, 50, 49, 48, 45, 52, 36, |
| 243 | + 42, 47, 47, 46, 49, 42, 50, 43, 48, 47, 48, 43, 44, 48, 51, 47, |
| 244 | + 48, 43, 47, 45, 50, 55, 47, 50, 50, 53, 48, 57, 51, 58, 46, 46, |
| 245 | + 53, 48, 45, 48, 44, 50, 47, 43, 47, 48, 47, 53, 47, 54, 44, 53, |
| 246 | + 47, 45, 56, 58, 57, 46, 57, 51, 56, 55, 58, 58, 52, 55, 59, 53, |
| 247 | + 50, 42, 40, 46, 51, 44, 56, 51, 52, 42, 44, 50, 49, 48, 43, 45, |
| 248 | + 42, 45, 47, 42, 46, 46, 42, 39, 39, 47, 41, 45, 45, 46, 48, 47, |
| 249 | + 44, 47, 49, 46, 52, 45, 50, 50, 45, 52, 52, 49, 52, 47, 45, 50, |
| 250 | + 44, 44, 44, 45, 41, 45, 45, 44, 50, 50, 48, 41, 49, 45, 46, 46, |
| 251 | + 46, 47, 41, 45, 44, 52, 48, 43, 50, 45, 47, 50, 48, 52, 54, 64, |
| 252 | + 50, 62, 61, 48, 45, 52, 45, 45, 44, 64, 42, 47, 48, 60, 47, 43, |
| 253 | + 67, 54, 63, 63, 52, 60, 54, 55, 51, 50, 53, 55, 46, 61, 51, 45, |
| 254 | + 58, 53, 49, 57, 45, 57, 64, 53, 56, 60, 59, 52, 47, 51, 59, 55, |
| 255 | + 49, 46, 42, 60, 46, 51, 40, 54, 54, 61, 44, 56, 44, 55, 58, 55, |
| 256 | + 60, 60, 48, 44, 53, 58, 68, 50, 43, 63, 46, 54, 40, 52, 54, 60, |
| 257 | + 55, 62, 57, 49, 44, 58, 59, 62, 64, 46, 55, 57, 53, 49, 55, 46, |
| 258 | + 48, 54, 59, 68, 49, 56, 51, 61, 61, 52, 57, 61, 60, 39, 50, 44, |
| 259 | + 63, 64, 48, 57, 52, 57, 51, 52, 44, 46, 49, 56, 51, 43, 53, 60, |
| 260 | + 57, 55, 71, 62, 43, 47, 58, 52, 45, 41, 53, 59, 48, 56, 64, 57, |
| 261 | + 51, 54, 61, 41, 45, 59, 54, 59, 58, 54, 43, 44, 52, 56, 59, 55, |
| 262 | + 52, 48, 57, 60, 43, 45, 51, 57, 52, 46, 61, 48, 60, 48, 64, 42, |
| 263 | + 45, 57, 53, 59, 48, 48, 46, 62, 58, 60, 43, 61, 50, 49, 53, 55, |
| 264 | + 55, 64, 57, 43, 62, 51, 54, 56, 63, 53, 62, 39, 70, 61, 61, 64, |
| 265 | + 55, 45, 54, 51, 44, 56, 51, 55, 63, 54, 58, 67, 55, 46, 61, 63, |
| 266 | + 40, 41, 73, 50, 51, 66, 51, 57, 58, 61, 39, 59, 52, 49, 53, 43, |
| 267 | + 45, 62, 55, 64, 77, 44, 52, 55, 48, 51, 69, 54, 53, 55, 47, 46, |
| 268 | + 44, 51, 52, 51, 45, 39, 55, 49, 60, 45, 65, 53, 51, 41, 45, 46, |
| 269 | + 52, 60, 65, 42, 48, 65, 58, 59, 59, 60, 54, 58, 61, 60, 59, 48, |
| 270 | + 57, 48, 38, 54, 60, 50, 45, 60, 66, 49, 49, 62, 60, 52, 54, 49, |
| 271 | + 54, 41, 40, 53, 57, 53, 60, 68, 56, 57, 66, 47, 54, 41, 47, 59, |
| 272 | + 69, 43, 63, 52, 49, 60, 52, 51, 53, 50, 46, 62, 55, 56, 44, 49, |
| 273 | + 62, 59, 52, 51, 56, 53, 50, 53, 56, 59, 52, 58, 52, 64, 47, 49, |
| 274 | + 52, 57, 60, 54, 48, 39, 51, 58, 60, 66, 40, 61, 57, 50, 49, 65, |
| 275 | + 48, 66, 56, 53, 66, 60, 54, 48, 66, 56, 58, 46, 49, 53, 57, 63, |
| 276 | + 63, 57, 50, 52, 36, 60, 48, 51, 57, 52, 48, 50, 58, 49, 56, 54, |
| 277 | + 51, 46, 46, 44, 62, 48, 56, 47, 49, 54, 54, 49, 59, 61, 47, 48, |
| 278 | + 43, 47, 72, 57, 42, 49, 53, 57, 49, 47, 70, 57, 61, 43, 49, 54, |
| 279 | + 51, 47, 58, 48, 59, 62, 52, 56, 54, 54, 48, 48, 58, 70, 65, 45, |
| 280 | + 56, 55, 55, 61, 69, 44, 68, 64, 40, 55, 51, 53, 50, 57, 62, 53, |
| 281 | + 46, 36, 45, 51, 50, 51, 46, 45, 57, 55, 37, 61, 53, 52, 53, 57, |
| 282 | + 55, 60, 51, 64, 49, 56, 56, 48, 53, 60, 45, 47, 59, 58, 51, 47, |
| 283 | + 60, 53, 61, 57, 52, 61, 64, 57, 53, 62, 60, 58, 57, 50, 54, 48, |
| 284 | + 39, 47, 41, 41, 65, 47, 52, 57, 59, 50, 47, 47, 49, 47, 45, 52, |
| 285 | + 49, 56, 50, 47, 49, 46, 51, 48, 49, 53, 52, 39, 49, 42, 50, 50, |
| 286 | + 46, 55, 48, 46, 62, 58, 59, 59, 51, 51, 59, 45, 52, 54, 51, 49, |
| 287 | + 51, 48, 47, 48, 48, 48, 66, 60, 66, 57, 45, 61, 44, 40, 52, 48, |
| 288 | + 46, 43, 52, 43, 56, 52, 50, 52, 48, 61, 52, 52, 53, 45, 52, 45, |
| 289 | + 42, 40, 43, 44, 40, 41, 51, 61]], dtype=uint8)] |
| 290 | +/prj/qct/webtech_scratch20/mlg_user_admin/qaisw_source_repo/rel/qairt-2.37.1/point_release/SNPE_SRC/avante-tools/prebuilt/dsp/hexagon-sdk-5.4.0/ipc/fastrpc/rpcmem/src/rpcmem_android.c:42:dummy call to rpcmem_deinit, rpcmem APIs will be used from libxdsprpc |
| 291 | +``` |
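`run_qdq_model.py` hard-codes the input name `image_tensor`, the shape `(1, 3, 224, 224)`, and the `uint8` dtype. These match the model used here but may differ for other models. A minimal sketch for listing what a session actually expects, based on the `get_inputs()` method of `InferenceSession`, follows. `describe_inputs` is a hypothetical helper name, and the demo uses a stand-in object so that it runs without a model or an NPU.

```python
from types import SimpleNamespace


def describe_inputs(session):
    """Return (name, type, shape) for every input the session declares."""
    return [(arg.name, arg.type, arg.shape) for arg in session.get_inputs()]


if __name__ == "__main__":
    # Stand-in for an onnxruntime.InferenceSession, for illustration only.
    # The values mirror the input used in run_qdq_model.py.
    fake_input = SimpleNamespace(
        name="image_tensor", type="tensor(uint8)", shape=[1, 3, 224, 224]
    )
    fake_session = SimpleNamespace(get_inputs=lambda: [fake_input])
    for name, type_, shape in describe_inputs(fake_session):
        print(name, type_, shape)
```

With a real session, call `describe_inputs(session)` right after creating it and use the reported name, dtype, and shape when building `input0`.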
| 292 | +
|
| 293 | +## Further documentation |
| 294 | +
|
| 295 | +For detailed usage of the QNN Execution Provider, see: |
| 296 | +
|
| 297 | +- [**QNN Execution Provider**](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html#qnn-execution-provider) |