303 changes: 303 additions & 0 deletions docs/common/ai/_qnn_onnxrt_execution_provider.mdx
@@ -0,0 +1,303 @@
ONNX Runtime's **[QNN Execution Provider](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html)** enables **NPU-accelerated inference of ONNX models** on Qualcomm SoC platforms.
It uses the **[Qualcomm® AI Runtime (QAIRT SDK)](./qairt-sdk#qairt)** to build an **ONNX model** into a **QNN compute graph**, which is then executed through an **accelerator backend library**.
The ONNX Runtime **QNN Execution Provider** is available on **Linux**, **Android**, and **Windows** devices powered by **Qualcomm SoCs**.

## Supported Devices

- [**Radxa Dragon Q6A**](/dragon/q6a/) (Linux)

- [**Radxa Fogwise AIRbox Q900**](/fogwise/airbox-q900) (Linux)

## Installation

:::tip
Two installation methods are available: **install via pip** or **build from source**.

Whichever you choose, first download the QAIRT SDK by following [**QAIRT SDK Installation**](./qairt-install).
:::

### Create a Python virtual environment

<NewCodeBlock tip="Device" type="device">

```bash
sudo apt install python3-venv
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip
```

</NewCodeBlock>

### Install via pip

Radxa provides a prebuilt `onnxruntime-qnn` whl package for Linux:

<NewCodeBlock tip="Device" type="device">

```bash
pip3 install https://github.com/ZIFENG278/onnxruntime/releases/download/v1.23.2/onnxruntime_qnn-1.23.2-cp312-cp312-linux_aarch64.whl
```

</NewCodeBlock>

### Build from source

#### Clone the onnxruntime repository

<NewCodeBlock tip="Device" type="device">

```bash
git clone --depth 1 -b v1.23.2 https://github.com/microsoft/onnxruntime.git
```

</NewCodeBlock>

#### Modify CMakeLists.txt

Because upstream onnxruntime does not build the QNN EP for Linux out of the box, producing an onnxruntime-qnn whl package for Linux requires a manual change at line 840 of `cmake/CMakeLists.txt`.

Change L840 from `set(QNN_ARCH_ABI aarch64-android)` to `set(QNN_ARCH_ABI aarch64-oe-linux-gcc11.2)`.

:::tip
No change is needed when building for **Android** or **Windows**.
:::

<NewCodeBlock tip="Device" type="device">

```bash
cd onnxruntime
vim cmake/CMakeLists.txt
```

</NewCodeBlock>
```bash
diff --git a/cmake/CMakeLists.txt b/cmake/CMakeLists.txt
index 0b37ade..f4621e5 100644
--- a/cmake/CMakeLists.txt
+++ b/cmake/CMakeLists.txt
@@ -837,7 +837,7 @@ if (onnxruntime_USE_QNN OR onnxruntime_USE_QNN_INTERFACE)
if (${GEN_PLATFORM} STREQUAL "x86_64")
set(QNN_ARCH_ABI x86_64-linux-clang)
else()
- set(QNN_ARCH_ABI aarch64-android)
+ set(QNN_ARCH_ABI aarch64-oe-linux-gcc11.2)
endif()
endif()
endif()
```
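If you prefer not to edit the file by hand, the same one-line change can be applied non-interactively with `sed`. This is a convenience sketch: it assumes the stock line reads exactly `set(QNN_ARCH_ABI aarch64-android)`, as in onnxruntime v1.23.2.

```shell
# Replace the Android ABI with the Linux one in cmake/CMakeLists.txt
# (assumes the file contains the exact string below, as in onnxruntime v1.23.2)
sed -i 's/set(QNN_ARCH_ABI aarch64-android)/set(QNN_ARCH_ABI aarch64-oe-linux-gcc11.2)/' cmake/CMakeLists.txt
# Confirm the substitution took effect
grep -n 'QNN_ARCH_ABI aarch64-oe-linux-gcc11.2' cmake/CMakeLists.txt
```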

#### Build the project

:::tip
Replace `[QNN_SDK_PATH]` in the command below with the actual installation path of your QAIRT SDK.
:::

<NewCodeBlock tip="Device" type="device">

```bash
pip3 install -r requirements.txt
./build.sh --use_qnn --qnn_home [QNN_SDK_PATH] --build_shared_lib --build_wheel --config Release --parallel --skip_tests --build_dir build/Linux
```

</NewCodeBlock>

After the build finishes, the target whl package is generated under `build/Linux/Release/dist`:

<NewCodeBlock tip="Device" type="device">

```bash
pip3 install ./build/Linux/Release/dist/onnxruntime_qnn-1.23.2-cp312-cp312-linux_aarch64.whl
```

</NewCodeBlock>

## Verify the QNN Execution Provider

:::tip
Before verifying the QNN Execution Provider, follow [**Enable the NPU on the board**](./fastrpc_setup) and [**NPU Quick Verification**](./quick-example) to confirm the NPU works correctly, then test the QNN Execution Provider.
:::

### Export environment variables

<Tabs>
<TabItem value="QCS6490">

<NewCodeBlock tip="Device" type="device">
```bash
export PRODUCT_SOC=6490 DSP_ARCH=68
```
</NewCodeBlock>


</TabItem>

<TabItem value="QCS9075">

<NewCodeBlock tip="Device" type="device">
```bash
export PRODUCT_SOC=9075 DSP_ARCH=73
```
</NewCodeBlock>

</TabItem>

</Tabs>

<NewCodeBlock tip="Device" type="device">

```bash
cd qairt/2.37.1.250807
source bin/envsetup.sh
export ADSP_LIBRARY_PATH=$QNN_SDK_ROOT/lib/hexagon-v${DSP_ARCH}/unsigned
```

</NewCodeBlock>

### Download an INT8-quantized ONNX model

Download a w8a8-quantized model in ONNX Runtime format from [**Qualcomm AI Hub**](https://aihub.qualcomm.com/iot/models/resnet50?chipsets=qualcomm-qcs6490):

<div style={{ textAlign: "center" }}>
<img
src="/img/dragon/q6a/qaihub_onnxruntime_model.webp"
style={{ width: "85%" }}
/>
</div>

### Test the QNN Execution Provider

The following Python code creates an ONNX Runtime session with the QNN EP and runs the w8a8-quantized ONNX model on the NPU. It is adapted from [Running a quantized model on Windows ARM64](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html#running-a-quantized-model-on-windows-arm64-onnxruntime-qnn-version--1180).

<NewCodeBlock tip="Device" type="device">

```bash
vim run_qdq_model.py
```

</NewCodeBlock>

:::tip
Set `backend_path` according to your actual QAIRT SDK path. If `libQnnHtp.so` is not on the library search path, pass its full path, e.g. `$QNN_SDK_ROOT/lib/aarch64-oe-linux-gcc11.2/libQnnHtp.so`.

Update the model path passed to `InferenceSession` to match where you saved the downloaded ONNX model.
:::

```python
# run_qdq_model.py

import onnxruntime
import numpy as np

options = onnxruntime.SessionOptions()

# (Optional) Enable configuration that raises an exception if the model can't be
# run entirely on the QNN HTP backend.
options.add_session_config_entry("session.disable_cpu_ep_fallback", "1")

# Create an ONNX Runtime session.
# TODO: Provide the path to your ONNX model
session = onnxruntime.InferenceSession("job_jpy6ye005_optimized_onnx/model.onnx",
                                       sess_options=options,
                                       providers=["QNNExecutionProvider"],
                                       provider_options=[{"backend_path": "libQnnHtp.so"}]) # Path to the HTP backend library in the QAIRT SDK

# Run the model with your input.
# TODO: Use numpy to load your actual input from a file or generate random input.
input0 = np.ones((1,3,224,224), dtype=np.uint8)
result = session.run(None, {"image_tensor": input0})

# Print output.
print(result)
```
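The script above feeds `np.ones` as dummy input. To classify a real image, the uint8 HWC pixel array must be rearranged into the NCHW batch layout the model expects. A minimal numpy-only sketch, assuming a 224×224 RGB uint8 input as in the example; the helper name `to_model_input` is illustrative, not part of the original script:

```python
import numpy as np

def to_model_input(hwc_image: np.ndarray) -> np.ndarray:
    """Convert a (224, 224, 3) uint8 HWC image into the (1, 3, 224, 224)
    NCHW uint8 tensor the quantized ResNet50 example expects."""
    assert hwc_image.shape == (224, 224, 3) and hwc_image.dtype == np.uint8
    # HWC -> CHW, then prepend the batch dimension
    return hwc_image.transpose(2, 0, 1)[np.newaxis, ...]

# Illustration with random pixels; load real pixels from disk in practice.
pixels = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
input0 = to_model_input(pixels)
print(input0.shape)  # (1, 3, 224, 224)
```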

<NewCodeBlock tip="Device" type="device">

```bash
python3 run_qdq_model.py
```

</NewCodeBlock>

```bash
(.venv) rock@radxa-dragon-q6a:~/ssd/qualcomm/onnxruntime/build/Linux/Release$ python3 run_qdq_model.py
2025-12-22 06:31:37.527811909 [W:onnxruntime:Default, device_discovery.cc:164 DiscoverDevicesForPlatform] GPU device discovery failed: device_discovery.cc:89 ReadFileContents Failed to open file: "/sys/class/drm/card0/device/vendor"
/prj/qct/webtech_scratch20/mlg_user_admin/qaisw_source_repo/rel/qairt-2.37.1/point_release/SNPE_SRC/avante-tools/prebuilt/dsp/hexagon-sdk-5.4.0/ipc/fastrpc/rpcmem/src/rpcmem_android.c:38:dummy call to rpcmem_init, rpcmem APIs will be used from libxdsprpc

====== DDR bandwidth summary ======
spill_bytes=0
fill_bytes=0
write_total_bytes=65536
read_total_bytes=25976832

[array([[47, 55, 49, 46, 56, 55, 45, 47, 44, 49, 50, 46, 44, 46, 44, 47,
46, 46, 45, 45, 47, 49, 55, 48, 47, 49, 48, 49, 52, 54, 46, 61,
51, 48, 57, 41, 47, 41, 62, 50, 50, 44, 49, 48, 48, 50, 51, 52,
46, 43, 47, 51, 51, 52, 52, 54, 44, 40, 44, 65, 60, 49, 52, 57,
55, 50, 55, 47, 55, 57, 47, 62, 44, 62, 44, 51, 50, 53, 57, 57,
47, 51, 46, 39, 45, 44, 45, 46, 49, 52, 46, 42, 50, 48, 54, 42,
50, 36, 43, 47, 48, 44, 43, 54, 46, 41, 46, 63, 52, 46, 51, 75,
58, 58, 51, 49, 50, 64, 41, 44, 49, 43, 45, 47, 48, 50, 62, 50,
52, 52, 49, 44, 54, 41, 43, 46, 40, 42, 40, 41, 45, 46, 41, 46,
46, 47, 45, 49, 46, 50, 43, 51, 50, 52, 46, 47, 45, 43, 48, 46,
42, 48, 48, 52, 47, 47, 47, 47, 44, 45, 44, 47, 46, 49, 39, 45,
43, 45, 53, 44, 45, 47, 43, 44, 46, 48, 44, 51, 45, 48, 50, 46,
41, 44, 46, 52, 45, 38, 42, 44, 44, 41, 42, 51, 53, 37, 43, 48,
48, 44, 40, 43, 43, 44, 43, 46, 50, 45, 42, 46, 50, 48, 48, 50,
49, 42, 41, 41, 47, 45, 43, 46, 48, 47, 44, 46, 48, 45, 45, 48,
42, 45, 44, 42, 46, 46, 48, 45, 44, 43, 50, 49, 48, 45, 52, 36,
42, 47, 47, 46, 49, 42, 50, 43, 48, 47, 48, 43, 44, 48, 51, 47,
48, 43, 47, 45, 50, 55, 47, 50, 50, 53, 48, 57, 51, 58, 46, 46,
53, 48, 45, 48, 44, 50, 47, 43, 47, 48, 47, 53, 47, 54, 44, 53,
47, 45, 56, 58, 57, 46, 57, 51, 56, 55, 58, 58, 52, 55, 59, 53,
50, 42, 40, 46, 51, 44, 56, 51, 52, 42, 44, 50, 49, 48, 43, 45,
42, 45, 47, 42, 46, 46, 42, 39, 39, 47, 41, 45, 45, 46, 48, 47,
44, 47, 49, 46, 52, 45, 50, 50, 45, 52, 52, 49, 52, 47, 45, 50,
44, 44, 44, 45, 41, 45, 45, 44, 50, 50, 48, 41, 49, 45, 46, 46,
46, 47, 41, 45, 44, 52, 48, 43, 50, 45, 47, 50, 48, 52, 54, 64,
50, 62, 61, 48, 45, 52, 45, 45, 44, 64, 42, 47, 48, 60, 47, 43,
67, 54, 63, 63, 52, 60, 54, 55, 51, 50, 53, 55, 46, 61, 51, 45,
58, 53, 49, 57, 45, 57, 64, 53, 56, 60, 59, 52, 47, 51, 59, 55,
49, 46, 42, 60, 46, 51, 40, 54, 54, 61, 44, 56, 44, 55, 58, 55,
60, 60, 48, 44, 53, 58, 68, 50, 43, 63, 46, 54, 40, 52, 54, 60,
55, 62, 57, 49, 44, 58, 59, 62, 64, 46, 55, 57, 53, 49, 55, 46,
48, 54, 59, 68, 49, 56, 51, 61, 61, 52, 57, 61, 60, 39, 50, 44,
63, 64, 48, 57, 52, 57, 51, 52, 44, 46, 49, 56, 51, 43, 53, 60,
57, 55, 71, 62, 43, 47, 58, 52, 45, 41, 53, 59, 48, 56, 64, 57,
51, 54, 61, 41, 45, 59, 54, 59, 58, 54, 43, 44, 52, 56, 59, 55,
52, 48, 57, 60, 43, 45, 51, 57, 52, 46, 61, 48, 60, 48, 64, 42,
45, 57, 53, 59, 48, 48, 46, 62, 58, 60, 43, 61, 50, 49, 53, 55,
55, 64, 57, 43, 62, 51, 54, 56, 63, 53, 62, 39, 70, 61, 61, 64,
55, 45, 54, 51, 44, 56, 51, 55, 63, 54, 58, 67, 55, 46, 61, 63,
40, 41, 73, 50, 51, 66, 51, 57, 58, 61, 39, 59, 52, 49, 53, 43,
45, 62, 55, 64, 77, 44, 52, 55, 48, 51, 69, 54, 53, 55, 47, 46,
44, 51, 52, 51, 45, 39, 55, 49, 60, 45, 65, 53, 51, 41, 45, 46,
52, 60, 65, 42, 48, 65, 58, 59, 59, 60, 54, 58, 61, 60, 59, 48,
57, 48, 38, 54, 60, 50, 45, 60, 66, 49, 49, 62, 60, 52, 54, 49,
54, 41, 40, 53, 57, 53, 60, 68, 56, 57, 66, 47, 54, 41, 47, 59,
69, 43, 63, 52, 49, 60, 52, 51, 53, 50, 46, 62, 55, 56, 44, 49,
62, 59, 52, 51, 56, 53, 50, 53, 56, 59, 52, 58, 52, 64, 47, 49,
52, 57, 60, 54, 48, 39, 51, 58, 60, 66, 40, 61, 57, 50, 49, 65,
48, 66, 56, 53, 66, 60, 54, 48, 66, 56, 58, 46, 49, 53, 57, 63,
63, 57, 50, 52, 36, 60, 48, 51, 57, 52, 48, 50, 58, 49, 56, 54,
51, 46, 46, 44, 62, 48, 56, 47, 49, 54, 54, 49, 59, 61, 47, 48,
43, 47, 72, 57, 42, 49, 53, 57, 49, 47, 70, 57, 61, 43, 49, 54,
51, 47, 58, 48, 59, 62, 52, 56, 54, 54, 48, 48, 58, 70, 65, 45,
56, 55, 55, 61, 69, 44, 68, 64, 40, 55, 51, 53, 50, 57, 62, 53,
46, 36, 45, 51, 50, 51, 46, 45, 57, 55, 37, 61, 53, 52, 53, 57,
55, 60, 51, 64, 49, 56, 56, 48, 53, 60, 45, 47, 59, 58, 51, 47,
60, 53, 61, 57, 52, 61, 64, 57, 53, 62, 60, 58, 57, 50, 54, 48,
39, 47, 41, 41, 65, 47, 52, 57, 59, 50, 47, 47, 49, 47, 45, 52,
49, 56, 50, 47, 49, 46, 51, 48, 49, 53, 52, 39, 49, 42, 50, 50,
46, 55, 48, 46, 62, 58, 59, 59, 51, 51, 59, 45, 52, 54, 51, 49,
51, 48, 47, 48, 48, 48, 66, 60, 66, 57, 45, 61, 44, 40, 52, 48,
46, 43, 52, 43, 56, 52, 50, 52, 48, 61, 52, 52, 53, 45, 52, 45,
42, 40, 43, 44, 40, 41, 51, 61]], dtype=uint8)]
/prj/qct/webtech_scratch20/mlg_user_admin/qaisw_source_repo/rel/qairt-2.37.1/point_release/SNPE_SRC/avante-tools/prebuilt/dsp/hexagon-sdk-5.4.0/ipc/fastrpc/rpcmem/src/rpcmem_android.c:42:dummy call to rpcmem_deinit, rpcmem APIs will be used from libxdsprpc
```
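The run prints a `(1, 1000)` uint8 score array. For a classification model such as ResNet50, the predicted class is the index of the highest score. A small helper to extract it; the function name `top1_class` and the fabricated scores are illustrative, not part of the original script:

```python
import numpy as np

def top1_class(result):
    """Return the index of the highest score from session.run() output
    shaped like [array of shape (1, num_classes)]."""
    scores = result[0]
    return int(np.argmax(scores, axis=1)[0])

# Illustration with fabricated scores: class 42 is given the highest value.
fake = np.zeros((1, 1000), dtype=np.uint8)
fake[0, 42] = 255
print(top1_class([fake]))  # 42
```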

## Detailed Documentation

For detailed usage of the QNNExecutionProvider, refer to:

- [**QNN Execution Provider**](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html#qnn-execution-provider)
2 changes: 1 addition & 1 deletion docs/dragon/q6a/app-dev/npu-dev/llama3.2-1b-qairt-v68.md
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 10
+sidebar_position: 11
 ---
 
 # Llama3.2-1B Large Model
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 9
+sidebar_position: 99
 ---
 
 # Demos
@@ -0,0 +1,9 @@
---
sidebar_position: 10
---

# QNN Execution Provider

import QNNONNXRTEXECUTIONPROVIDER from '../../../../common/ai/\_qnn_onnxrt_execution_provider.mdx';

<QNNONNXRTEXECUTIONPROVIDER />
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 9
+sidebar_position: 99
 ---
 
 # Demos
@@ -0,0 +1,9 @@
---
sidebar_position: 10
---

# QNN Execution Provider

import QNNONNXRTEXECUTIONPROVIDER from '../../../common/ai/\_qnn_onnxrt_execution_provider.mdx';

<QNNONNXRTEXECUTIONPROVIDER />