radxa-docs · jack-ma · Jan 5, 2026 · Jan 5, 2026
@@ -1,4 +1,4 @@
-This document describes how to use Llama.cpp with [KleidiAI](https://www.arm.com/markets/artificial-intelligence/software/kleidi) enabled on the Radxa Orion O6/O6N to accelerate inference of Baidu's ERNIE models [ERNIE-4.5-0.3B](https://huggingface.co/baidu/ERNIE-4.5-0.3B-PT) and [ERNIE-4.5-0.3B-Base](https://huggingface.co/baidu/ERNIE-4.5-0.3B-Base-PT).
+This document describes how to use llama.cpp with [KleidiAI](https://www.arm.com/markets/artificial-intelligence/software/kleidi) enabled on the Radxa Orion O6 / O6N to accelerate inference of Baidu's ERNIE models [ERNIE-4.5-0.3B](https://huggingface.co/baidu/ERNIE-4.5-0.3B-PT) and [ERNIE-4.5-0.3B-Base](https://huggingface.co/baidu/ERNIE-4.5-0.3B-Base-PT).
 
 Model links:
 
@@ -47,12 +47,12 @@ Radxa provides a prebuilt [ERNIE-4.5-0.3B-PT-Q4_0.gguf](https://modelscope.cn/
 If you prefer to skip model conversion, download the GGUF model provided by Radxa and jump to [**Model Inference**](#模型推理).
 :::
 
-### Build Llama.cpp
+### Build llama.cpp
 
-Build Llama.cpp on an x86 host
+Build llama.cpp on an x86 host
 
 :::tip
-Follow [**Llama.cpp**](./llama_cpp) to build Llama.cpp on an x86 host
+Follow [**llama.cpp**](../../orion/o6/app-development/artificial-intelligence/llama_cpp.md) to build llama.cpp on an x86 host
 :::
 
 The build commands are as follows:
@@ -168,10 +168,10 @@ cmake --build build --config Release
 
 ## Model Inference
 
-### Build Llama.cpp
+### Build llama.cpp
 
 :::tip
-Follow [**Llama.cpp**](./llama_cpp) to build Llama.cpp with **KleidiAI** support on the Radxa Orion O6/O6N
+Follow [**llama.cpp**](../../orion/o6/app-development/artificial-intelligence/llama_cpp.md) to build llama.cpp with **KleidiAI** support on the Radxa Orion O6/O6N
 :::
 
 The build commands are as follows:

@@ -1,4 +1,4 @@
-This document describes how to use Llama.cpp with [KleidiAI](https://www.arm.com/markets/artificial-intelligence/software/kleidi) enabled on the Radxa Orion O6/O6N to accelerate inference of Baidu's ERNIE models [ERNIE-4.5-21B-A3B](https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-PT) and [ERNIE-4.5-21B-A3B-Base](https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Base-PT).
+This document describes how to use llama.cpp with [KleidiAI](https://www.arm.com/markets/artificial-intelligence/software/kleidi) enabled on the Radxa Orion O6 / O6N to accelerate inference of Baidu's ERNIE models [ERNIE-4.5-21B-A3B](https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-PT) and [ERNIE-4.5-21B-A3B-Base](https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Base-PT).
 
 Model links:
 
@@ -47,12 +47,12 @@ Radxa provides a prebuilt [ERNIE-4.5-21B-A3B-PT-Q4_0.gguf](https://modelscope.
 If you prefer to skip model conversion, download the GGUF model provided by Radxa and jump to [**Model Inference**](#模型推理).
 :::
 
-### Build Llama.cpp
+### Build llama.cpp
 
-Build Llama.cpp on an x86 host
+Build llama.cpp on an x86 host
 
 :::tip
-Follow [**Llama.cpp**](./llama_cpp) to build Llama.cpp on an x86 host
+Follow [**llama.cpp**](../../orion/o6/app-development/artificial-intelligence/llama_cpp.md) to build llama.cpp on an x86 host
 :::
 
 The build commands are as follows:
@@ -94,6 +94,7 @@ cmake --build build --config Release
     pip3 install modelscope
     modelscope download --model PaddlePaddle/ERNIE-4.5-21B-A3B-Base-PT --local_dir ./ERNIE-4.5-21B-A3B-Base-PT
     ```
+
     </NewCodeBlock>
 
     </TabItem>
@@ -124,6 +125,7 @@ cmake --build build --config Release
     cd llama.cpp
     python3 convert_hf_to_gguf.py ./ERNIE-4.5-21B-A3B-Base-PT
     ```
+
     </NewCodeBlock>
 
     </TabItem>
@@ -168,10 +170,10 @@ cmake --build build --config Release
 
 ## Model Inference
 
-### Build Llama.cpp
+### Build llama.cpp
 
 :::tip
-Follow [**Llama.cpp**](./llama_cpp) to build Llama.cpp with **KleidiAI** support on the Radxa Orion O6/O6N
+Follow [**llama.cpp**](../../orion/o6/app-development/artificial-intelligence/llama_cpp.md) to build llama.cpp with **KleidiAI** support on the Radxa Orion O6/O6N
 :::
 
 The build commands are as follows:

@@ -1,86 +1,117 @@
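-The main goal of llama.cpp is to enable LLM inference with minimal setup and well-optimized performance on a wide range of hardware, both locally and in the cloud.
+The core goal of llama.cpp is to deliver high-performance large language model (LLM) inference with minimal configuration on a wide range of hardware, from local machines to the cloud.
+
+:::note[What is llama.cpp?]
+llama.cpp is a high-performance LLM inference framework implemented in plain C/C++. It avoids heavy external library dependencies and runs efficiently on both CPUs and GPUs. With the GGUF format it introduced and its quantization support, models that would otherwise be too large can run smoothly on consumer devices such as ordinary PCs, Macs, and even phones.
+:::
+
+This document walks you through getting started with llama.cpp, from setting up the environment to running a model.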
 
 ## Clone the Repository
 
+<NewCodeBlock tip="Device" type="device">
+
 ```bash
 git clone https://github.com/ggml-org/llama.cpp.git
 ```
 
+</NewCodeBlock>
+
 ## Build llama.cpp
 
 ### Install Build Tools
 
+<NewCodeBlock tip="Device" type="device">
+
 ```bash
 sudo apt install cmake gcc g++
 ```
 
-### Build the Project
+</NewCodeBlock>
+
+### Run the Build
+
+<NewCodeBlock tip="Device" type="device">
 
 ```bash
 cmake -B build
-cmake --build build --config Release
+cmake --build build --config Release -j$(nproc)
 ```
 
-:::tip
-If you are using a [Radxa Orion O6 / O6N](/orion/o6), which has an Armv9 CPU, you can add the `armv9-a` and `KleidiAI` build options for hardware-level optimization
+</NewCodeBlock>
+
+:::info[KleidiAI]
+On [Radxa Orion O6 / O6N](/orion/o6) devices, which use the Armv9 architecture, you can enable the armv9-a and KleidiAI build options for hardware-level optimization.
+:::
+
+<NewCodeBlock tip="Device" type="device">
 
 ```bash
 cmake -B build -DGGML_NATIVE=OFF -DGGML_CPU_ARM_ARCH=armv9-a+i8mm+dotprod -DGGML_CPU_KLEIDIAI=ON
 cmake --build build --config Release
 ```
 
+</NewCodeBlock>
+
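+:::tip[Hardware Optimization]
+Llama.cpp integrates the Arm KleidiAI library, which provides matrix multiplication kernels optimized for hardware features such as SME, I8MM, and dot-product acceleration. You can enable it with the build option GGML_CPU_KLEIDIAI=ON.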
 :::
 
-:::tip
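-Llama.cpp integrates Arm's KleidiAI library, which provides matrix multiplication kernels optimized for hardware features such as sme, i8mm, and dot-product acceleration.
-Use the build option `GGML_CPU_KLEIDIAI` to enable it.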
+<NewCodeBlock tip="Device" type="device">
 
 ```bash
 cmake -B build -DGGML_CPU_KLEIDIAI=ON
 cmake --build build --config Release
 ```
 
-:::
-
-## Usage
+</NewCodeBlock>
 
-### GGUF Model Conversion
+## Quick Start
 
 :::tip
-This section uses [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) as an example
+Python 3.11 or later is recommended
 :::
 
-#### Download the Hugging Face Model
+The following steps use [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) as the example model.
 
-Use [Git LFS](https://git-lfs.com/) to clone the repository
+### Download the Example Model
+
+Use [Git LFS](https://git-lfs.com/) to clone the model repository.
+
+<NewCodeBlock tip="Device" type="device">
 
 ```bash
 git lfs install
 git clone https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
 ```
 
-#### Generate the GGUF Model
+</NewCodeBlock>
 
-:::tip
-Python 3.11 or later is recommended
-:::
+### Model Conversion
+
+<NewCodeBlock tip="Device" type="device">
 
 ```bash
 cd llama.cpp
 pip3 install -r ./requirements.txt
 python3 convert_hf_to_gguf.py DeepSeek-R1-Distill-Qwen-1.5B/
 ```
 
-#### Quantize the Model
+</NewCodeBlock>
+
+### Model Quantization
+
+<NewCodeBlock tip="Device" type="device">
 
 ```bash
 cd build/bin
 ./llama-quantize DeepSeek-R1-Distill-Qwen-1.5B-F16.gguf DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf Q4_K_M
 ```
 
-Available quantization types
+</NewCodeBlock>
 
-```bash
+Available quantization options:
+
+```txt
    2  or  Q4_0    :  4.34G, +0.4685 ppl @ Llama-3-8B
    3  or  Q4_1    :  4.78G, +0.4511 ppl @ Llama-3-8B
    8  or  Q5_0    :  5.21G, +0.1316 ppl @ Llama-3-8B
@@ -119,14 +150,20 @@ cd build/bin
           COPY    : only copy tensors, no quantizing
 ```
 
-### Run the GGUF Model
+### Model Verification
+
+<NewCodeBlock tip="Device" type="device">
 
 ```bash
 cd build/bin
 ./llama-cli -m DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf
 ```
 
-```bash
+</NewCodeBlock>
+
+Sample model output:
+
+```txt
 > hi, who are you
 <think>
 
@@ -135,13 +172,17 @@ cd build/bin
 Hi! I'm DeepSeek-R1, an artificial intelligence assistant created by DeepSeek. I'm at your service and would be delighted to assist you with any inquiries or tasks you may have.
 ```
 
-### GGUF Benchmark
+### Model Testing
+
+<NewCodeBlock tip="Device" type="device">
 
 ```bash
 ./llama-bench -m DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf
 ```
 
-```bash
+</NewCodeBlock>
+
+```txt
 radxa@orion-o6:~/llama.cpp/build/bin$ ./llama-bench -m ~/DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf -t 8
 | model                          |       size |     params | backend    | threads |          test |                  t/s |
 | ------------------------------ | ---------: | ---------: | ---------- | ------: | ------------: | -------------------: |

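The KleidiAI build options in the llama.cpp guide above target an Armv9 CPU with I8MM and dot-product support. As a rough sketch that is not part of the Radxa documentation, the script below inspects the `Features` line that arm64 Linux exposes in `/proc/cpuinfo` and prints either the default configure command or the Armv9 + KleidiAI one quoted in the guide. The helper names `has_cpu_feature` and `kleidiai_cmake_flags` are hypothetical, and treating `sve2`, `i8mm`, and `asimddp` together as the signal for an Armv9 target is an assumption of this sketch.

```shell
#!/usr/bin/env bash
# Sketch: pick llama.cpp CMake flags from the CPU feature list.
# has_cpu_feature and kleidiai_cmake_flags are hypothetical helpers.

# Succeed if feature $1 appears as a whole word in the space-separated list $2.
has_cpu_feature() {
    local feature="$1" list="$2"
    case " $list " in
        *" $feature "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the Armv9 + KleidiAI flags from the guide when sve2, i8mm and
# asimddp (the Linux name for dot product) are all present.
# Print nothing otherwise, which means "use the default build".
kleidiai_cmake_flags() {
    local list="$1"
    if has_cpu_feature sve2 "$list" &&
       has_cpu_feature i8mm "$list" &&
       has_cpu_feature asimddp "$list"; then
        echo "-DGGML_NATIVE=OFF -DGGML_CPU_ARM_ARCH=armv9-a+i8mm+dotprod -DGGML_CPU_KLEIDIAI=ON"
    fi
}

# Read the first "Features" line; it is absent on non-arm64 systems,
# in which case the list stays empty and the default build is printed.
features="$(grep -m1 '^Features' /proc/cpuinfo 2>/dev/null | cut -d: -f2)"
echo "cmake -B build $(kleidiai_cmake_flags "$features")"
```

The script only prints the configure command, so the result can be reviewed before running it by hand.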
@@ -1,5 +1,5 @@
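-Ollama is a tool for running and managing large language models (LLMs) locally.
-It lets you easily pull, run, and manage AI models such as LLaMA, Mistral, and Gemma on a local device, with no complex environment setup.
+Ollama is an efficient tool for managing and running large language models (LLMs) locally.
+It greatly simplifies AI model deployment: without complex environment setup, users can pull, run, and manage models on a local device with single commands.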
 
 ## Installing Ollama
 
@@ -13,15 +13,15 @@ curl -fsSL https://ollama.com/install.sh | sh
 
 ### Pull a Model
 
-This command downloads the model files over the internet
+This command downloads the model files over the internet.
 
 ```bash
 ollama pull deepseek-r1:1.5b
 ```
 
 ### Run a Model
 
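-This command runs the model directly; if the model is not cached locally, it automatically downloads the model files over the internet and then runs the model
+This command starts the model directly; if there is no local cache, it automatically downloads the model over the network and then runs it.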
 
 ```bash
 ollama run deepseek-r1:1.5b

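The pull and run steps in the Ollama guide above can be combined so that a model is downloaded only when it is not cached yet. The following is a minimal sketch, assuming that `ollama list` prints a header row followed by one row per model with the model name in the first column; `model_in_listing` is a hypothetical helper introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: pull a model only if `ollama list` does not already show it.
# model_in_listing is a hypothetical helper, not an Ollama command.

# Read an `ollama list`-style table on stdin, skip the header row, and
# succeed if the first column of any row equals the model name in $1.
model_in_listing() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

model="deepseek-r1:1.5b"

# Do nothing when the ollama CLI is not installed.
if command -v ollama >/dev/null 2>&1; then
    if ollama list | model_in_listing "$model"; then
        echo "$model is already cached"
    else
        ollama pull "$model"
    fi
fi
```

Because `ollama run` already downloads a missing model on its own, a check like this is mainly useful in setup scripts that should prefetch models without starting an interactive session.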
@@ -0,0 +1,5 @@
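+The NPU driver exposes its API to upper layers in two parts: C++ and Python.
+
+The detailed API manual can be downloaded from the [CIX Developer Center](https://developer.cixtech.com/).
+
+Scroll down to the documentation resources and click the download link in the AI row.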
@@ -0,0 +1,4 @@
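+BEV_RoadSeg is a dedicated system for drivable-space perception in autonomous driving. It combines bird's-eye-view (BEV) transformation with a Transformer architecture and uses the LSTR deep learning model to segment road structure, providing stable and reliable lane and drivable-area recognition in complex, dynamic driving environments.
+
+- Core function: from multi-camera surround-view input, it generates high-precision drivable-area and lane-line segmentation maps in bird's-eye view, which serve as key perception input for path planning.
+- Technical features: built around the LSTR model, it uses the Transformer's ability to model long-range spatial relationships to handle challenging scenes such as curves, junctions, and partial occlusion.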