6 changes: 6 additions & 0 deletions docs/flagrelease_en/model_list.txt
@@ -27,6 +27,12 @@ FlagRelease/MiniCPM_o_2.6-FlagOS-Cambricon
FlagRelease/MiniCPM_o_2.6-FlagOS-NVIDIA
FlagRelease/MiniMax-M1-80k-FlagOS
FlagRelease/MiniMax-M2-FlagOS
FlagRelease/MiniMax-M2.7-ascend-FlagOS
FlagRelease/MiniMax-M2.7-hygon-FlagOS
FlagRelease/MiniMax-M2.7-iluvatar-FlagOS
FlagRelease/MiniMax-M2.7-metax-FlagOS
FlagRelease/MiniMax-M2.7-nvidia-FlagOS
FlagRelease/MiniMax-M2.7-zhenwu-FlagOS
FlagRelease/QwQ-32B-FlagOS-Cambricon
FlagRelease/QwQ-32B-FlagOS-Iluvatar
FlagRelease/QwQ-32B-FlagOS-Nvidia
@@ -26,7 +26,7 @@ Environment Setup

### Download FlagOS Image
```bash
docker pull harbor.baai.ac.cn/flagrelease-public/flagrelease-nvidia-release-model_glm-5-fp8-tree_0.4.1_3.5-gems_4.2.1rc0-scale_none-cx_none-python_3.12.3-torch_2.9.0-pcp_cuda13.1-gpu_nvidia003-arc_amd64-driver_570.158.01:202604021459
docker pull harbor.baai.ac.cn/flagrelease-public/flagrelease-nvidia-release-model_glm-5-fp8-tree_0.4.1_3.5-gems_4.2.1rc0-scale_none-cx_none-python_3.12.3-torch_2.9.0-pcp_cuda13.1-gpu_nvidia003-arc_amd64-driver_570.158.01:20260407170936
```

### Download Open-source Model Weights
@@ -37,7 +37,7 @@ modelscope download --model FlagRelease/GLM-5-FP8-FlagOS --local_dir /data/GLM-5

### Start the Container
```bash
docker run --rm --init --detach --net=host --uts=host --ipc=host --security-opt=seccomp=unconfined --privileged=true --ulimit stack=67108864 --ulimit memlock=-1 --ulimit nofile=1048576:1048576 --shm-size=32G -v /data:/data --gpus all --name flagos harbor.baai.ac.cn/flagrelease-public/flagrelease-nvidia-release-model_glm-5-fp8-tree_0.4.1_3.5-gems_4.2.1rc0-scale_none-cx_none-python_3.12.3-torch_2.9.0-pcp_cuda13.1-gpu_nvidia003-arc_amd64-driver_570.158.01:202604021459 sleep infinity
docker run --rm --init --detach --net=host --uts=host --ipc=host --security-opt=seccomp=unconfined --privileged=true --ulimit stack=67108864 --ulimit memlock=-1 --ulimit nofile=1048576:1048576 --shm-size=32G -v /data:/data --gpus all --name flagos harbor.baai.ac.cn/flagrelease-public/flagrelease-nvidia-release-model_glm-5-fp8-tree_0.4.1_3.5-gems_4.2.1rc0-scale_none-cx_none-python_3.12.3-torch_2.9.0-pcp_cuda13.1-gpu_nvidia003-arc_amd64-driver_570.158.01:20260407170936 sleep infinity
docker exec -it flagos /bin/bash
```
### Start the Server
@@ -116,4 +116,4 @@ We warmly welcome global developers to join us:
3. Improve technical documentation
4. Expand hardware adaptation support
# License
The weights of this model come from ZhipuAI/GLM-5-FP8 and are open-sourced under the Apache 2.0 license: https://www.apache.org/licenses/LICENSE-2.0.txt
@@ -0,0 +1,139 @@
---
base_model:
- ""
---
# Introduction
MiniMax M2.7 is the latest-generation model in the M2 series, as well as the first model in the series to deeply participate in its own iteration. It can autonomously build complex Agent Harnesses and Skills, update its own Memory, and drive self-iteration through reinforcement learning, forming a closed loop of "model-driven model evolution".
In terms of capabilities, M2.7 covers the entire software engineering workflow from code generation and log troubleshooting to end-to-end project delivery, achieving a score of 56.22% on the SWE-Pro benchmark, on par with GPT-5.3-Codex. It also delivers strong performance in professional office scenarios, ranking behind only Opus 4.6, Sonnet 4.6, and GPT-5.4 on the GDPval-AA metric, while maintaining a 97% instruction-following rate across 40 complex Skills scenarios involving more than 2,000 tokens.
### Integrated Deployment
- Out-of-the-box inference scripts with pre-configured hardware and software parameters
- Released **FlagOS-Ascend** container image supporting deployment within minutes
### Consistency Validation
- Rigorously evaluated through benchmark testing: Performance and results from the FlagOS software stack are compared against native stacks on multiple public benchmarks.

# Evaluation Results
## Benchmark Result
|Metrics|MiniMax-M2.7-Nvidia-Origin|MiniMax-M2.7-Ascend-FlagOS|
|-------|---------------|---------------|
|GPQA_Diamond | 0.7071| 0.6263 |
|Aime24 | 0.9 | 0.8333 |

# User Guide
Environment Setup

| Item | Version |
|------------------|----------------------|
| Docker Version | 20.10.8, build 3967b7d |
| Operating System | Linux 5.10.0-216.0.0.115.oe2203sp4.aarch64 |

## Operation Steps

### Download FlagOS Image
```bash
docker pull harbor.baai.ac.cn/flagrelease-public/flagrelease-ascend-minimax
```

### Download Open-source Model Weights
```bash
pip install modelscope
modelscope download --model FlagRelease/MiniMax-M2.7-ascend-FlagOS --local_dir /public-nfs/MiniMax-M2.7
```
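Interrupted downloads are a common failure mode with checkpoints this large, so it is worth sanity-checking the weight directory before starting the container. A minimal sketch (the helper name is ours; file names follow the usual Hugging Face/ModelScope layout of a `config.json` plus `*.safetensors` or `*.bin` shards):

```python
import pathlib

def weights_look_complete(local_dir: str) -> bool:
    """Rough completeness check for a downloaded model weight directory."""
    d = pathlib.Path(local_dir)
    has_config = (d / "config.json").is_file()
    has_weights = any(d.glob("*.safetensors")) or any(d.glob("*.bin"))
    return has_config and has_weights

if __name__ == "__main__":
    print(weights_look_complete("/public-nfs/MiniMax-M2.7"))
```

If this prints `False`, re-run the `modelscope download` command above before proceeding.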

### Start the Container
```bash
docker run --rm --name flagos \
  --net=host \
  --shm-size=1g \
  --device /dev/davinci0 --device /dev/davinci1 --device /dev/davinci2 --device /dev/davinci3 \
  --device /dev/davinci4 --device /dev/davinci5 --device /dev/davinci6 --device /dev/davinci7 \
  --device /dev/davinci8 --device /dev/davinci9 --device /dev/davinci10 --device /dev/davinci11 \
  --device /dev/davinci12 --device /dev/davinci13 --device /dev/davinci14 --device /dev/davinci15 \
  --device /dev/davinci_manager --device /dev/devmm_svm --device /dev/hisi_hdc \
  -v /usr/local/dcmi:/usr/local/dcmi \
  -v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
  -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
  -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
  -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
  -v /etc/ascend_install.info:/etc/ascend_install.info \
  -v /mnt/sfs_turbo/.cache:/home/cache \
  -v /public-nfs:/public-nfs \
  -d harbor.baai.ac.cn/flagrelease-public/flagrelease-ascend-minimax sleep infinity
docker exec -it flagos bash
```
### Start the Server
```bash
export HCCL_OP_EXPANSION_MODE="AIV"
export HCCL_BUFFSIZE=1024
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
export OMP_NUM_THREADS=1
echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
sysctl -w vm.swappiness=0
sysctl -w kernel.numa_balancing=0
sysctl kernel.sched_migration_cost_ns=50000
export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libjemalloc.so.2:$LD_PRELOAD
export TASK_QUEUE_ENABLE=1

export VLLM_ASCEND_ENABLE_FUSED_MC2=1
export VLLM_ASCEND_ENABLE_FLASHCOMM1=1
export VLLM_ASCEND_BALANCE_SCHEDULING=1

vllm serve /public-nfs/MiniMax-M2.7 \
--served-model-name "MiniMax-M2.7" \
--host 0.0.0.0 \
--port 8000 \
--trust-remote-code \
--async-scheduling \
--compilation-config '{"cudagraph_mode": "FULL_DECODE_ONLY"}' \
--additional-config '{"enable_cpu_binding":true}' \
--enable-expert-parallel \
--tensor-parallel-size 8 \
--data-parallel-size 2 \
--max-num-seqs 24 \
--max-model-len 40690 \
--max-num-batched-tokens 16384 \
--gpu-memory-utilization 0.85
```
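The parallelism flags must agree with the hardware mapped into the container: tensor parallel 8 × data parallel 2 = 16 ranks, one per `davinci` device passed via `--device`. A quick arithmetic check (ours, not part of the release scripts):

```python
# Sanity-check the parallel layout implied by the serve flags above.
tensor_parallel_size = 8   # --tensor-parallel-size
data_parallel_size = 2     # --data-parallel-size
visible_devices = 16       # /dev/davinci0 .. /dev/davinci15 mapped into the container

world_size = tensor_parallel_size * data_parallel_size
assert world_size == visible_devices, "serve flags must match the devices mapped into the container"
print(world_size)  # 16
```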

## Service Invocation
### Invocation Script
```bash
curl http://0.0.0.0:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "MiniMax-M2.7",
"messages": [{"role": "user", "content": "你好"}]
}'
```
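The curl call above targets vLLM's OpenAI-compatible `/v1/chat/completions` route, so the same request can be issued from any OpenAI-style client. A minimal sketch using only the Python standard library (endpoint and model name taken from the serve command above; the commented request only succeeds once the server is running):

```python
import json
import urllib.request

# Same request body as the curl invocation above.
payload = {
    "model": "MiniMax-M2.7",  # must match --served-model-name from the serve step
    "messages": [{"role": "user", "content": "Hello"}],
}
req = urllib.request.Request(
    "http://0.0.0.0:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Requires the server from the previous step to be up:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```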


### AnythingLLM Integration Guide

#### 1. Download & Install

- Visit the official site: https://anythingllm.com/
- Choose the appropriate version for your OS (Windows/macOS/Linux)
- Follow the installation wizard to complete the setup

#### 2. Configuration

- Launch AnythingLLM
- Open settings (bottom left, fourth tab)
- Configure core LLM parameters
- Click "Save Settings" to apply changes

#### 3. Model Interaction

- After model loading is complete:
- Click **"New Conversation"**
- Enter your question (e.g., “Explain the basics of quantum computing”)
- Click the send button to get a response
# Technical Overview
**FlagOS** is a fully open-source system software stack designed to unify the "model–system–chip" layers and foster an open, collaborative ecosystem. It enables a "develop once, run anywhere" workflow across diverse AI accelerators, unlocking hardware performance, eliminating fragmentation among vendor-specific software stacks, and substantially lowering the cost of porting and maintaining AI workloads. With core technologies such as the **FlagScale** distributed training/inference framework (together with vllm-plugin-fl), the **FlagGems** universal operator library, the **FlagCX** communication library, and the **FlagTree** unified compiler, the **FlagRelease** platform leverages the **FlagOS** stack to automatically produce and release various combinations of \<chip + open-source model\>. This enables efficient and automated model migration across diverse chips, opening a new chapter for large model deployment and application.
## FlagGems
FlagGems is a high-performance, generic operator library implemented in the [Triton](https://github.com/openai/triton) language. It is built on a collection of backend-neutral kernels that aim to accelerate LLM (Large Language Model) training and inference across diverse hardware platforms.
## FlagTree
FlagTree is an open-source, unified compiler project for multiple AI chips, dedicated to developing a diverse ecosystem of AI chip compilers and related tooling platforms, thereby fostering and strengthening the upstream and downstream Triton ecosystem. Currently in its initial phase, the project aims to maintain compatibility with existing adaptation solutions while unifying the codebase to rapidly implement single-repository multi-backend support. For upstream model users, it provides unified compilation capabilities across multiple backends; for downstream chip manufacturers, it offers examples of Triton ecosystem integration.
## FlagScale and vllm-plugin-fl
FlagScale is a comprehensive toolkit designed to support the entire lifecycle of large models. It builds on the strengths of several prominent open-source projects, including [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) and [vLLM](https://github.com/vllm-project/vllm), to provide a robust, end-to-end solution for managing and scaling large models.
vllm-plugin-fl is a vLLM plugin built on the FlagOS unified multi-chip backend that helps FlagScale support multiple chips within the vLLM framework.
## **FlagCX**
FlagCX is a scalable and adaptive cross-chip communication library. It serves as a platform where developers, researchers, and AI engineers can collaborate on various projects, contribute to the development of cutting-edge AI solutions, and share their work with the global community.

## **FlagEval Evaluation Framework**
FlagEval is a comprehensive evaluation system and open platform for large models launched in 2023. It aims to establish scientific, fair, and open benchmarks, methodologies, and tools to help researchers assess model and training algorithm performance. It features:
- **Multi-dimensional Evaluation**: Supports 800+ model evaluations across NLP, CV, Audio, and Multimodal fields, covering 20+ downstream tasks including language understanding and image-text generation.
- **Industry-Grade Use Cases**: Has completed horizontal evaluations of mainstream large models, providing authoritative benchmarks for chip-model performance validation.
# Contributing

We warmly welcome global developers to join us:

1. Submit Issues to report problems
2. Create Pull Requests to contribute code
3. Improve technical documentation
4. Expand hardware adaptation support
# License
The weights of this model come from MiniMaxAI/MiniMax-M2.7 and are open-sourced under the Apache 2.0 license: https://www.apache.org/licenses/LICENSE-2.0.txt
@@ -0,0 +1,126 @@
---
base_model:
- ""
---
# Introduction
MiniMax M2.7 is the latest-generation model in the M2 series, as well as the first model in the series to deeply participate in its own iteration. It can autonomously build complex Agent Harnesses and Skills, update its own Memory, and drive self-iteration through reinforcement learning, forming a closed loop of "model-driven model evolution".
In terms of capabilities, M2.7 covers the entire software engineering workflow from code generation and log troubleshooting to end-to-end project delivery, achieving a score of 56.22% on the SWE-Pro benchmark, on par with GPT-5.3-Codex. It also delivers strong performance in professional office scenarios, ranking behind only Opus 4.6, Sonnet 4.6, and GPT-5.4 on the GDPval-AA metric, while maintaining a 97% instruction-following rate across 40 complex Skills scenarios involving more than 2,000 tokens.
### Integrated Deployment
- Out-of-the-box inference scripts with pre-configured hardware and software parameters
- Released **FlagOS-Hygon** container image supporting deployment within minutes
### Consistency Validation
- Rigorously evaluated through benchmark testing: Performance and results from the FlagOS software stack are compared against native stacks on multiple public benchmarks.

# Evaluation Results
## Benchmark Result
|Metrics|MiniMax-M2.7-Nvidia-Origin|MiniMax-M2.7-Hygon-FlagOS|
|-------|---------------|---------------|
|GPQA_Diamond |0.7071 |0.5758|
|Aime24 | 0.9 | 0.9667|

# User Guide
Environment Setup

| Item | Version |
|------------------|----------------------|
| Docker Version | 20.10.24, build 297e128 |
| Operating System | Sugon OS 8.9 |

## Operation Steps

### Download FlagOS Image
```bash
docker pull harbor.baai.ac.cn/flagrelease-public/flagrelease-hygon-minimax:202604120035
```

### Download Open-source Model Weights
```bash
pip install modelscope
modelscope download --model FlagRelease/MiniMax-M2.7-hygon-FlagOS --local_dir /data/MiniMax-M2.7
```

### Start the Container
```bash
docker run \
--name flagos \
--network=host \
--ipc=host \
--device=/dev/kfd \
--device=/dev/mkfd \
--device=/dev/dri \
-v /opt/hyhal:/opt/hyhal \
-v /root/perfxlab:/workspace \
-v /data:/data \
--group-add video \
--cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined \
-itd \
harbor.baai.ac.cn/flagrelease-public/flagrelease-hygon-minimax:202604120035

docker exec -it flagos /bin/bash
```
### Start the Server
```bash
USE_FLAGGEMS=1 vllm serve /data/MiniMax-M2.7 --tensor-parallel-size 8 --served-model-name minimax-m2.7 --trust-remote-code
```
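Loading the model across 8 DCUs can take several minutes, so automation that follows the serve step usually waits for the OpenAI-compatible API to come up before sending requests. A minimal polling sketch (the helper is ours, not part of the image; `/v1/models` is vLLM's standard model-listing route):

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(url: str = "http://localhost:8000/v1/models",
                     timeout_s: float = 600.0, poll_s: float = 5.0) -> bool:
    """Poll the OpenAI-compatible /v1/models route until the server answers."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=poll_s) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            time.sleep(poll_s)
    return False
```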

## Service Invocation
### Invocation Script
```bash
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "minimax-m2.7",
"messages": [{"role": "user", "content": "你好"}]
}'

```
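The JSON that comes back follows the OpenAI chat-completions schema, so the reply text sits at `choices[0].message.content`. A small extraction helper (ours; the sample below illustrates the response shape and is not real model output):

```python
import json

def extract_reply(response_json: str) -> str:
    """Pull the assistant text out of an OpenAI-style chat-completions response."""
    body = json.loads(response_json)
    return body["choices"][0]["message"]["content"]

# Schema illustration only (not real model output):
sample = json.dumps({
    "model": "minimax-m2.7",
    "choices": [{"index": 0,
                 "message": {"role": "assistant", "content": "Hello!"},
                 "finish_reason": "stop"}],
})
print(extract_reply(sample))  # Hello!
```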


### AnythingLLM Integration Guide

#### 1. Download & Install

- Visit the official site: https://anythingllm.com/
- Choose the appropriate version for your OS (Windows/macOS/Linux)
- Follow the installation wizard to complete the setup

#### 2. Configuration

- Launch AnythingLLM
- Open settings (bottom left, fourth tab)
- Configure core LLM parameters
- Click "Save Settings" to apply changes

#### 3. Model Interaction

- After model loading is complete:
- Click **"New Conversation"**
- Enter your question (e.g., “Explain the basics of quantum computing”)
- Click the send button to get a response
# Technical Overview
**FlagOS** is a fully open-source system software stack designed to unify the "model–system–chip" layers and foster an open, collaborative ecosystem. It enables a "develop once, run anywhere" workflow across diverse AI accelerators, unlocking hardware performance, eliminating fragmentation among vendor-specific software stacks, and substantially lowering the cost of porting and maintaining AI workloads. With core technologies such as the **FlagScale** distributed training/inference framework (together with vllm-plugin-fl), the **FlagGems** universal operator library, the **FlagCX** communication library, and the **FlagTree** unified compiler, the **FlagRelease** platform leverages the **FlagOS** stack to automatically produce and release various combinations of \<chip + open-source model\>. This enables efficient and automated model migration across diverse chips, opening a new chapter for large model deployment and application.
## FlagGems
FlagGems is a high-performance, generic operator library implemented in the [Triton](https://github.com/openai/triton) language. It is built on a collection of backend-neutral kernels that aim to accelerate LLM (Large Language Model) training and inference across diverse hardware platforms.
## FlagTree
FlagTree is an open-source, unified compiler project for multiple AI chips, dedicated to developing a diverse ecosystem of AI chip compilers and related tooling platforms, thereby fostering and strengthening the upstream and downstream Triton ecosystem. Currently in its initial phase, the project aims to maintain compatibility with existing adaptation solutions while unifying the codebase to rapidly implement single-repository multi-backend support. For upstream model users, it provides unified compilation capabilities across multiple backends; for downstream chip manufacturers, it offers examples of Triton ecosystem integration.
## FlagScale and vllm-plugin-fl
FlagScale is a comprehensive toolkit designed to support the entire lifecycle of large models. It builds on the strengths of several prominent open-source projects, including [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) and [vLLM](https://github.com/vllm-project/vllm), to provide a robust, end-to-end solution for managing and scaling large models.
vllm-plugin-fl is a vLLM plugin built on the FlagOS unified multi-chip backend that helps FlagScale support multiple chips within the vLLM framework.
## **FlagCX**
FlagCX is a scalable and adaptive cross-chip communication library. It serves as a platform where developers, researchers, and AI engineers can collaborate on various projects, contribute to the development of cutting-edge AI solutions, and share their work with the global community.

## **FlagEval Evaluation Framework**
FlagEval is a comprehensive evaluation system and open platform for large models launched in 2023. It aims to establish scientific, fair, and open benchmarks, methodologies, and tools to help researchers assess model and training algorithm performance. It features:
- **Multi-dimensional Evaluation**: Supports 800+ model evaluations across NLP, CV, Audio, and Multimodal fields, covering 20+ downstream tasks including language understanding and image-text generation.
- **Industry-Grade Use Cases**: Has completed horizontal evaluations of mainstream large models, providing authoritative benchmarks for chip-model performance validation.
# Contributing

We warmly welcome global developers to join us:

1. Submit Issues to report problems
2. Create Pull Requests to contribute code
3. Improve technical documentation
4. Expand hardware adaptation support
# License
The weights of this model come from MiniMaxAI/MiniMax-M2.7 and are open-sourced under the Apache 2.0 license: https://www.apache.org/licenses/LICENSE-2.0.txt