Paper: 📖 Seg-Zero
HuggingFace Daily: 🤗 Seg-Zero
Data: 🤗 RefCOCOg-2K
Model: 🤗 Seg-Zero-7B
Overview of Seg-Zero:
Seg-Zero demonstrates the following features:
- Seg-Zero exhibits emergent test-time reasoning ability. It generates a reasoning chain before producing the final segmentation mask.
- Seg-Zero is trained exclusively using reinforcement learning, without any explicit supervised reasoning data.
- Compared to supervised fine-tuning, Seg-Zero achieves superior performance on both in-domain and out-of-domain data.
Highlight Code Features:
- This code is built on EasyR1 and veRL, which support model splitting during sampling and are more GPU-memory friendly.
- Supports both the Qwen2-VL and Qwen2.5-VL series of models.
- Implements commonly used rewards for object detection and object segmentation, including the IoU reward and the L1 reward.
[March 11th, 2025] 🔥 Paper is coming!
[March 8th, 2025] 🔥 Seg-Zero is coming! We have released the code and training data.
Seg-Zero employs a decoupled architecture consisting of a reasoning model and a segmentation model. We manually design a sophisticated reward mechanism that integrates both format and accuracy rewards.
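To make the reward design concrete, here is a minimal sketch of how a format reward and the accuracy rewards (IoU and L1) could be combined. The `<think>`/`<answer>` template and all function names here are illustrative assumptions, not the repo's actual implementation:

```python
# Illustrative sketch of a format + accuracy reward (hypothetical helpers,
# not the Seg-Zero codebase).
import re

def format_reward(response: str) -> float:
    # 1.0 if the response follows an assumed <think>...</think><answer>...</answer> template.
    pattern = r"<think>.*?</think>\s*<answer>.*?</answer>"
    return 1.0 if re.fullmatch(pattern, response.strip(), re.DOTALL) else 0.0

def iou_reward(pred_box, gt_box) -> float:
    # Intersection-over-union of two [x1, y1, x2, y2] boxes.
    x1 = max(pred_box[0], gt_box[0])
    y1 = max(pred_box[1], gt_box[1])
    x2 = min(pred_box[2], gt_box[2])
    y2 = min(pred_box[3], gt_box[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_p = (pred_box[2] - pred_box[0]) * (pred_box[3] - pred_box[1])
    area_g = (gt_box[2] - gt_box[0]) * (gt_box[3] - gt_box[1])
    union = area_p + area_g - inter
    return inter / union if union > 0 else 0.0

def l1_reward(pred_point, gt_point, threshold: float = 10.0) -> float:
    # 1.0 when the predicted point lies within an L1 distance threshold of the ground truth.
    dist = abs(pred_point[0] - gt_point[0]) + abs(pred_point[1] - gt_point[1])
    return 1.0 if dist < threshold else 0.0
```

In practice the individual terms would be weighted and summed into a single scalar reward per sampled response.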
git clone https://github.com/dvlab-research/Seg-Zero.git
cd Seg-Zero
conda create -n seg_zero python=3.11
conda activate seg_zero
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1
pip install -e .
pip install sam2
pip install matplotlib
python inference_scripts/infer.py
The default question is
"the unusal object in the image."
You will get the thinking process in the command line, like:
"The image shows a bicycle with wheels that have been replaced with large, round objects resembling watermelon slices. The unusual aspect of the image is the substitution of the bicycle wheels with these watermelon-like objects, which is not a typical feature of a bicycle. The rest of the bicycle appears to be a standard design, but the wheels are the focal point of the image."
And the mask will be saved in the inference_scripts folder.
You can also provide your own image_path and text by:
python inference_scripts/infer.py --image_path "your_image_path" --text "your question text"
bash training_scripts/run_qwen2_5_3b_refCOCOg.sh
You can try changing the following hyper-parameters if you have large GPU memory:
worker.actor.micro_batch_size_per_device_for_update=4 or 8 or 16 \
worker.actor.micro_batch_size_per_device_for_experience=4 or 8 or 16 \
If your GPU has less memory, you can change the following config values. The appropriate numbers depend on your GPU memory:
worker.rollout.tensor_parallel_size=[your number between 1-8]
worker.rollout.gpu_memory_utilization=[your number between 0-1]
worker.rollout.n=[your number between 4-32]
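For example, on a smaller GPU you might launch training with overrides like the following. The specific values are illustrative, and this assumes the launch script forwards extra key=value arguments to the trainer, as EasyR1-style scripts typically do:

```shell
# Illustrative low-memory launch (tune values for your hardware).
bash training_scripts/run_qwen2_5_3b_refCOCOg.sh \
    worker.rollout.tensor_parallel_size=2 \
    worker.rollout.gpu_memory_utilization=0.5 \
    worker.rollout.n=8
```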
python3 training_scripts/model_merger.py --local_dir [path_to_your_actor_checkpoint]
Tip
If you encounter issues connecting to Hugging Face, consider using `export HF_ENDPOINT=https://hf-mirror.com`.
Seg-Zero generates several samples, calculates their rewards, and then optimizes towards the samples that achieve higher rewards.
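The sample-then-reward loop above is the core of GRPO: each sample's advantage is its reward normalized against the mean and standard deviation of its group. A minimal sketch of that normalization (illustrative, not the repo's implementation):

```python
# Sketch of GRPO-style group-relative advantages (illustrative only).
from statistics import mean, stdev

def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    # Normalize each sampled response's reward by the group mean and std,
    # so above-average samples get positive advantages and are reinforced.
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]
```

These advantages then weight the policy-gradient update in place of a learned value function.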
Tip
To learn more about the GRPO algorithm, you can refer to Hugging Face's blog.
@article{liu2025segzero,
title = {Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement},
author = {Liu, Yuqi and Peng, Bohao and Zhong, Zhisheng and Yue, Zihao and Lu, Fanbin and Yu, Bei and Jia, Jiaya},
journal = {arXiv preprint arXiv:2503.06520},
year = {2025}
}
We would like to thank the following repos for their great work:
- This work is built upon EasyR1 and veRL.
- This work utilizes models from Qwen2-VL, Qwen2.5-VL and SAM2.