[Feat] Adding minimal training for multimodal model #136

kcz358 · 2025-01-31T06:03:18Z

This PR integrates Qwen2-VL training into open-r1. Because of how the current GRPO trainer is implemented, I had to hack the model loading and processing logic inside the trainer and create a new trainer class.

The original training code is at:
https://github.com/EvolvingLMMs-Lab/open-r1-multimodal

The logs, checkpoints, and dataset used are available as follows:

Logs: Wandb Logs
Models: 🤗 Models
Datasets: 🤗 Datasets

Training can be launched with torchrun:

torchrun --nproc_per_node="8" \
    --nnodes="1" \
    --node_rank="0" \
    --master_addr="127.0.0.1" \
    --master_port="12345" \
    src/open_r1/grpo.py \
    --deepspeed scripts/zero3.json \
    --output_dir checkpoints/Qwen2-VL-2B-GRPO-8k \
    --model_name_or_path Qwen/Qwen2-VL-2B-Instruct \
    --dataset_name lmms-lab/multimodal-open-r1-8k-verified \
    --max_prompt_length 8192 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --logging_steps 1 \
    --bf16 \
    --report_to wandb \
    --gradient_checkpointing true \
    --attn_implementation flash_attention_2 \
    --max_pixels 2359296 \
    --save_total_limit 8 \
    --num_train_epochs 1 \
    --run_name Qwen2-VL-2B-GRPO-8k
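
For a rough sense of what the flags above imply, here is a small sketch of the arithmetic. It assumes one visual token per 28x28 pixel area (14-pixel patches merged 2x2), which is how Qwen2-VL's pixel budgets are usually described, and it assumes plain data parallelism when counting prompts per optimizer step. The helper names are hypothetical and not part of this PR; how the trainer counts GRPO completions per prompt is not covered here.

```python
# Back-of-the-envelope numbers for the torchrun command above.
# Assumption: one visual token covers a 28x28 pixel area in Qwen2-VL.
# Both helper functions are illustrative and do not exist in open-r1.

PIXELS_PER_VISUAL_TOKEN = 28 * 28


def max_visual_tokens(max_pixels: int) -> int:
    """Upper bound on visual tokens for one image under a pixel budget."""
    return max_pixels // PIXELS_PER_VISUAL_TOKEN


def prompts_per_optimizer_step(num_processes: int,
                               per_device_batch_size: int,
                               grad_accum_steps: int) -> int:
    """Prompts consumed per optimizer step, assuming plain data parallelism."""
    return num_processes * per_device_batch_size * grad_accum_steps


if __name__ == "__main__":
    # --max_pixels 2359296 (that is, 1536 * 1536)
    image_tokens = max_visual_tokens(2359296)
    print("max visual tokens per image:", image_tokens)

    # --max_prompt_length 8192: tokens left for text with one full-size image
    print("prompt tokens left for text:", 8192 - image_tokens)

    # --nproc_per_node 8, --per_device_train_batch_size 1,
    # --gradient_accumulation_steps 1
    print("prompts per optimizer step:", prompts_per_optimizer_step(8, 1, 1))
```

Under these assumptions a single full-size image fits well inside the 8192-token prompt limit, with room left for the question text.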

Luodian · 2025-02-02T12:16:30Z

@qgallouedec @lewtun

We truly appreciate this project. Thanks for your efforts!

We are wondering if you could consider adding multimodal model training to the project. It would be exciting to see multimodal reasoning research derived from this project. We’d love to hear your thoughts on this!

Hynek Kydlicek and others added 11 commits January 31, 2025 04:14

bump up deps, fix aime24 evals, make grpo more strict

02b9c0e

minor fixes

fe3948a

🤨 fmt

462bd62

Allow to train Qwen2 VL on text

6a6a0cb

Load Qwen2 VL with processor

8418dd2

Support image training

a3e5b3a

Pass in max pixels and min pixels

74b7106

Rename GRPO Trainer

b011ae2

Add Aria

507c6f9

Compatible to main

8ffc9bd

update

6235cb3
