@kafkayu kafkayu commented Jan 12, 2026

Goal

VLM multi-turn interaction
(Related to #1075)

TODO / Status

Dataset

Reward

  • Add click-based reward in slime/rollout/rm_hub/opencua_click.py
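A click-based reward of this kind is often a point-in-box check. The following is a minimal sketch under that assumption, not the code in `opencua_click.py`; the function name `click_reward`, the `(x_min, y_min, x_max, y_max)` box layout, and the binary 0/1 scoring are all hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
BBox = Tuple[float, float, float, float]


def click_reward(click: Optional[Tuple[float, float]], target: BBox) -> float:
    """Return 1.0 if the predicted click lands inside the target box, else 0.0.

    `click` is None when the model output could not be parsed into a coordinate;
    this sketch treats that case as a miss.
    """
    if click is None:
        return 0.0
    x, y = click
    x_min, y_min, x_max, y_max = target
    inside = x_min <= x <= x_max and y_min <= y <= y_max
    return 1.0 if inside else 0.0
```

A binary reward keeps the signal easy to interpret; a distance-shaped variant would be another option if sparse rewards slow down learning.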

Interactive Environment

  • Add an offline opencua-click environment that returns click feedback similar to what an on-screen interaction would give
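To illustrate what an offline click environment could look like, here is a minimal sketch. It assumes the environment holds a ground-truth target box and answers each click with a text observation; the class name `OfflineClickEnv`, the `step` signature, the feedback strings, and the `max_turns` limit are hypothetical and not taken from this PR.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class OfflineClickEnv:
    """Hypothetical offline environment: no live screen, only a stored target box."""

    target: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    max_turns: int = 3
    turn: int = 0

    def step(self, click: Tuple[float, float]) -> Tuple[str, bool, bool]:
        """Apply one click and return (feedback_text, hit, done)."""
        self.turn += 1
        x, y = click
        x_min, y_min, x_max, y_max = self.target
        hit = x_min <= x <= x_max and y_min <= y <= y_max
        # The episode ends on a hit or when the turn budget is used up.
        done = hit or self.turn >= self.max_turns
        if hit:
            feedback = "Click registered on the target element."
        else:
            feedback = f"Click at ({x}, {y}) did not hit the target; try again."
        return feedback, hit, done
```

In a multi-turn rollout, the feedback string would be appended to the conversation as the next observation so the model can revise its click.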

Results

Trained Qwen3-VL-2B-Instruct on the opencua click dataset with multi-turn reasoning, using GRPO.
[Figure: raw_reward]
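For readers unfamiliar with GRPO, its core step normalizes each sample's reward against the other samples drawn for the same prompt. The sketch below shows that group-relative advantage in its commonly described form; the function name `group_relative_advantages` and the `eps` value are assumptions, and this is not the training code used for the run above.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of rollouts for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps). The eps term
    (an assumed value) avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation over the group
    return [(r - mu) / (sigma + eps) for r in rewards]
```

With a binary click reward, a group in which every rollout hits or every rollout misses yields all-zero advantages, so only mixed groups contribute a learning signal.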
