
Conversation

@Beichen-Ma

As in #530, try to reproduce with scripts/run-qwen3-14B.sh. The script succeeds with no failures.

@gemini-code-assist
Contributor

Summary of Changes

Hello @Beichen-Ma, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a dedicated test script for the Qwen3-14B model, designed to validate its functionality and reproduce a specific issue. The script orchestrates a complex training and evaluation workflow using Ray for distributed execution, configuring numerous parameters for model loading, data processing, performance tuning, and optimization. Its primary impact is to provide a robust and reproducible environment for testing this particular model.

Highlights

  • New Test Script Added: A new bash script, scripts/run-qwen3-14B.sh, has been introduced to facilitate testing and reproduction of issues related to the Qwen3-14B model.
  • Ray and SGLang Integration: The script includes commands to manage Ray and SGLang processes, ensuring a clean execution environment and leveraging Ray for distributed job submission.
  • Comprehensive Model Configuration: Various argument arrays are defined within the script to configure checkpoint paths, rollout parameters, evaluation settings, performance optimizations (e.g., tensor parallelism, recomputation), GRPO, Adam optimizer, and SGLang specific settings for the train.py script.
  • Environment Variable Setup: The script dynamically sets critical environment variables like PYTHONPATH, CUDA_DEVICE_MAX_CONNECTIONS, and NCCL_NVLS_ENABLE based on NVLink detection, ensuring proper runtime configuration.
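
As a rough illustration of the environment setup described in the last highlight, the pattern typically looks like the sketch below (the detection command, the GB/s heuristic, and the PYTHONPATH value are assumptions, not the script's actual code):

# Enable NCCL NVLS collectives only when NVLink is detected on this node.
if nvidia-smi nvlink --status 2>/dev/null | grep -q "GB/s"; then
  export NCCL_NVLS_ENABLE=1
else
  export NCCL_NVLS_ENABLE=0
fi
export CUDA_DEVICE_MAX_CONNECTIONS=1
export PYTHONPATH=/path/to/repo:${PYTHONPATH}   # placeholder repository path
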
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature               Command               Description
Code Review           /gemini review        Performs a code review for the current pull request in its current state.
Pull Request Summary  /gemini summary       Provides a summary of the current pull request in its current state.
Comment               @gemini-code-assist   Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                  /gemini help          Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in sharing feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@Beichen-Ma
Author

Hi @zhaochenyang20, I put the script here for reference.

Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request introduces a test script for Qwen3-14B. A critical security concern has been identified: the Ray dashboard and its Job Submission API are exposed to the network without authentication, which could allow unauthorized users to execute arbitrary code on the training cluster. Beyond this, the review also addresses general issues to improve the script's robustness, correctness, and portability, such as correcting an environment variable for Python output buffering, removing redundant cleanup commands, parameterizing hardcoded paths, and fixing a hardcoded IP in the ray job submit command. Minor shell scripting best practice issues, like unquoted variables, were also noted.


# launch the master node of ray in container
export MASTER_ADDR=${MASTER_ADDR:-"127.0.0.1"}
ray start --head --node-ip-address ${MASTER_ADDR} --num-gpus 8 --disable-usage-stats --dashboard-host=0.0.0.0 --dashboard-port=8265

Severity: high (security)

The Ray dashboard is configured to listen on all interfaces (0.0.0.0) without authentication via the --dashboard-host=0.0.0.0 flag. The Ray dashboard includes a Job Submission API that allows executing arbitrary code on the cluster. Since Ray does not enable authentication for the dashboard by default, exposing it to the network allows any user with network access to achieve Remote Code Execution (RCE) on the training cluster. It is recommended to bind the dashboard to 127.0.0.1 and use SSH tunneling for remote access, or ensure the dashboard is protected by a firewall. Additionally, the ${MASTER_ADDR} variable should be quoted to prevent word splitting.

Suggested change
ray start --head --node-ip-address ${MASTER_ADDR} --num-gpus 8 --disable-usage-stats --dashboard-host=0.0.0.0 --dashboard-port=8265
ray start --head --node-ip-address "${MASTER_ADDR}" --num-gpus 8 --disable-usage-stats --dashboard-host=127.0.0.1 --dashboard-port=8265
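
If remote access to the dashboard is still needed after binding it to 127.0.0.1, an SSH tunnel keeps it off the open network. A usage sketch (user@head-node is a placeholder):

# Forward local port 8265 to the dashboard on the head node, then open
# http://127.0.0.1:8265 in a local browser.
ssh -N -L 8265:127.0.0.1:8265 user@head-node
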

}
}"

ray job submit --address="http://127.0.0.1:8265" \

Severity: high

The ray job submit command uses a hardcoded IP address 127.0.0.1. However, MASTER_ADDR is made configurable earlier in the script for ray start. If MASTER_ADDR is set to something other than 127.0.0.1, this command will fail to connect to the Ray head node. To ensure correctness, you should use the ${MASTER_ADDR} variable here.

Suggested change
ray job submit --address="http://127.0.0.1:8265" \
ray job submit --address="http://${MASTER_ADDR}:8265" \
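
With that change, a single environment variable drives both commands. A usage sketch (10.0.0.5 is a placeholder head-node address, and direct invocation of the script is assumed):

# ray start and ray job submit now resolve to the same host.
MASTER_ADDR=10.0.0.5 bash scripts/run-qwen3-14B.sh
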

Comment on lines +9 to +11
sleep 3
pkill -9 ray
pkill -9 python

Severity: medium

These pkill commands are redundant. The ray stop --force and the first set of pkill commands on lines 6-8 should be sufficient to clean up processes from a previous run. Repeating them is unnecessary.
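
For reference, a single cleanup pass along these lines should suffice before restarting Ray (a sketch only; the script's actual lines 6-8 are not reproduced here):

ray stop --force
pkill -9 ray || true      # pkill exits 1 when nothing matches; || true keeps set -e from aborting
pkill -9 python || true
sleep 3
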

set -ex

# will prevent ray from buffering stdout/stderr
export PYTHONBUFFERED=16

Severity: medium

The environment variable that disables Python's output buffering is PYTHONUNBUFFERED, not PYTHONBUFFERED, and it is typically set to 1. The current setting likely has no effect.

Suggested change
export PYTHONBUFFERED=16
export PYTHONUNBUFFERED=1
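
The difference is easy to observe when stdout is a pipe, which is how Ray captures worker output. An illustration, not part of the PR:

# Block-buffered: "tick" only appears once the process exits after the sleep.
python3 -c 'import time; print("tick", end=""); time.sleep(5)' | cat
# Unbuffered: "tick" appears immediately.
PYTHONUNBUFFERED=1 python3 -c 'import time; print("tick", end=""); time.sleep(5)' | cat
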

Comment on lines +30 to +35
--hf-checkpoint /root/Qwen3-14B
#--hf-checkpoint /root/Qwen3-4B-FP8
--ref-load /root/Qwen3-14B_torch_dist
--load /root/Qwen3-14B_miles/
--save /root/Qwen3-14B_miles/
--save-interval 20

Severity: medium

The paths here are hardcoded, which makes the script less portable. It's a good practice to parameterize these paths using an environment variable for the root directory. You can define a variable like DATA_ROOT=${DATA_ROOT:-/root} at the beginning of the script and use it here. This principle applies to other hardcoded paths in the script as well.

Suggested change
--hf-checkpoint /root/Qwen3-14B
#--hf-checkpoint /root/Qwen3-4B-FP8
--ref-load /root/Qwen3-14B_torch_dist
--load /root/Qwen3-14B_miles/
--save /root/Qwen3-14B_miles/
--save-interval 20
--hf-checkpoint "${DATA_ROOT}/Qwen3-14B"
#--hf-checkpoint "${DATA_ROOT}/Qwen3-4B-FP8"
--ref-load "${DATA_ROOT}/Qwen3-14B_torch_dist"
--load "${DATA_ROOT}/Qwen3-14B_miles/"
--save "${DATA_ROOT}/Qwen3-14B_miles/"
--save-interval 20
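
A minimal sketch of that parameterization near the top of the script (the override path in the commented usage line is a placeholder):

# Default to /root, the value currently hardcoded, but allow callers to override it.
DATA_ROOT=${DATA_ROOT:-/root}

# Example override when checkpoints live elsewhere:
#   DATA_ROOT=/mnt/checkpoints bash scripts/run-qwen3-14B.sh
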
