1 change: 1 addition & 0 deletions .current_experiment_id
@@ -0,0 +1 @@
cicd_1781407900
268 changes: 268 additions & 0 deletions cell_output.log
@@ -0,0 +1,268 @@
warning: `VIRTUAL_ENV=/tmp/builds/YQxxH4yPp/0/omniml/integration/nmm-sandbox/.venv-intern-agent` does not match the project environment path `.venv` and will be ignored; use `--active` to target the active environment instead
Using CPython 3.12.13 interpreter at: /usr/local/bin/python
Creating virtual environment at: .venv
warning: No `requires-python` value found in the workspace. Defaulting to `>=3.12`.
Updating https://github.com/NVIDIA-NeMo/Run (HEAD)
Updated https://github.com/NVIDIA-NeMo/Run (1e26b6a98a756575c10a9a0ea9661fac0c7ad776)
warning: Failed to hardlink files; falling back to full copy. This may lead to degraded performance.
If the cache and target directories are on different filesystems, hardlinking may not be supported.
If this is intentional, set `export UV_LINK_MODE=copy` or use `--link-mode=copy` to suppress this warning.
Installed 149 packages in 2.87s
Configuring global options
Dry run for task __main__:cicd
Resolved Arguments
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Argument Name ┃ Resolved Value ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ detach │ True │
│ hf_local │ None │
│ identity │ '/.ssh/id_ed25519' │
│ job_dir │ '/lustre/fsw/portfolios/coreai/users/chenhany/experiment… │
│ job_name │ 'NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mt… │
│ pipeline │ SandboxPipeline( │
│ │ global_vars=GlobalVariables( │
│ │ hf_model='/hf-local/nvidia/NVIDIA-Nemotron-3-Super-1… │
│ │ task_0=SandboxTask0( │
│ │ script='common/specdec_bench/run.sh', │
│ │ slurm_config=SlurmConfig( │
│ │ host='cw-dfw-cs-001-login-01.nvidia.com', │
│ │ account='coreai_dlalgo_modelopt', │
│ │ partition='batch', │
│ │ container='vllm/vllm-openai:v0.22.1', │
│ │ modelopt_install_path='/usr/local/lib/python3.12/d… │
│ │ container_mounts=['/lustre/fsw/portfolios/coreai/p… │
│ │ '/lustre:/lustre', '/cm:/cm', │
│ │ '/var/run/munge:/var/run/munge'], │
│ │ srun_args=['--no-container-mount-home'], │
│ │ array=None, │
│ │ nodes=1, │
│ │ ntasks_per_node=1, │
│ │ gpus_per_node=4), │
│ │ args=['--dataset speed', '--dataset_path │
│ │ /hf-local/nvidia/SPEED-Bench-Internal/qualitative', │
│ │ '--engine VLLM', '--speculative_algorithm MTP', │
│ │ '--draft_length 3', '--tp_size 4', '--ep_size 1', │
│ │ '--concurrency 32', '--output_length 4096', │
│ │ '--aa_timing', '--show_progress', '--save_dir │
│ │ /scratchspace/{sweep_name_default}/qualitative', │
│ │ '--temperature 0', '--max_seq_len 65536', '--save_dir │
│ │ /scratchspace/NVIDIA-Nemotron-3-Super-120B-A12B-BF16_mtp… │
│ │ '--draft_length 7'], │
│ │ environment=[{'HF_MODEL_CKPT': │
│ │ '<<global_vars.hf_model>>'}, {'HF_LOCAL': '/hf-local'}]), │
│ │ task_1=SandboxTask1( │
│ │ script='common/specdec_bench/run.sh', │
│ │ slurm_config=SlurmConfig( │
│ │ host='cw-dfw-cs-001-login-01.nvidia.com', │
│ │ account='coreai_dlalgo_modelopt', │
│ │ partition='batch', │
│ │ container='vllm/vllm-openai:v0.22.1', │
│ │ modelopt_install_path='/usr/local/lib/python3.12/d… │
│ │ container_mounts=['/lustre/fsw/portfolios/coreai/p… │
│ │ '/lustre:/lustre', '/cm:/cm', │
│ │ '/var/run/munge:/var/run/munge'], │
│ │ srun_args=['--no-container-mount-home'], │
│ │ array=None, │
│ │ nodes=1, │
│ │ ntasks_per_node=1, │
│ │ gpus_per_node=4), │
│ │ args=['--dataset speed', '--dataset_path │
│ │ /hf-local/nvidia/SPEED-Bench-Internal/throughput_32k', │
│ │ '--engine VLLM', '--speculative_algorithm MTP', │
│ │ '--draft_length 3', '--tp_size 4', '--ep_size 1', │
│ │ '--concurrency 8', '--num_requests 80', '--output_length │
│ │ 4096', '--aa_timing', '--show_progress', '--save_dir │
│ │ /scratchspace/{sweep_name_default}/throughput_32k', │
│ │ '--temperature 0', '--max_seq_len 65536', '--save_dir │
│ │ /scratchspace/NVIDIA-Nemotron-3-Super-120B-A12B-BF16_mtp… │
│ │ '--num_requests 80', '--draft_length 7'], │
│ │ environment=[{'HF_MODEL_CKPT': │
│ │ '<<global_vars.hf_model>>'}, {'HF_LOCAL': '/hf-local'}])) │
│ task │ None │
│ test_level │ 0 │
│ user │ 'chenhany' │
└──────────────────┴───────────────────────────────────────────────────────────┘
Launching cicd...
============================================================
Version Report
============================================================
Launcher e5bcf04 (main)
Model-Optimizer 7fa55f475 (pensieve-intern/OMNIML-5095/cell-t0-d7)
============================================================
────────────── Entering Experiment cicd with id: cicd_1781409994 ───────────────
job NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm task 0 slurm_config: SlurmConfig(host='cw-dfw-cs-001-login-01.nvidia.com', port=22, account='coreai_dlalgo_modelopt', partition='batch', qos=None, container='vllm/vllm-openai:v0.22.1', modelopt_install_path='/usr/local/lib/python3.12/dist-packages/modelopt', container_mounts=['/lustre/fsw/portfolios/coreai/projects/coreai_dlalgo_modelopt/hf-local:/hf-local', '/lustre:/lustre', '/cm:/cm', '/var/run/munge:/var/run/munge'], srun_args=['--no-container-mount-home'], array=None, nodes=1, ntasks_per_node=1, gpus_per_node=4, time='04:00:00', local=False, segment=None)
job NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm task 1 slurm_config: SlurmConfig(host='cw-dfw-cs-001-login-01.nvidia.com', port=22, account='coreai_dlalgo_modelopt', partition='batch', qos=None, container='vllm/vllm-openai:v0.22.1', modelopt_install_path='/usr/local/lib/python3.12/dist-packages/modelopt', container_mounts=['/lustre/fsw/portfolios/coreai/projects/coreai_dlalgo_modelopt/hf-local:/hf-local', '/lustre:/lustre', '/cm:/cm', '/var/run/munge:/var/run/munge'], srun_args=['--no-container-mount-home'], array=None, nodes=1, ntasks_per_node=1, gpus_per_node=4, time='04:00:00', local=False, segment=None)
find: ‘modules/Megatron-LM/megatron/*’: No such file or directory
find: ‘modules/Megatron-LM/examples/*’: No such file or directory
find: ‘modules/Megatron-LM/*.py’: No such file or directory
find: ‘modules/Model-Optimizer-Internal/**’: No such file or directory
find: ‘modules/Megatron-LM/megatron/*’: No such file or directory
find: ‘modules/Megatron-LM/examples/*’: No such file or directory
find: ‘modules/Megatron-LM/*.py’: No such file or directory
find: ‘modules/Model-Optimizer-Internal/**’: No such file or directory
[04:06:40] Connecting to client.py:257
chenhany@cw-dfw-cs-001-login-01.nvidia.com
[04:06:40] INFO Connected (version 2.0, client transport.py:1786
OpenSSH_8.9p1)
INFO Authentication (publickey) successful! transport.py:1786
INFO rsyncing rsync.py:37
/tmp/pensieve-intern-agent-aw0fjfab/workspace/ex
periments/cicd/cicd_1781409994 to
/lustre/fsw/portfolios/coreai/users/chenhany/exp
eriments/cicd ...
[04:07:05] INFO Successfully ran `rsync -pthrvz --rsh='ssh -i rsync.py:93
/.ssh/id_ed25519 -p 22 '
/tmp/pensieve-intern-agent-aw0fjfab/workspace/ex
periments/cicd/cicd_1781409994
chenhany@cw-dfw-cs-001-login-01.nvidia.com:/lust
re/fsw/portfolios/coreai/users/chenhany/experime
nts/cicd`
[04:07:06] Launching job experiment.py:800
NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_benc
h_mtp_vllm_0 for experiment cicd
[04:07:06] INFO Launched app: launcher.py:116
slurm_tunnel://nemo_run/12789058
Launching job experiment.py:800
NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_benc
h_mtp_vllm_1 for experiment cicd
[SLURM] Job 12789058 - State: PENDING, Estimated start: N/A, Current time: 2026-06-14 04:07:06
INFO Launched app: launcher.py:116
slurm_tunnel://nemo_run/12789059
────────────────── Detaching from Experiment cicd_1781409994. ──────────────────
Task specific cleanup won't be run. experiment.py:1212
Ephemeral logs and artifacts may be lost.
[SLURM] Job 12789059 - State: PENDING, Estimated start: N/A, Current time: 2026-06-14 04:07:06

Experiment Status for cicd_1781409994

Task 0: NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0
- Status: SUBMITTED
- Executor: SlurmExecutor on chenhany@cw-dfw-cs-001-login-01.nvidia.com
- Job id: 12789058
- Local Directory: /tmp/pensieve-intern-agent-aw0fjfab/workspace/experiments/cicd/cicd_1781409994/NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0
- Remote Directory: /lustre/fsw/portfolios/coreai/users/chenhany/experiments/cicd/cicd_1781409994/NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0

Task 1: NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1
- Status: SUBMITTED
- Executor: SlurmExecutor on chenhany@cw-dfw-cs-001-login-01.nvidia.com
- Job id: 12789059
- Local Directory: /tmp/pensieve-intern-agent-aw0fjfab/workspace/experiments/cicd/cicd_1781409994/NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1
- Remote Directory: /lustre/fsw/portfolios/coreai/users/chenhany/experiments/cicd/cicd_1781409994/NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1


# The experiment was run with the following tasks: ['NVIDIA-Nemotron-3-Super-120
# You can inspect and reconstruct this experiment at a later point in time using
experiment = run.Experiment.from_id("cicd_1781409994")
experiment.status() # Gets the overall status
experiment.logs("NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0
experiment.cancel("NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm


# You can inspect this experiment at a later point in time using the CLI as well
nemo experiment status cicd_1781409994
nemo experiment logs cicd_1781409994 0
nemo experiment cancel cicd_1781409994 0

Found 1 experiment(s): cicd_1781409994

=== [2026-06-14 04:07:13] Polling iteration 1/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: RUNNING
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: PENDING

Summary: 0 succeeded, 0 failed, 0 cancelled, 1 running, 1 pending
Waiting 180s before next poll...

=== [2026-06-14 04:10:15] Polling iteration 2/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: RUNNING
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: PENDING

Summary: 0 succeeded, 0 failed, 0 cancelled, 1 running, 1 pending
Waiting 180s before next poll...

=== [2026-06-14 04:13:18] Polling iteration 3/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: RUNNING
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: PENDING

Summary: 0 succeeded, 0 failed, 0 cancelled, 1 running, 1 pending
Waiting 180s before next poll...

=== [2026-06-14 04:16:20] Polling iteration 4/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: RUNNING
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: PENDING

Summary: 0 succeeded, 0 failed, 0 cancelled, 1 running, 1 pending
Waiting 180s before next poll...

=== [2026-06-14 04:19:23] Polling iteration 5/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: RUNNING
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: PENDING

Summary: 0 succeeded, 0 failed, 0 cancelled, 1 running, 1 pending
Waiting 180s before next poll...

=== [2026-06-14 04:22:25] Polling iteration 6/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: RUNNING
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: PENDING

Summary: 0 succeeded, 0 failed, 0 cancelled, 1 running, 1 pending
Waiting 180s before next poll...

=== [2026-06-14 04:25:28] Polling iteration 7/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: SUCCEEDED
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: RUNNING

Summary: 1 succeeded, 0 failed, 0 cancelled, 1 running, 0 pending
Waiting 180s before next poll...

=== [2026-06-14 04:28:31] Polling iteration 8/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: SUCCEEDED
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: RUNNING

Summary: 1 succeeded, 0 failed, 0 cancelled, 1 running, 0 pending
Waiting 180s before next poll...

=== [2026-06-14 04:31:33] Polling iteration 9/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: SUCCEEDED
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: RUNNING

Summary: 1 succeeded, 0 failed, 0 cancelled, 1 running, 0 pending
Waiting 180s before next poll...

=== [2026-06-14 04:34:36] Polling iteration 10/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: SUCCEEDED
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: RUNNING

Summary: 1 succeeded, 0 failed, 0 cancelled, 1 running, 0 pending
Waiting 180s before next poll...

=== [2026-06-14 04:37:38] Polling iteration 11/14400 ===
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_0: SUCCEEDED
cicd_1781409994 / NVIDIA-Nemotron-3-Super-120B-A12B-BF16_specdec_bench_mtp_vllm_1: SUCCEEDED

Summary: 2 succeeded, 0 failed, 0 cancelled, 0 running, 0 pending

All experiments complete.
SUCCEEDED: 2
FAILED: 0
CANCELLED: 0

=== Fetching experiment logs ===
Fetching logs: cicd_1781409994 task 0
Fetching logs: cicd_1781409994 task 1
=== Done fetching logs ===
qualitative Average_AL= 3.4504
qualitative Category_AL coding = 3.8083
qualitative Category_AL humanities = 3.2641
qualitative Category_AL math = 3.7108
qualitative Category_AL multilingual = 4.0035
qualitative Category_AL qa = 3.1859
qualitative Category_AL rag = 3.7782
qualitative Category_AL reasoning = 3.5766
qualitative Category_AL roleplay = 2.8088
qualitative Category_AL stem = 3.271
qualitative Category_AL summarization = 3.5193
qualitative Category_AL writing = 3.0275
throughput_32k Average_AL= 3.6133
throughput_32k Category_AL high_entropy = 3.0085
throughput_32k Category_AL low_entropy = 4.1817
throughput_32k Category_AL mixed = 3.6706
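
The summary lines at the end of the log follow a regular shape (`<split> Average_AL= <value>` and `<split> Category_AL <category> = <value>`), so they can be collected into a dictionary for comparison across runs. The sketch below is a minimal illustration, not part of the launcher or of specdec_bench: the helper name `parse_al_summary`, the `"average"` key, and the regular expression are assumptions based only on the lines shown above. The log does not expand the abbreviation "AL", so the code treats it as an opaque metric name.

```python
import re
from collections import defaultdict

# Hypothetical helper (not part of the launcher or specdec_bench).
# The pattern is inferred from the summary lines at the end of this log only.
LINE_RE = re.compile(
    r"^(?P<split>\S+)\s+"
    r"(?:Average_AL=|Category_AL\s+(?P<category>\S+)\s*=)\s*"
    r"(?P<value>[0-9.]+)\s*$"
)


def parse_al_summary(text):
    """Return {split: {category_or_'average': value}} for matching lines."""
    results = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip polling output, banners, and other log noise
        key = match.group("category") or "average"
        results[match.group("split")][key] = float(match.group("value"))
    return dict(results)


# A few lines copied from the log above, plus one non-matching line.
SAMPLE = """\
=== Done fetching logs ===
qualitative Average_AL= 3.4504
qualitative Category_AL coding = 3.8083
throughput_32k Average_AL= 3.6133
throughput_32k Category_AL low_entropy = 4.1817
"""

if __name__ == "__main__":
    for split, metrics in parse_al_summary(SAMPLE).items():
        print(split, metrics)
```

Keying by split first keeps the `qualitative` and `throughput_32k` results separate, which matches how the two tasks write to separate save directories in the resolved arguments.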