Commit a23ab5b

clean up
1 parent 4f8e621 commit a23ab5b

2 files changed: 3 additions, 6 deletions

run_train.sh

Lines changed: 2 additions & 5 deletions
@@ -10,11 +10,8 @@ set -ex
 # use envs as local overwrites for convenience
 # e.g.
 # LOG_RANK=0,1 NGPU=4 ./run_train.sh
-# NGPU=${NGPU:-"8"}
-NGPU=${NGPU:-"4"}
-# export LOG_RANK=${LOG_RANK:-0,1,2,3,4,5,6,7}
-# export LOG_RANK=${LOG_RANK:-0,1,2,3}
-export LOG_RANK=${LOG_RANK:-0,1,2,3}
+NGPU=${NGPU:-"8"}
+export LOG_RANK=${LOG_RANK:-0}
 CONFIG_FILE=${CONFIG_FILE:-"./torchtitan/models/llama3/train_configs/debug_model.toml"}
 TRAIN_FILE=${TRAIN_FILE:-"torchtitan.train"}
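
The hunk above relies on the shell's `${VAR:-default}` expansion, so the restored defaults (`NGPU=8`, `LOG_RANK=0`) apply only when the caller has not set a value. A minimal sketch of that behavior follows; `resolve_defaults` is a hypothetical helper written for illustration and is not part of `run_train.sh`.

```shell
#!/bin/sh
# Illustrative sketch only: resolve_defaults is a hypothetical helper,
# not a function defined in run_train.sh.
resolve_defaults() {
    # ${VAR:-default} yields the default when VAR is unset OR empty.
    ngpu="${NGPU:-8}"
    log_rank="${LOG_RANK:-0}"
    echo "NGPU=${ngpu} LOG_RANK=${log_rank}"
}

# No overrides: the defaults from this commit apply.
( unset NGPU LOG_RANK; resolve_defaults )    # prints "NGPU=8 LOG_RANK=0"

# Overrides, as in the script's own usage comment.
( NGPU=4; LOG_RANK=0,1; resolve_defaults )   # prints "NGPU=4 LOG_RANK=0,1"

# An empty value also falls back to the default, because of the colon.
( NGPU=""; unset LOG_RANK; resolve_defaults ) # prints "NGPU=8 LOG_RANK=0"
```

Each case runs in a subshell so that one override does not leak into the next. With the real script, the equivalent override is the one shown in its comment: `LOG_RANK=0,1 NGPU=4 ./run_train.sh`.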

torchtitan/tools/profiling.py

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 from torchtitan.tools.logging import logger

 # the number of warmup steps before the active step in each profiling cycle
-WARMUP = 0
+WARMUP = 3

 # how much memory allocation/free ops to record in memory snapshots
 MEMORY_SNAPSHOT_MAX_ENTRIES = 100000
