Commit 63a91d7

[float8 training] update torchtitan benchmark script args (#2392)
update torchtitan benchmark script args
1 parent bc80a5d commit 63a91d7

2 files changed, with 4 additions and 4 deletions

benchmarks/float8/training/README.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ Training parameters can be configured via environment variables.
 - `TORCHTITAN_ROOT`: Root directory of torchtitan in your local filesystem
 - Optional:
 - `FLOAT8_RECIPE_WITH_BEST_SETTINGS`: "rowwise" or "tensorwise". Applies float8 training with the specified scaling recipe, as well as additional training configs which are optimal for that scaling recipe. See `torchtitan_benchmark.sh` for more details.
-- `BATCH_SIZE`: Defaults to 1.
+- `LOCAL_BATCH_SIZE`: Defaults to 1.
 - `STEPS`: Defaults to 100.
 - `EXTRA_ARGS`: Extra arguments to pass to torchtitan training script. See [torchtitan](https://github.com/pytorch/torchtitan) docs for the full list of options.
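As far as this diff shows, the script now reads only `LOCAL_BATCH_SIZE`, so a caller that still exports the old `BATCH_SIZE` name would silently get the default of 1. The sketch below illustrates one way a caller-side wrapper could bridge the rename with nested `${VAR:-default}` expansion. The `resolve_local_batch_size` helper is hypothetical and is not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): prefer LOCAL_BATCH_SIZE, fall back
# to the legacy BATCH_SIZE name, and finally to the documented default of 1.
resolve_local_batch_size() {
    echo "${LOCAL_BATCH_SIZE:-${BATCH_SIZE:-1}}"
}

# Neither variable set: the default applies.
unset LOCAL_BATCH_SIZE BATCH_SIZE
resolve_local_batch_size

# Only the legacy name set: the fallback picks it up.
BATCH_SIZE=4
resolve_local_batch_size

# Both set: the new name takes precedence.
LOCAL_BATCH_SIZE=8
resolve_local_batch_size
```

A wrapper could export `LOCAL_BATCH_SIZE="$(resolve_local_batch_size)"` before invoking the benchmark script, which keeps older launch commands working without changing the script itself.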

benchmarks/float8/training/torchtitan_benchmark.sh

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@
 # with the given parameters,

 # script arguments
-BATCH_SIZE=${BATCH_SIZE:-1}
+LOCAL_BATCH_SIZE=${LOCAL_BATCH_SIZE:-1}
 STEPS=${STEPS:-100}

 # temporary log file which is deleted after performance data is parsed out and metrics are calculated.
@@ -20,7 +20,7 @@ if [ -z "${TORCHTITAN_ROOT}" ]; then
 echo "Usage: TORCHTITAN_ROOT=<directory> ./float8_training_benchmark.sh"
 echo "Optional parameters configurable via environment variables:"
 echo " * FLOAT8_RECIPE_WITH_BEST_SETTINGS: "rowwise" or "tensorwise". if set, use float8 training in torchtitan with the specified recipe, including the additional settings which are optimal for that recipe. otherwise, use bf16 mixed precision training."
-echo " * BATCH_SIZE: defaults to 1."
+echo " * LOCAL_BATCH_SIZE: defaults to 1."
 echo " * STEPS: defaults to 100."
 echo " * EXTRA_ARGS: additional arguments to pass to the torchtitan training script."
 exit 1
@@ -45,7 +45,7 @@ cd ${TORCHTITAN_ROOT}
 echo "float8 args: ${FLOAT8_ARGS}"

 # run the command with the specified arguments
-CONFIG_FILE="./torchtitan/models/llama3/train_configs/llama3_8b.toml" ${TORCHTITAN_ROOT}/run_train.sh --training.steps=${STEPS} --training.batch_size=${BATCH_SIZE} --training.compile ${FLOAT8_ARGS} ${EXTRA_ARGS} 2>&1 | tee ${LOG_FILE}
+CONFIG_FILE="./torchtitan/models/llama3/train_configs/llama3_8b.toml" ${TORCHTITAN_ROOT}/run_train.sh --training.steps=${STEPS} --training.local-batch-size=${LOCAL_BATCH_SIZE} --training.compile ${FLOAT8_ARGS} ${EXTRA_ARGS} 2>&1 | tee ${LOG_FILE}

 # return to original working directory
 cd $original_dir
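
To see how the renamed variable flows into the training flags, the following dry-run sketch prints the arguments instead of launching training. The `build_train_args` function is a hypothetical illustration. It mirrors only the `--training.steps`, `--training.local-batch-size`, and `--training.compile` flags from the updated command above and leaves out `FLOAT8_ARGS` and `EXTRA_ARGS`.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper (not in the commit): assemble the flags that the
# updated script passes to run_train.sh, using the same defaults as the script.
build_train_args() {
    echo "--training.steps=${STEPS:-100} --training.local-batch-size=${LOCAL_BATCH_SIZE:-1} --training.compile"
}

# With no overrides, the defaults from the script apply.
unset STEPS LOCAL_BATCH_SIZE
build_train_args

# With overrides, the values are substituted into the flags.
STEPS=50
LOCAL_BATCH_SIZE=2
build_train_args
```

Printing the flags first is a cheap way to check that an environment override uses the new `LOCAL_BATCH_SIZE` name before committing GPU time to a full run.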
