Commit c58cb11

committed
updated examples
1 parent 5016c78 commit c58cb11

13 files changed, +543 -27 lines changed

README.md

Lines changed: 8 additions & 8 deletions
@@ -73,14 +73,14 @@ trained_model: nn.Module = results.rank(0)
 torch.save(trained_model.state_dict(), "output/model.pth")
 ```

-**See [training GPT-2 on WikiText](https://torchrunx.readthedocs.io/stable/examples.html#training-gpt-2-on-wikitext) for more examples using the following deep learning libraries:**
-- Accelerate
-- HF Transformers
-- DeepSpeed
-- PyTorch Lightning
-- MosaicML Composer
-
-**Refer to our [API](https://torchrunx.readthedocs.io/stable/api.html) and [Advanced Usage Guide](https://torchrunx.readthedocs.io/stable/advanced.html) for many more capabilities!**
+**See examples where we fine-tune LLMs (e.g. GPT-2 on WikiText) using:**
+- [Accelerate](https://torchrun.xyz/examples/accelerate.html)
+- [HF Transformers](https://torchrun.xyz/examples/transformers.html)
+- [DeepSpeed](https://torchrun.xyz/examples/deepspeed.html)
+- [PyTorch Lightning](https://torchrun.xyz/examples/lightning.html)
+- [MosaicML Composer](https://torchrun.xyz/examples/composer.html)
+
+**Refer to our [API](https://torchrun.xyz/api.html) and [Advanced Usage Guide](https://torchrun.xyz/advanced.html) for many more capabilities!**

 ---

docs/conf.py

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 html_theme = "furo"
 language = "en"

+html_extra_path = ["source/examples/scripts"]
+
 extensions = [
     "autodoc2",
     "myst_parser",  # support markdown

docs/source/api.md

Lines changed: 1 addition & 3 deletions
@@ -1,9 +1,7 @@
 # API

-## Launching functions
-
 ```{eval-rst}
-.. autofunction:: torchrunx.launch(func: Callable, ...)
+.. autofunction:: torchrunx.launch
 ```

 We provide the {mod}`torchrunx.Launcher` class as an alias to {mod}`torchrunx.launch`.

docs/source/examples/accelerate.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 # Accelerate

 ```{eval-rst}
-.. literalinclude:: ./accelerate_example.py
+.. literalinclude:: ./scripts/accelerate_example.py
 ```

docs/source/examples/deepspeed.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # DeepSpeed

 ```{eval-rst}
-.. literalinclude:: ./deepspeed_example.py
-.. literalinclude:: ./deepspeed_config.json
+.. literalinclude:: ./scripts/deepspeed_example.py
+.. literalinclude:: ./scripts/deepspeed_config.json
 ```

docs/source/examples/lightning.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 # Pytorch Lightning

 ```{eval-rst}
-.. literalinclude:: ./lightning_example.py
+.. literalinclude:: ./scripts/lightning_example.py
 ```

docs/source/examples/transformers_example.py renamed to docs/source/examples/scripts/transformers_train.py

Lines changed: 11 additions & 7 deletions
@@ -9,6 +9,8 @@
 # ]
 # ///

+# [docs:start-after]
+import functools
 import os
 from typing import Annotated

@@ -25,7 +27,7 @@
 import torchrunx


-def build_model(name: str = "gpt2") -> PreTrainedModel:
+def build_model(name: str) -> PreTrainedModel:
     return AutoModelForCausalLM.from_pretrained(name)


@@ -41,6 +43,12 @@ def load_training_data(
     tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
     if tokenizer.pad_token is None:
         tokenizer.pad_token = tokenizer.eos_token
+    tokenize_fn = functools.partial(
+        tokenizer,
+        max_length=tokenizer.model_max_length,
+        truncation=True,
+        padding="max_length",
+    )

     dataset = load_dataset(path, name=name, split=split)

@@ -50,12 +58,7 @@ def load_training_data(
     return (
         dataset.select(range(num_samples))
         .map(
-            lambda x: tokenizer(
-                x[text_column_name],
-                max_length=tokenizer.model_max_length,
-                truncation=True,
-                padding="max_length",
-            ),
+            tokenize_fn,
             batched=True,
             input_columns=[text_column_name],
             remove_columns=[text_column_name],
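
Note on this hunk: replacing the lambda with `functools.partial` relies on the mapped function receiving the selected column as a positional argument. That is how `datasets.Dataset.map` passes data when `input_columns` is set. The sketch below imitates that calling convention with stand-ins so it runs without `transformers` or `datasets`. `fake_tokenizer` and `map_batched` are hypothetical names used only for illustration and are not part of either library.

```python
import functools


def fake_tokenizer(texts, max_length, truncation=False, padding=None):
    """Hypothetical stand-in for a tokenizer: maps characters to code points."""
    encoded = []
    for text in texts:
        ids = [ord(ch) for ch in text]
        if truncation:
            ids = ids[:max_length]
        if padding == "max_length":
            ids = ids + [0] * (max_length - len(ids))
        encoded.append(ids)
    return {"input_ids": encoded}


# Bind the keyword arguments once; the batch of texts is supplied later.
tokenize_fn = functools.partial(
    fake_tokenizer,
    max_length=4,
    truncation=True,
    padding="max_length",
)


def map_batched(columns, fn, input_columns):
    """Hypothetical imitation of a batched map: selected columns are passed positionally."""
    return fn(*(columns[name] for name in input_columns))


batch = {"text": ["hi", "hello"]}
result = map_batched(batch, tokenize_fn, input_columns=["text"])
print(result)
```

Because the column values arrive positionally, the partial needs no knowledge of the column name, whereas a lambda that indexes its argument by column name would be handed a plain list of strings.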
@@ -74,6 +77,7 @@ def train(
     )
     trainer.train()

+    # TODO: return checkpoint path
     if int(os.environ["RANK"]) == 0:
         return model
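
Note on this hunk: the function returns the model only on the worker whose `RANK` environment variable is 0, which matches the README's `results.rank(0)` lookup. A minimal sketch of that gating pattern follows. `finish_training` is a hypothetical helper and the dictionary stands in for a real model; setting `RANK` by hand here only simulates what a distributed launcher would do.

```python
import os
from typing import Optional


def finish_training(model: dict) -> Optional[dict]:
    """Return the trained object on global rank 0 only; other ranks return None."""
    if int(os.environ["RANK"]) == 0:
        return model
    return None


# Simulate two workers by setting RANK manually (a launcher normally sets it).
os.environ["RANK"] = "1"
non_zero_rank_result = finish_training({"weights": [0.1, 0.2]})

os.environ["RANK"] = "0"
rank_zero_result = finish_training({"weights": [0.1, 0.2]})

print(non_zero_rank_result)
print(rank_zero_result)
```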

docs/source/examples/transformers.md

Lines changed: 38 additions & 4 deletions
@@ -1,11 +1,45 @@
 # Transformers

+Here's an example script that uses `torchrunx` with [`transformers.Trainer`](https://huggingface.co/docs/transformers/en/main_classes/trainer) to fine-tune any causal language model (from `transformers`) on any text dataset (from `datasets`) with any number of GPUs or nodes: [https://torchrun.xyz/transformers_train.py](https://torchrun.xyz/transformers_train.py).
+
+You can pass command-line arguments to customize:
+- `--launcher`: [torchrunx.Launcher](../api.md#torchrunx.Launcher)
+- `--model`: [`transformers.AutoModelForCausalLM`](https://huggingface.co/docs/transformers/en/model_doc/auto#transformers.AutoModelForCausalLM)
+- `--dataset`: [`transformers.AutoTokenizer`](https://huggingface.co/docs/transformers/en/model_doc/auto#transformers.AutoTokenizer) and [`datasets.load_dataset`](https://huggingface.co/docs/datasets/en/package_reference/loading_methods#datasets.load_dataset)
+- `--trainer`: [`transformers.TrainingArguments`](https://huggingface.co/docs/transformers/en/main_classes/trainer#transformers.TrainingArguments)
+
+The following arguments are required: `--model.name`, `--dataset.tokenizer-name`, `--dataset.path`, `--trainer.output-dir`.
+
+<details>
+<summary><p style="display: inline-block;"><code class="docutils literal notranslate"><span class="pre">python transformers_train.py --help</span></code></p> (expand)</summary>
+
+```{eval-rst}
+.. literalinclude:: ./transformers_help.txt
+```
+</details>
+
+Of course, this script is a template: you can also edit the script first, as desired.
+
+### Training GPT-2 on WikiText in One Line
+
+The following one-line command runs our script end-to-end (installing all dependencies, downloading model and data, training, logging to TensorBoard, etc.).
+
+Pre-requisites: [uv](https://docs.astral.sh/uv)
+
 ```bash
-uv run torchrun.xyz/torchrunx_transformers.py \
-   --launcher.hostnames localhost --launcher.workers-per-host 2 \
-   --args.output_dir output --args.per-device-train-batch-size 4 --args.report-to tensorboard
+uv run https://torchrun.xyz/transformers_train.py \
+   --model.name gpt2 --dataset.tokenizer-name gpt2 \
+   --dataset.path "Salesforce/wikitext" --dataset.name "wikitext-2-v1" --dataset.split "train" --dataset.num-samples 80 \
+   --trainer.output_dir output --trainer.per-device-train-batch-size 4 --trainer.report-to tensorboard
 ```

+We don't need to pass `--launcher` arguments by default. But if you want to do multi-node training (and are not using SLURM), you can also pass e.g. `--launcher.hostnames node1 node2`.
+
+### Script
+
+[The [raw source code](https://torchrun.xyz/transformers_train.py) also specifies dependencies at the top of the file — in [PEP 723](https://peps.python.org/pep-0723) format — e.g. for `uv` as above.]
+
 ```{eval-rst}
-.. literalinclude:: ./transformers_example.py
+.. literalinclude:: ./scripts/transformers_train.py
+   :start-after: # [docs:start-after]
 ```
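
Note on this hunk: the `:start-after:` option makes Sphinx's `literalinclude` show only the lines that follow the first line containing the given string. Here that hides the PEP 723 dependency header above the `# [docs:start-after]` marker added to the script. The sketch below reproduces that selection rule in plain Python. `lines_after_marker` is a hypothetical helper, not Sphinx's implementation, and the sample script text is made up for illustration.

```python
def lines_after_marker(source: str, marker: str) -> str:
    """Return the text following the first line that contains `marker`."""
    lines = source.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if marker in line:
            return "".join(lines[index + 1:])
    raise ValueError(f"marker {marker!r} not found")


# Made-up script text: an inline metadata header, the marker, then the code.
script = (
    "# /// script\n"
    "# dependencies = []\n"
    "# ///\n"
    "\n"
    "# [docs:start-after]\n"
    "import functools\n"
    "import os\n"
)

rendered = lines_after_marker(script, "# [docs:start-after]")
print(rendered)
```

The raw file served at the site root still contains the header, so `uv run` can read the dependencies while the rendered docs page shows only the code.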
