
Commit a764e03

docs: improve training docs
1 parent 65669dc commit a764e03

2 files changed: +35 -18 lines changed


docs/tutorials/training-ner.md

Lines changed: 19 additions & 4 deletions
@@ -115,7 +115,7 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 
 # 🎛️ OPTIMIZER
 optimizer:
-  "@core": optimizer
+  "@core": optimizer !draft # (2)!
   optim: adamw
   groups:
     # Assign parameters starting with transformer (ie the parameters of the transformer component)
@@ -133,7 +133,6 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
       "warmup_rate": 0.1
       "start_value": 3e-4
       "max_value": 3e-4
-  module: ${ nlp }
   total_steps: ${ train.max_steps }
 
 # 📚 DATA
@@ -216,6 +215,14 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 1. Why do we use `'@core': pipeline` here? Because we need the reference used in `optimizer.module = ${ nlp }` to be the actual Pipeline and not its keyword arguments: when confit sees `'@core': pipeline`, it will instantiate the `Pipeline` class with the arguments provided in the dict.
 
     In fact, you could also use `'@core': eds.pipeline` in every config when you define a pipeline, but sometimes it's more convenient to let Confit infer the type of the nlp argument from the function's type hints. Not specifying `'@core': pipeline` is also more aligned with `spacy`'s pipeline config API. However, in general, explicit is better than implicit, so feel free to explicitly write `'@core': eds.pipeline` when you define a pipeline.
+1. What does "draft" mean here? We'll let the train function pass the nlp object
+   to the optimizer after it has been `post_init`'ed: `post_init` is the operation that
+   looks at some data, finds how many labels the model must learn, and updates the model weights
+   to have as many heads as there are labels observed in the train data. This function will be
+   called by `train`, so the optimizer should be defined *after*, when the model parameter
+   tensors are final. To do that, instead of instantiating the optimizer right now, we create
+   a "Draft", which will be instantiated inside the `train` function, once all the required
+   parameters are set.
 
 To train the model, you can use the following command:
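For readers skimming the diff, the deferred-instantiation idea behind `!draft` / `ScheduledOptimizer.draft(...)` can be pictured with a short sketch. This is only an illustration of the concept; the `LazyDraft` class and its `instantiate` method below are hypothetical names, not edsnlp's or confit's actual Draft API.

```python
# Conceptual sketch only — hypothetical LazyDraft, not edsnlp/confit's real Draft class.
from typing import Any, Callable


class LazyDraft:
    """Stores a target callable plus the kwargs known now; the rest come later."""

    def __init__(self, target: Callable[..., Any], **early_kwargs: Any):
        self.target = target
        self.early_kwargs = early_kwargs

    def instantiate(self, **late_kwargs: Any) -> Any:
        # Called once the missing pieces (e.g. module=nlp after post_init) are available.
        return self.target(**self.early_kwargs, **late_kwargs)


# At config time, only the hyper-parameters are known:
#   optimizer = LazyDraft(ScheduledOptimizer, optim="adamw", total_steps=2000, groups=[...])
# Inside train(), after the nlp object has been post_init'ed and its weights are final:
#   optimizer = optimizer.instantiate(module=nlp)
```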

@@ -277,9 +284,8 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 
 # 🎛️ OPTIMIZER
 max_steps = 2000
-optimizer = ScheduledOptimizer(
+optimizer = ScheduledOptimizer.draft( # (1)!
     optim=torch.optim.Adam,
-    module=nlp,
     total_steps=max_steps,
     groups=[
         {
@@ -333,6 +339,15 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 )
 ```
 
+1. Wait, what does "draft" mean here? We'll let the train function pass the nlp object
+   to the optimizer after it has been `post_init`'ed: `post_init` is the operation that
+   looks at some data, finds how many labels the model must learn, and updates the model weights
+   to have as many heads as there are labels observed in the train data. This function will be
+   called by `train`, so the optimizer should be defined *after*, when the model parameter
+   tensors are final. To do that, instead of instantiating the optimizer right now, we create
+   a "Draft", which will be instantiated inside the `train` function, once all the required
+   parameters are set.
+
 or use the config file:
 
 ```{ .python .no-check }
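
As a side note on why the ordering matters: here is a tiny, self-contained PyTorch illustration, independent of edsnlp (the label names are made up), of why an optimizer created before the label-dependent head exists would miss its parameters.

```python
import torch

# post_init conceptually does this: look at the data and count the labels...
observed_labels = ["DRUG", "DOSE", "DATE"]  # made-up labels for the example

hidden_size = 128
# ...and size the classification head accordingly.
head = torch.nn.Linear(hidden_size, len(observed_labels))

# Only now is it safe to build the optimizer: the parameter tensors are final.
optimizer = torch.optim.AdamW(head.parameters(), lr=3e-4)

# An optimizer built before the head was (re)built would not contain head.weight
# and head.bias in its parameter groups, so they would never receive updates.
```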

docs/tutorials/training-span-classifier.md

Lines changed: 16 additions & 14 deletions
@@ -184,13 +184,14 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 ```
 
 1. Put entities extracted by `eds.dates` in `doc.ents`, instead of `doc.spans['dates']`.
-2. Wait, what's does "draft" mean here ? The rationale is this: we don't want to
-   instantiate the optimizer now, because the nlp object hasn't been `post_init`'ed
-   yet : `post_init` is the operation that looks at some data, finds how many labels the model must learn,
-   and updates the model weights to have as many heads as there are labels. This function will
-   be called by `train`, so the optimizer should be defined *after*, when the model parameter tensors are
-   final. To do that, instead of instantiating the optimizer, we create a "Draft", which will be
-   instantiated inside the `train` function, once all the required parameters are set.
+2. What does "draft" mean here? We'll let the train function pass the nlp object
+   to the optimizer after it has been `post_init`'ed: `post_init` is the operation that
+   looks at some data, finds how many labels the model must learn, and updates the model weights
+   to have as many heads as there are labels observed in the train data. This function will be
+   called by `train`, so the optimizer should be defined *after*, when the model parameter
+   tensors are final. To do that, instead of instantiating the optimizer right now, we create
+   a "Draft", which will be instantiated inside the `train` function, once all the required
+   parameters are set.
 
 And train the model:
 
@@ -309,13 +310,14 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 ```
 
 1. Put entities extracted by `eds.dates` in `doc.ents`, instead of `doc.spans['dates']`.
-2. Wait, what's does "draft" mean here ? The rationale is this: we don't want to
-   instantiate the optimizer now, because the nlp object hasn't been `post_init`'ed
-   yet : `post_init` is the operation that looks at some data, finds how many label the model must learn,
-   and updates the model weights to have as many heads as there are labels. This function will
-   be called by `train`, so the optimizer should be defined *after*, when the model parameter tensors are
-   final. To do that, instead of instantiating the optimizer, we create a "Draft", which will be
-   instantiated inside the `train` function, once all the required parameters are set.
+2. What does "draft" mean here? We'll let the train function pass the nlp object
+   to the optimizer after it has been `post_init`'ed: `post_init` is the operation that
+   looks at some data, finds how many labels the model must learn, and updates the model weights
+   to have as many heads as there are labels observed in the train data. This function will be
+   called by `train`, so the optimizer should be defined *after*, when the model parameter
+   tensors are final. To do that, instead of instantiating the optimizer right now, we create
+   a "Draft", which will be instantiated inside the `train` function, once all the required
+   parameters are set.
 
 
 !!! note "Upstream annotations at training vs inference time"
