docs/tutorials/training-ner.md
+19 −4 (19 additions, 4 deletions)
@@ -115,7 +115,7 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 
 # 🎛️ OPTIMIZER
 optimizer:
-  "@core": optimizer
+  "@core": optimizer !draft # (2)!
   optim: adamw
   groups:
     # Assign parameters starting with transformer (ie the parameters of the transformer component)
@@ -133,7 +133,6 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
       "warmup_rate": 0.1
       "start_value": 3e-4
       "max_value": 3e-4
-  module: ${ nlp }
   total_steps: ${ train.max_steps }
 
 # 📚 DATA
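Taken together, these two hunks replace an eagerly-built optimizer (which needed `module: ${ nlp }` up front) with a deferred one: the `!draft` tag postpones instantiation until `train` can supply the `post_init`'ed module. As a rough mental model, here is a minimal, self-contained Python sketch of the deferred-instantiation pattern; the `Draft` class, `instantiate` method, and `DummyOptimizer` stand-in are illustrative, not Confit's actual API:

```python
class Draft:
    """Captures a class plus the constructor arguments known so far."""

    def __init__(self, cls, **partial_kwargs):
        self.cls = cls
        self.partial_kwargs = partial_kwargs

    def instantiate(self, **missing_kwargs):
        # Finish the constructor call once the remaining arguments
        # (here, the post_init'ed module) become available.
        return self.cls(**{**self.partial_kwargs, **missing_kwargs})


class DummyOptimizer:
    """Stand-in for ScheduledOptimizer, just to make the sketch runnable."""

    def __init__(self, optim, total_steps, module):
        self.optim, self.total_steps, self.module = optim, total_steps, module


# At config-parsing time, `module` is not known yet:
draft = Draft(DummyOptimizer, optim="adamw", total_steps=2000)

# Later, inside `train`, after the pipeline has been post_init'ed:
optimizer = draft.instantiate(module="<the post_init'ed nlp object>")
print(type(optimizer).__name__, optimizer.total_steps)  # DummyOptimizer 2000
```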
@@ -216,6 +215,14 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 1. Why do we use `'@core': pipeline` here? Because we need the reference used in `optimizer.module = ${ nlp }` to be the actual Pipeline and not its keyword arguments: when Confit sees `'@core': pipeline`, it instantiates the `Pipeline` class with the arguments provided in the dict.
 
    In fact, you could also use `'@core': eds.pipeline` in every config where you define a pipeline, but it is sometimes more convenient to let Confit infer the type of the `nlp` argument from the function's type hints. Not specifying `'@core': pipeline` is also more aligned with `spacy`'s pipeline config API. However, in general, explicit is better than implicit, so feel free to explicitly write `'@core': eds.pipeline` when you define a pipeline.
 
+2. What does "draft" mean here? We'll let the `train` function pass the `nlp` object
+   to the optimizer after it has been `post_init`'ed: `post_init` is the operation that
+   looks at some data, finds how many labels the model must learn, and updates the model
+   weights to have as many heads as there are labels observed in the train data. This
+   function will be called by `train`, so the optimizer should be created *after* it,
+   when the model parameter tensors are final. To do that, instead of instantiating the
+   optimizer right now, we create a "Draft", which will be instantiated inside the
+   `train` function, once all the required parameters are set.
 
 To train the model, you can use the following command:
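Annotation 1 above describes how Confit instantiates the `Pipeline` when it sees `'@core': pipeline`. Here is a conceptual sketch of that registry mechanism; the `REGISTRY` dict, `register` decorator, and tuple stand-in are illustrative, not Confit's actual implementation:

```python
# Conceptual sketch of "@core"-style resolution (not Confit's actual code).
REGISTRY = {}

def register(name):
    def decorator(factory):
        REGISTRY[name] = factory
        return factory
    return decorator

@register("pipeline")
def make_pipeline(**kwargs):
    # Stand-in for instantiating the real Pipeline class.
    return ("Pipeline", kwargs)

def resolve(cfg):
    # Recursively resolve nested dicts, replacing any dict that
    # carries an "@core" key with the output of its factory.
    if isinstance(cfg, dict):
        cfg = {k: resolve(v) for k, v in cfg.items()}
        if "@core" in cfg:
            return REGISTRY[cfg.pop("@core")](**cfg)
    return cfg

nlp = resolve({"@core": "pipeline", "lang": "eds", "components": {}})
print(nlp)  # ('Pipeline', {'lang': 'eds', 'components': {}})
```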
@@ -277,9 +284,8 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 
 # 🎛️ OPTIMIZER
 max_steps = 2000
-optimizer = ScheduledOptimizer(
+optimizer = ScheduledOptimizer.draft(  # (1)!
     optim=torch.optim.Adam,
-    module=nlp,
     total_steps=max_steps,
     groups=[
         {
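The Python API change mirrors the config change: `ScheduledOptimizer.draft(...)` omits `module=nlp` because the parameter tensors are not final yet. The toy module below (purely illustrative, not edsnlp's actual implementation) shows why an optimizer built before `post_init` would miss parameters:

```python
import torch

class ToyNER(torch.nn.Module):
    """Toy illustration: the classification head cannot be allocated
    until post_init has seen the labels in the training data."""

    def __init__(self, hidden_size=32):
        super().__init__()
        self.encoder = torch.nn.Linear(hidden_size, hidden_size)
        self.head = None  # sized later, once the labels are known

    def post_init(self, docs):
        # Discover the label set from the data, then size the head.
        labels = sorted({label for doc in docs for label in doc})
        self.head = torch.nn.Linear(self.encoder.out_features, len(labels))

model = ToyNER()
before = sum(p.numel() for p in model.parameters())
model.post_init([["DATE", "DRUG"], ["DOSE", "DRUG"]])
after = sum(p.numel() for p in model.parameters())
print(before, after)  # the head's weights exist only after post_init,
                      # so an optimizer created earlier would not see them
```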
@@ -333,6 +339,15 @@ Visit the [`edsnlp.train` documentation][edsnlp.training.trainer.train] for a li
 )
 ```
 
+
+1. Wait, what does "draft" mean here? We'll let the `train` function pass the `nlp` object
+   to the optimizer after it has been `post_init`'ed: `post_init` is the operation that
+   looks at some data, finds how many labels the model must learn, and updates the model
+   weights to have as many heads as there are labels observed in the train data. This
+   function will be called by `train`, so the optimizer should be created *after* it,
+   when the model parameter tensors are final. To do that, instead of instantiating the
+   optimizer right now, we create a "Draft", which will be instantiated inside the
+   `train` function, once all the required parameters are set.
0 commit comments