Making gradient clipping optional & max gradient norm variable #1929
Annotations
3 errors
test:
flair/trainers/trainer.py#L341
ruff
pytest_ruff.RuffError: flair/trainers/trainer.py:348:56: W291 [*] Trailing whitespace
|
346 | monitor_train_sample: Set this to evaluate on a sample of the train data at the end of each epoch.
347 | If you set an int, it will sample this many sentences to evaluate on. If you set a float, it will sample
348 | a percentage of data points from train.
| ^^^^^^^^^^^^^^^^ W291
349 | max_grad_norm (Optional[float]): If not None, gradients are clipped to this value before an optimizer.step is
350 | called.
|
= help: Remove trailing whitespace
flair/trainers/trainer.py:350:24: W291 [*] Trailing whitespace
|
348 | a percentage of data points from train.
349 | max_grad_norm (Optional[float]): If not None, gradients are clipped to this value before an optimizer.step is
350 | called.
| ^^^^ W291
351 | use_final_model_for_eval (bool): If True, the final model is used for the final evaluation. If False, the
352 | model from the best epoch as determined by main_evaluation_metric is used for the final evaluation.
|
= help: Remove trailing whitespace
flair/trainers/trainer.py:599:52: W291 [*] Trailing whitespace
|
598 | # do the optimizer step
599 | scaler.unscale_(self.optimizer)
| ^^^^^^^^^^^^^^^^^^^^ W291
600 | if max_grad_norm is not None:
601 | torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_grad_norm)
|
= help: Remove trailing whitespace
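
The three W291 hits above are marked [*], meaning ruff can strip the trailing whitespace automatically via its --fix option. For orientation, here is a hedged sketch of the parameter those docstring lines describe; only the Args entry is taken from the output above, while the method name, surrounding signature, and default value are assumptions for illustration:

    from typing import Optional

    def train_custom(
        self,
        # ... other training arguments elided for brevity ...
        max_grad_norm: Optional[float] = 5.0,  # assumed default; None disables clipping
        **kwargs,
    ):
        """Train the model (illustrative summary line, not from the PR).

        Args:
            max_grad_norm (Optional[float]): If not None, gradients are clipped to this value
                before an optimizer.step is called.
        """
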
test:
flair/trainers/trainer.py#L1
Black format check
--- /home/runner/work/flair/flair/flair/trainers/trainer.py 2023-08-08 14:46:53.706293 +0000
+++ /home/runner/work/flair/flair/flair/trainers/trainer.py 2023-08-08 14:52:06.796250 +0000
@@ -343,13 +343,13 @@
train_with_test (bool): If True, the data from test split is added to the training data
main_evaluation_metric: The metric to optimize (often micro-average or macro-average F1-score, or accuracy)
monitor_test (bool): If True, test data is evaluated at end of each epoch
monitor_train_sample: Set this to evaluate on a sample of the train data at the end of each epoch.
If you set an int, it will sample this many sentences to evaluate on. If you set a float, it will sample
- a percentage of data points from train.
+ a percentage of data points from train.
max_grad_norm (Optional[float]): If not None, gradients are clipped to this value before an optimizer.step is
- called.
+ called.
use_final_model_for_eval (bool): If True, the final model is used for the final evaluation. If False, the
model from the best epoch as determined by main_evaluation_metric is used for the final evaluation.
gold_label_dictionary_for_eval: Set to force evaluation to use a particular label dictionary
exclude_labels: Optionally define a list of labels to exclude from the evaluation
sampler: You can pass a data sampler here for special sampling of data.
@@ -594,11 +594,11 @@
store_embeddings(batch_step, embeddings_storage_mode, dynamic_embeddings)
self.dispatch("before_training_optimizer_step", **batch_kw)
# do the optimizer step
- scaler.unscale_(self.optimizer)
+ scaler.unscale_(self.optimizer)
if max_grad_norm is not None:
torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_grad_norm)
scale_before = scaler.get_scale()
scaler.step(self.optimizer)
scaler.update()
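
Formatting noise aside, the diff shows the substance of the change: gradients are always unscaled before the optimizer step, but clipping only runs when max_grad_norm is not None. A minimal, self-contained sketch of that pattern under torch.cuda.amp (the helper name and the setup around it are illustrative, not taken from the Flair trainer):

    from typing import Optional

    import torch

    def amp_optimizer_step(
        model: torch.nn.Module,
        optimizer: torch.optim.Optimizer,
        scaler: torch.cuda.amp.GradScaler,
        loss: torch.Tensor,
        max_grad_norm: Optional[float] = None,
    ) -> bool:
        # Backpropagate through the scaled loss.
        scaler.scale(loss).backward()
        # Unscale first so the clipping threshold applies to the true gradient norm.
        scaler.unscale_(optimizer)
        # Clipping is optional: passing max_grad_norm=None skips it entirely.
        if max_grad_norm is not None:
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
        scale_before = scaler.get_scale()
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad()
        # If the scale shrank, the step was skipped due to inf/nan gradients; the trainer
        # excerpt above also records scale_before, presumably for a similar check.
        return scaler.get_scale() < scale_before

For example, amp_optimizer_step(model, optimizer, scaler, loss, max_grad_norm=5.0) clips the gradient norm to 5.0, while leaving max_grad_norm=None performs a plain unclipped step.
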
test
Process completed with exit code 1.