Commit 6bcc677

Merge branch 'master' into qlora
janpf authored Jul 15, 2024
2 parents 3d4e785 + 08b45e9 commit 6bcc677
Showing 2 changed files with 2 additions and 2 deletions.
@@ -111,7 +111,7 @@ torch.Size([1536])
torch.Size([9984])
```

-I.e. the size of the embedding increases the mode layers we use (but ONLY if layer_mean is set to False, otherwise the length is always the same).
+I.e. the size of the embedding increases the more layers we use (but ONLY if layer_mean is set to False, otherwise the length is always the same).

(pooling)=
### Pooling operation
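The corrected sentence above describes Flair's layer handling: with layer_mean=False, the vectors from the selected transformer layers are concatenated, so the embedding length grows with the number of layers; with layer_mean=True they are averaged and the length stays fixed. A minimal sketch of that behavior (the model name and layer selection are illustrative, not from this commit):

```python
from flair.data import Sentence
from flair.embeddings import TransformerWordEmbeddings

# concatenate the last four hidden layers instead of averaging them
embeddings = TransformerWordEmbeddings('bert-base-uncased',
                                       layers='-1,-2,-3,-4',
                                       layer_mean=False)

sentence = Sentence('Berlin is a city.')
embeddings.embed(sentence)

# 4 layers x 768 dims = 3072; with layer_mean=True the length would stay 768
print(sentence[0].embedding.shape)
```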
@@ -252,7 +252,7 @@ trainer.train('resources/taggers/example-upos',
max_epochs=10)
```

-This will launch a "standard training run" with SGD as optimizer. By default, the learning rate is annealed against the development score: if fo 3 epochs there is no improvement on the dev split, the learning rate is halved. If this happens too often, the learning rate will fall below a minimal threshold and training stops early.
+This will launch a "standard training run" with SGD as optimizer. By default, the learning rate is annealed against the development score: if for 3 epochs there is no improvement on the dev split, the learning rate is halved. If this happens too often, the learning rate will fall below a minimal threshold and training stops early.

The max_epochs parameter is set to a small number in this script to make it run fast, but normally you should use a much higher value (150 or 200).

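For context on the annealing scheme the fixed sentence describes: in Flair's classic training setup, patience sets how many epochs without dev improvement trigger annealing, anneal_factor does the halving, and min_learning_rate is the early-stop threshold. A sketch with the relevant knobs spelled out (parameter values are illustrative; `tagger` and `corpus` are assumed to be set up as in the surrounding tutorial, and keyword names can vary across Flair versions):

```python
from flair.trainers import ModelTrainer

# `tagger` and `corpus` are assumed from earlier in the tutorial
trainer = ModelTrainer(tagger, corpus)

trainer.train('resources/taggers/example-upos',
              learning_rate=0.1,         # initial SGD learning rate
              anneal_factor=0.5,         # halve the learning rate...
              patience=3,                # ...after 3 epochs with no dev improvement
              min_learning_rate=0.0001,  # stop early once the rate falls below this
              max_epochs=150)            # use 150-200 for a real run
```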