Using pre-trained models

Hello, thank you for the great work. 
I used this notebook to run pre-training for the MLM task: https://github.com/shabie/docformer/blob/master/examples/docformer_pl/3_Pre_training_DocFormer_Task_MLM_Task.ipynb
Afterwards, I used the resulting model in the token-classification task (using load_from_check_point, which copies all the weights except the linear layer, which has a different shape).

The problem is that no matter how long I run the pre-training, I always get the same metrics in the token-classification task (using that pre-trained model as a starting point).

I even tried the model from the document-classification task as a base for token classification, and I still get the exact same metrics as I was getting with the MLM-pretrained model.
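
Identical metrics from different checkpoints could mean the pre-trained weights never reach the token-classification model at all. One way to check is to compare parameter names and shapes between the checkpoint and the model before loading. The sketch below is only an illustration under assumptions: `split_by_compatibility` is a hypothetical helper (not part of the repository), the parameter names and shapes are made up, and small stand-in objects with a `.shape` attribute replace real tensors so the example runs on its own.

```python
from types import SimpleNamespace


def split_by_compatibility(checkpoint_state, model_state):
    """Split checkpoint entries into loadable and skipped ones.

    Both arguments map parameter names to objects exposing `.shape`
    (torch tensors do). An entry is loadable only if the same name
    exists in the model and the shapes match.
    """
    loadable, skipped = {}, {}
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            loadable[name] = value
        else:
            skipped[name] = value
    # Model parameters that would keep their random initialisation.
    missing = [name for name in model_state if name not in loadable]
    return loadable, skipped, missing


def fake(*shape):
    """Stand-in for a tensor: only carries a shape (illustration only)."""
    return SimpleNamespace(shape=shape)


# Hypothetical names and shapes, not taken from the DocFormer code.
checkpoint_state = {
    "encoder.weight": fake(768, 768),
    "linear.weight": fake(30522, 768),  # head sized for the MLM vocabulary
}
model_state = {
    "encoder.weight": fake(768, 768),
    "linear.weight": fake(7, 768),  # head sized for the token classes
}

loadable, skipped, missing = split_by_compatibility(checkpoint_state, model_state)
print("loadable:", sorted(loadable))
print("skipped: ", sorted(skipped))
print("missing: ", missing)
```

With real PyTorch objects, the two mappings would be the checkpoint's state dict and `model.state_dict()`, and the `loadable` part could then be passed to `model.load_state_dict(loadable, strict=False)`. If `loadable` turns out empty or nearly empty, the parameter names probably differ (for example by a wrapper prefix), which would explain getting the same metrics regardless of the checkpoint.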

Any suggestion on how to properly use the pre-trained models?
 


Using pre-trained models #46
