Description
Hello, thank you for the great work.
I used this notebook to run pre-training for the MLM task: https://github.com/shabie/docformer/blob/master/examples/docformer_pl/3_Pre_training_DocFormer_Task_MLM_Task.ipynb
Afterwards, I used the resulting model as the starting point for the token-classification task (using load_from_check_point, which copies all the weights except those of the linear layer, which has a different shape).
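A minimal sketch of the shape-filtered loading described above, with a report of which keys actually transfer. The helper name `partition_state`, the key names, and the toy shapes are hypothetical and only for illustration; this is not the repository's loading code.

```python
from types import SimpleNamespace


def partition_state(checkpoint_state, model_state):
    """Split checkpoint entries by whether their shape matches the target model.

    Returns (loadable, mismatched, missing), where `missing` lists model keys
    that the checkpoint does not contain at all.
    """
    loadable, mismatched = {}, []
    for name, value in checkpoint_state.items():
        if name not in model_state:
            continue  # key absent from the target model (e.g. a different prefix)
        if tuple(value.shape) == tuple(model_state[name].shape):
            loadable[name] = value
        else:
            mismatched.append(name)
    missing = [name for name in model_state if name not in checkpoint_state]
    return loadable, mismatched, missing


# Toy stand-ins: any object with a .shape attribute works, including torch tensors.
pretrained = {
    "encoder.weight": SimpleNamespace(shape=(768, 768)),
    "head.weight": SimpleNamespace(shape=(30522, 768)),
}
target = {
    "encoder.weight": SimpleNamespace(shape=(768, 768)),
    "head.weight": SimpleNamespace(shape=(7, 768)),
    "head.bias": SimpleNamespace(shape=(7,)),
}

loadable, mismatched, missing = partition_state(pretrained, target)
print("loaded:", sorted(loadable))
print("skipped (shape mismatch):", mismatched)
print("not in checkpoint:", missing)
```

With PyTorch, the two arguments would typically be the checkpoint's `"state_dict"` entry (for a Lightning checkpoint) and `model.state_dict()`, and the loadable subset could then be passed to `model.load_state_dict(loadable, strict=False)`. If the loaded list comes back empty or much shorter than expected, the pre-trained weights may not be reaching the model at all.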
The problem is that no matter how long I run the pre-training, I always get the same metrics in the token-classification task (using that pre-trained model as a starting point).
I even tried the model from the document-classification task as a base for token classification, and I still get exactly the same metrics as I was getting with the MLM-pretrained model.
Any suggestions on how to properly use the pre-trained models?