
add BINDER model #3303 (Open)

wants to merge 2 commits into master
Conversation

whoisjones
Member

Closes #3292.

This PR adds the BINDER model (paper); training can be run with the following script:

import flair
from flair.datasets import CONLL_03
from flair.embeddings import TransformerWordEmbeddings, TransformerDocumentEmbeddings
from flair.models import BinderModel
from flair.trainers import ModelTrainer

flair.set_seed(42)

# 1. load the corpus and downsample for this demo
corpus = CONLL_03().downsample(0.2)
print(corpus)

# 2. define the label type and create the label dictionary
label_type = 'ner'
label_dict = corpus.make_label_dictionary(label_type=label_type, add_unk=False)
print(label_dict)

# 3. initialize token embeddings
token_embeddings = TransformerWordEmbeddings(
    model='bert-base-uncased',
    fine_tune=True,
    subtoken_pooling="first_last",
    is_document_embedding=True
)

# 4. initialize label embeddings
label_embeddings = TransformerDocumentEmbeddings(model='bert-base-uncased', fine_tune=True)

# 5. initialize the BINDER model
model = BinderModel(
    token_encoder=token_embeddings,
    label_encoder=label_embeddings,
    label_dictionary=label_dict,
    label_type=label_type
)

# 6. initialize trainer
trainer = ModelTrainer(model, corpus)

# 7. run fine-tuning
trainer.fine_tune('resources/taggers/binder-conll',
                  learning_rate=5.0e-5,
                  mini_batch_size=4,
                  mini_batch_chunk_size=4,  # remove this parameter to speed up computation if you have a big GPU
                  )

I still have the following hurdles compared to the original implementation:

  • Slower training: one epoch on CoNLL takes about 6 minutes, versus roughly 1-2 minutes for the original implementation, and I have not found the bottleneck yet. One possible reason: the original implementation uses the model's entire sequence length by concatenating sentences into one long string, whereas I compute the start/end logic of spans in the model rather than in the dataset.
  • Not integrated into DefaultClassifier, which would be nice but is not possible due to the different constructors.
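For readers unfamiliar with the architecture (and why it needs two encoders in its constructor): BINDER is a bi-encoder that encodes candidate spans and entity-type descriptions separately and scores them by cosine similarity. A rough sketch of that scoring step, as an illustration only in plain PyTorch (not this PR's code; all names are invented):

```python
import torch
import torch.nn.functional as F

def score_spans(span_starts, span_ends, token_embs, label_embs, proj):
    """Score every candidate span against every label embedding."""
    # a span is represented by concatenating its start and end token embeddings
    span_reprs = torch.cat([token_embs[span_starts], token_embs[span_ends]], dim=-1)
    # project into the label space and L2-normalize both sides,
    # so the dot product below is a cosine similarity
    spans = F.normalize(proj(span_reprs), dim=-1)   # (num_spans, dim)
    labels = F.normalize(label_embs, dim=-1)        # (num_labels, dim)
    return spans @ labels.T                         # (num_spans, num_labels)

dim = 8
proj = torch.nn.Linear(2 * dim, dim)
token_embs = torch.randn(10, dim)  # one embedding per token
label_embs = torch.randn(3, dim)   # one embedding per entity-type description
scores = score_spans(torch.tensor([0, 4]), torch.tensor([1, 6]),
                     token_embs, label_embs, proj)  # shape (2, 3)
```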


stale bot commented Mar 17, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

The stale bot added the wontfix label on Mar 17, 2024.