make gensim optional #3493
Thanks a lot for this @helpmefindaname. This generally looks good.
I tested loading our standard 'ner' model that I re-serialized and pushed to the hub as "alanakbik/ner-new".
I then tested loading with the following code:
from flair.data import Sentence
from flair.models import SequenceTagger
model = SequenceTagger.load("alanakbik/ner-new")
sentence = Sentence("Bill was born in New York")
model.predict(sentence)
print(sentence)
That works, but only for newer flair versions. It starts breaking down from Flair 0.10.0.
To reproduce, in a fresh env, do:
pip install flair==0.10.0
and run the above code.
This throws the error:
AttributeError: 'dict' object has no attribute 'embedding_length'
I think this actually has nothing directly to do with this PR, but affects all new models that were trained with newer Flair versions and are loaded with older ones.
But to deploy this PR, we'd need to update all models. So if people are still using an old version of Flair, the regular 'ner' model would no longer work.
@helpmefindaname can you take a look if backward compatibility can be improved?
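One possible reading of the AttributeError above, sketched in plain Python: it assumes the newer checkpoint stores the embeddings as a plain dict of parameters, while the older loader expects a live object with an `embedding_length` attribute. The class and function names here are hypothetical and are not taken from Flair's code.

```python
# Toy illustration only: these names are hypothetical and are not Flair's code.


class ToyEmbeddings:
    """Stands in for an embeddings object that exposes `embedding_length`."""

    def __init__(self, embedding_length: int):
        self.embedding_length = embedding_length


def old_style_init(state: dict) -> int:
    """Mimics a loader that expects a live object under state["embeddings"]."""
    return state["embeddings"].embedding_length


# Assumed older layout: the embeddings object itself is stored in the state.
old_format = {"embeddings": ToyEmbeddings(embedding_length=100)}
# Assumed newer layout: only a dict of parameters is stored.
new_format = {"embeddings": {"embedding_length": 100}}

print(old_style_init(old_format))  # attribute access on the object works

try:
    old_style_init(new_format)
except AttributeError as err:
    # A dict has keys, not attributes, so the old access pattern fails.
    print(err)  # 'dict' object has no attribute 'embedding_length'
```

If this reading holds, the failure depends on the format of the saved model rather than on gensim, which would match the observation that it is not directly related to this PR.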
@helpmefindaname thanks a lot for adding this! I will now update all models on HF.
The pip command recommended by the gensim import error message didn't work out of the box for me. I use Terminal and zsh on macOS. I had to put quotes around "flair[word-embeddings]" when using pip install.
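For context on the quoting issue: zsh treats square brackets as a glob pattern, so an unquoted extras specifier can be rejected by the shell before pip ever sees it. A minimal sketch of two ways around this from Python, using only the standard library. The requirement string comes from the PR description; the rest is illustrative.

```python
import shlex
import sys

# Extras specifier as named in the PR description.
requirement = "flair[word-embeddings]"

# Argument-list form: no shell is involved, so the brackets are passed
# to pip unchanged. To install, run: subprocess.run(argv, check=True)
argv = [sys.executable, "-m", "pip", "install", requirement]

# Shell-safe string form, e.g. for an error message meant to be copy-pasted.
print("pip install " + shlex.quote(requirement))
# prints: pip install 'flair[word-embeddings]'
```

Single or double quotes both keep the shell from expanding the brackets. An install hint in an error message that includes the quotes should work in bash and zsh alike.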
Closes #3482
This requires all public models to be in serialized format, so they can be loaded without attempting to load gensim.

Now you can use WordEmbeddings and BytePairEmbeddings in inference without having gensim/bpe installed. FasttextEmbeddings and MuseEmbeddings will only work when gensim is installed. I find this justifiable, as those embeddings are not commonly used anymore.

When instantiating new WordEmbeddings or BytePairEmbeddings, gensim/bpe is required. You can install them with pip install flair[word-embeddings] after the next release, or with pip install -e .[word-embeddings] when developing.