ModuleNotFoundError: No module named 'spacy.lang.tokenizer' #211

@LeJudith

Hi again!

I successfully migrated a pretrained MedCAT v1 UMLS Dutch (v1.10) model to MedCAT v2, following the official migration notebook.
The base migrated model loads and works perfectly fine.

However, after fine-tuning it using annotations from MedCATTrainer and saving the model pack, I can no longer reload it.
When I attempt to load the saved fine-tuned model pack, I get this error:


ModuleNotFoundError: No module named 'spacy.lang.tokenizer'

The above exception was the direct cause of the following exception:

ImportError                               Traceback (most recent call last)
Cell In[5], line 1
...
--> 387             raise ImportError(Errors.E048.format(lang=lang, err=err)) from err
    388     set_lang_class(lang, getattr(module, module.__all__[0]))  # type: ignore[attr-defined]
    389 return registry.languages.get(lang)

ImportError: [E048] Can't import language tokenizer or any matching language from spacy.lang: No module named 'spacy.lang.tokenizer'
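
Reading the message literally, the string `tokenizer` appears to be ending up where a language code such as `nl` or `en` is expected, so the lookup tries to import `spacy.lang.tokenizer`. That is only an inference from the traceback and has not been confirmed against the saved model pack. Below is a minimal sketch of that kind of lookup, using only `importlib`. The helper name `resolve_lang_module` is hypothetical and this is not spaCy's actual implementation; it only mirrors the `raise ImportError(...) from err` pattern visible in the traceback.

```python
import importlib


def resolve_lang_module(lang: str, package: str = "spacy.lang"):
    """Hypothetical helper: import `<package>.<lang>` the way a
    language-code lookup might, chaining the original error."""
    try:
        return importlib.import_module(f"{package}.{lang}")
    except ImportError as err:
        # Mirrors the chained "raise ... from err" seen in the traceback.
        raise ImportError(
            f"Can't import language {lang} from {package}: {err}"
        ) from err


if __name__ == "__main__":
    # "tokenizer" is not a language code, so this lookup fails
    # regardless of which model pack is being loaded.
    try:
        resolve_lang_module("tokenizer")
    except ImportError as exc:
        print(type(exc.__cause__).__name__, "->", exc)
```

If this reading is right, the place to look would be whatever value the saved fine-tuned pack stores for the spaCy model or language name, compared against the same value in the base migrated pack that still loads.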

This issue only occurs with the Dutch model, not with English SNOMED models fine-tuned in exactly the same way.

Is this a known issue, and is there a workaround?

Thanks in advance!