
bezokurepo/models


General information on how to run the dependency parsers and how to interpret the benchmark data in each model card.

All publicly released models from bezoku labs are posted in this repo, unless the weights exceed 25 MB.

For German and any other files that exceed the GitHub file-size limit, email ian.gilmour@bezoku.ai for a copy.

OpenVINO runners and related files are a work in progress. If you cannot get a model to run using the tools provided (e.g. Apple silicon Macs and macOS are not supported at present), please contact us for support.

Minimum requirements for inference: torch>=2.0.0, openvino>=2023.0.0, numpy>=1.24.0
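Before running inference, it can help to confirm the environment meets these minimums. The sketch below is illustrative and not part of the bezoku tooling; `meets_minimum` and `check_environment` are hypothetical helper names, and the version comparison assumes simple numeric `x.y.z` versions:

```python
# Minimal sketch: check installed package versions against the minimums above.
from importlib.metadata import version, PackageNotFoundError

MINIMUMS = {"torch": "2.0.0", "openvino": "2023.0.0", "numpy": "1.24.0"}

def parse_version(v: str) -> tuple:
    """Turn '2023.0.0' into (2023, 0, 0) for tuple comparison."""
    return tuple(int(part) for part in v.split(".")[:3] if part.isdigit())

def meets_minimum(installed: str, required: str) -> bool:
    """True if an installed version satisfies the required minimum."""
    return parse_version(installed) >= parse_version(required)

def check_environment() -> dict:
    """Report which minimum requirements are satisfied in this environment."""
    report = {}
    for pkg, minimum in MINIMUMS.items():
        try:
            report[pkg] = meets_minimum(version(pkg), minimum)
        except PackageNotFoundError:
            report[pkg] = False  # package not installed at all
    return report
```

`check_environment()` returns a dict like `{"torch": True, "openvino": False, ...}` that you can inspect before loading a model.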

The up-to-date list of supported languages can be found here - https://github.com/bezokurepo/language-list

How to understand benchmarks in model cards

UPOS Accuracy: (Universal Part of Speech) How well the model predicts UPOS tags (e.g. NOUN, VERB), the 17 universal part-of-speech categories that are consistent regardless of language.

XPOS Accuracy: (language-specific Part of Speech) How well the model predicts language-specific part-of-speech tags. This is harder for inflected, agglutinative and similarly complex languages, because gold-standard annotated data is needed to record the syntactic information the model must learn.

DEPREL Accuracy: (Dependency Relation) How well the model predicts the dependency relation label for each token. This syntactic metric depends on the HEAD being correctly predicted.

FEATS Accuracy: (Morphological Features) How well the model predicts the morphological features of each token, such as tense, gender, number and case. FEATS is crucial for bezoku models' syntactic and head prediction performance.

HEAD UAS Accuracy: (Unlabeled Attachment Score) The proportion of tokens assigned the correct head (parent) node. If HEAD prediction performs poorly, by definition the dependency parser cannot function effectively.

LAS: (Labeled Attachment Score) The proportion of tokens assigned both the correct HEAD and the correct dependency label. Because LAS requires the head and the label to be correct, it is never higher than UAS.
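The attachment scores above can be sketched in a few lines of Python, assuming gold and predicted parses are given as token-aligned lists of (head, deprel) pairs. The function name is illustrative, not part of the bezoku evaluation code:

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) for token-aligned lists of (head, deprel) pairs.

    UAS counts tokens with the correct head; LAS additionally requires the
    correct dependency label, so LAS <= UAS by construction.
    """
    assert len(gold) == len(pred), "parses must be token-aligned"
    uas_hits = sum(1 for (gh, _), (ph, _) in zip(gold, pred) if gh == ph)
    las_hits = sum(1 for g, p in zip(gold, pred) if g == p)
    n = len(gold)
    return uas_hits / n, las_hits / n

# Example: a 4-token sentence where one head and one label are wrong.
gold = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "det"), (2, "nmod")]
uas, las = attachment_scores(gold, pred)  # UAS = 0.75, LAS = 0.5
```

Note how the token with the wrong head lowers both scores, while the token with the right head but wrong label lowers only LAS.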

Learn more about compute

Visit DENVR dataworks, where all our models are run - https://www.denvr.com/

deprecation

Regular maintenance and updates will be carried out. Deprecation rules are a work in progress; keep in touch if you are forking any models.

feedback

If you need support in training our low-resource and indigenous-language dependency parsers for specific tasks (for example, sentiment analysis, grammar checking, spell checking, domain-specific question answering, document and internet search), or with deployment via OpenVINO (inference through an API or on device), visit https://www.bezoku.tech/service-01

training data

You can find the original CoNLL-U files here -> https://github.com/UniversalDependencies, many of which have already been forked to bezokurepo. See the general license.txt and the license.txt for each model; these may vary depending on the separate licensing / attribution for each corpus. We do our best to keep licensing accurate, including any references. If you spot any mistakes in attribution, report them to ian.gilmour@bezoku.ai

roadmap

Our roadmap for annotating data can be found here -> https://github.com/bezokurepo/data
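For reference, a minimal sketch of reading tokens out of a CoNLL-U sentence, as found in the Universal Dependencies files linked above. The field order follows the CoNLL-U format (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC); the function name and example sentence are illustrative only, and for real work a dedicated library is a better choice:

```python
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]

def parse_conllu_sentence(text):
    """Parse one CoNLL-U sentence block into a list of token dicts.

    Comment lines (#), multiword-token ranges (e.g. '1-2') and empty
    nodes (e.g. '1.1') are skipped.
    """
    tokens = []
    for line in text.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue
        tokens.append(dict(zip(FIELDS, cols)))
    return tokens

# Hypothetical two-token sentence in CoNLL-U format.
sentence = ("# text = Dogs bark\n"
            "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
            "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\t_")
tokens = parse_conllu_sentence(sentence)
```

Each token dict then exposes the same fields the benchmarks score: `tokens[0]["upos"]`, `tokens[0]["head"]`, `tokens[0]["deprel"]` and so on.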

About

Public models from bezoku labs. Compute from DENVR dataworks.
