Add support for MasakhaPOS Dataset #3247
/cc @dadelani
@alanakbik Please let me know if the dataset name is OK: it does not quite match into the
@stefan-it, the dataset name is MasakhaPOS; the arXiv paper will be out tomorrow.
Thanks @dadelani for the feedback, I have corrected the dataset name now :)
The preprint is now available here 🤗
Hi @helpmefindaname, do you happen to have an idea why https://github.com/flairNLP/flair/actions/runs/5062453919/jobs/9099744358?pr=3247 is failing? This was already the case yesterday; I've just re-run the build, but got the same error.
The dependency resolution took needlessly long, as it tried out all 300+ |
After #3258 I will do a rebase now :)
… currently missing and luo + tsn are missing
@stefan-it thanks for adding this! And thanks @dadelani for creating this dataset!
Hi,
this PR adds support for the recently proposed MasakhaPOS Dataset.
Details can be found in this tweet.
The dataset is available in this repo: https://github.com/masakhane-io/masakhane-pos
I received a preprint of the paper and wrote unit tests that check the number of parsed sentences in each dataset split for each language.
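The idea behind those sentence-count tests can be sketched in plain Python. This is only an illustration, not the test code from this PR: the helper `read_sentences`, the toy data, and the assumed file layout (one whitespace-separated token and tag per line, with blank lines between sentences) are all hypothetical.

```python
from typing import Iterable, List, Tuple

# A sentence is a list of (token, tag) pairs.
Sentence = List[Tuple[str, str]]


def read_sentences(lines: Iterable[str]) -> List[Sentence]:
    """Group column-formatted lines into sentences.

    Assumption: each non-empty line holds a token followed by its POS tag,
    and one or more blank lines separate sentences.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the sentence in progress, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        # First column is the token, last column is the tag.
        current.append((parts[0], parts[-1]))
    if current:
        # The file may end without a trailing blank line.
        sentences.append(current)
    return sentences


# Toy stand-in for one split of one language (not real MasakhaPOS data).
toy_split = """\
Hello INTJ
world NOUN

Good ADJ
morning NOUN
! PUNCT
""".splitlines()

sentences = read_sentences(toy_split)

# A per-split check then compares the parsed count to a known reference value.
assert len(sentences) == 2
```

In a real test, the reference count for each language and split would come from the paper's dataset statistics rather than from toy data.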
Example usage of MasakhaPOS: