I would greatly appreciate it if UDPipe indicated somehow that a given word form was not present in the morphological lexicon, i.e., that its lemma, PoS and features have been guessed. This type of information is provided, e.g., by TreeTagger, and we make use of it while post-processing the tagger output; we also provide it to corpus users so that they can incorporate the respective attribute into their CQL queries.
Currently it is not straightforward to implement this, because the current UDPipe does not distinguish between a "real" morphological lexicon and guesser rules derived from the training data. (Our MorphoDiTa tool can do it; there we keep this distinction.)
BTW, if you have a morphological dictionary, you can perform the required operation manually after running UDPipe.
Also, the future UDPipe 2.0 will allow explicitly passing a morphological dictionary (during inference, not just during training), so it will then be possible to indicate which words were processed just by a "guesser".
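The manual post-processing mentioned above could be sketched roughly as follows. This is only an illustration under several assumptions that are not part of UDPipe: the tagger output is in CoNLL-U format, the morphological dictionary has already been loaded as a set of lowercased word forms, and the `Guessed=Yes` attribute written to the MISC column is a hypothetical name chosen for this example. The function name `mark_guessed` and the sample data are likewise made up.

```python
def mark_guessed(conllu_text, known_forms, lowercase=True):
    """Add Guessed=Yes to MISC for tokens whose FORM is not in known_forms.

    known_forms is assumed to be a set of word forms taken from your own
    morphological dictionary (lowercased if lowercase=True).
    Guessed=Yes is a hypothetical attribute name, not a UDPipe feature.
    """
    out = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Leave comments, blank lines and malformed lines untouched.
        if line.startswith("#") or len(cols) != 10:
            out.append(line)
            continue
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            out.append(line)
            continue
        form = cols[1].lower() if lowercase else cols[1]
        if form not in known_forms:
            misc = [] if cols[9] == "_" else cols[9].split("|")
            misc.append("Guessed=Yes")
            cols[9] = "|".join(misc)
        out.append("\t".join(cols))
    return "\n".join(out)


# Invented sample: the second token is a nonsense word absent from the lexicon.
sample = "\n".join([
    "# text = Psy blorfujú",
    "1\tPsy\tpes\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\tblorfujú\tblorfovať\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No",
])
lexicon = {"psy", "pes"}

print(mark_guessed(sample, lexicon))
```

Note that this only checks whether the form occurs in the dictionary at all. It cannot tell whether UDPipe's chosen lemma and tag for a known form actually match a dictionary analysis; that would need a lookup on the (form, lemma, tag) triple instead.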
Best,
Vlado B, 10:45
http://unesco.uniba.sk/guest/