Using on Windows: IndexTableReadingException #56

Open
Mukhammadsaid19 opened this issue May 18, 2022 · 10 comments

@Mukhammadsaid19

When I build hfst-ospell on Windows and use it with FST models, it throws IndexTableReadingException in the function void IndexTable::read(FILE *f, TransitionTableIndex number_of_table_entries):

image

The same bug persists when using the pre-compiled executable listed on the Apertium website:

image

The lexicon and error model files work well on Ubuntu Linux. What might be the problem?

@snomos
Member

snomos commented May 18, 2022

Not an answer to your question, but possibly a working alternative: https://github.com/divvun/divvunspell. It is a Rust implementation of hfst-ospell, and considerably faster.

@TinoDidriksen
Member

The native Windows binaries are not a priority because WSL works so perfectly, so nobody tests on native Windows.

@Mukhammadsaid19
Author

I would like to use it inside C# (for a VSTO Microsoft Word add-in), and I planned to bind it through a DLL file. Initially I wanted to use voikko, but it has many features that differ from the Uzbek language, so I decided to start from scratch.

@snomos Interesting, I will check it out, thank you!
@TinoDidriksen Anyone used hfst-ospell and its dependents for MS Office Add-Ins?

@TinoDidriksen
Member

Anyone used hfst-ospell and its dependents for MS Office Add-Ins?

Yes, I do that. Divvun also does that. But it's not the correct way any longer. VSTO extensions are headed to the scrap heap because they can't run on macOS, iPad, or web editions. Instead we have moved to Office.js add-ins that work cross-platform:

What language are you trying to add a checker for?

@Mukhammadsaid19
Author

Initially, I tried to make an Office.js add-in in Angular, but its API was a little restricted (I couldn't draw red lines using Windows Forms), so I decided to stick with C#. I remember that there was a .js WebAssembly build of voikko. Hm... I will definitely check out the spellcheckers you suggested. Perhaps I am not on the right track.

What language are you trying to add a checker for?

The language I want to add is Uzbek, an agglutinative language from the Turkic family with 36 million speakers. It is similar to Turkish, but with simpler morphophonemics. I used foma and hfst to compile the morphological analyzer; it recognizes around 99% of Uzbek words. In fact, there are many Turkic languages that don't have reliable spellcheckers: Kazakh, Kyrgyz, Turkmen, Uyghur, etc.

P.S. I recently made a simple soft keyboard for Uzbek called Tahrirchi; I used hfst-ospell and added a couple of algorithms to handle mobile input. However, it turned out to be a much more difficult task than I expected, given its low-memory requirements and the abundance of features offered by GBoard or Samsung keyboards (they also use FSTs, but in the context of HMMs). Have you happened to work with spellchecking in soft keyboards?

@TinoDidriksen
Member

We are painfully aware that Office.js is limited (and I've reported it upstream, twice), but it's still the only future-proof and cross-platform solution.

Divvun also makes keyboards. There's a whole pipeline for turning an FST into spellers, keyboards, and prediction, all for both desktop and mobile. @snomos can point you at docs. As for Uzbek, you may also be interested in https://github.com/apertium/apertium-uzb

Btw, we are on IRC on irc.oftc.net channels #hfst and #apertium

@snomos
Member

snomos commented May 18, 2022

https://github.com/divvun/kbdgen2 takes a (relatively) simple yaml file (wrapped in a bundle with some metadata) as input, and produces keyboard packages for iOS, Android, Linux, Windows, macOS and ChromeOS. For iOS and Android, the keyboards can be bundled with Hfst-based spellers. kbdgen2 is still work in progress. Other repos relevant to keyboards and spelling checkers are:

@Mukhammadsaid19
Author

@TinoDidriksen I have checked out the Office apps you suggested, and I think I would also go with that UI/UX. However, Internet coverage in Uzbekistan is poor; not everyone has constant access to it. So I was thinking about creating a WebAssembly version of hfst-ospell, integrating it into the extension, and making it free and offline. Do you think that is OK?

@snomos Thank you! I will definitely check it out!

@TinoDidriksen
Member

The nightly Windows builds got updated last week, so check whether the latest binary still throws.

@jzr-supove

I was facing the same issue. In my case, opening the file in "rb" mode instead of "r" solved the problem.
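
For anyone hitting this later, here is a minimal C++ sketch of why the fopen mode can matter. On Windows, a stream opened in text mode ("r") translates CR LF pairs to LF and treats a Ctrl-Z byte (0x1A) as end of file, so a binary FST file can come back shorter than expected. A short read like that is one plausible way for a table reader to fail, though I have not confirmed that this is what happens inside IndexTable::read. The helper names read_bytes and write_sample and the file name are made up for illustration and are not part of hfst-ospell.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (not part of hfst-ospell): read up to `count` bytes
// from `path` using the given fopen mode and return the bytes actually read.
std::vector<unsigned char> read_bytes(const char* path, const char* mode,
                                      std::size_t count) {
    std::vector<unsigned char> buf(count);
    std::FILE* f = std::fopen(path, mode);
    if (!f) {
        return {};  // could not open the file
    }
    // fread returns the number of items read; with item size 1 that is bytes.
    std::size_t got = std::fread(buf.data(), 1, count, f);
    std::fclose(f);
    buf.resize(got);  // shrink to what was really read (may be < count)
    return buf;
}

// Hypothetical helper: write `data` to `path` in binary mode so that the
// bytes on disk are exactly the bytes in the vector.
bool write_sample(const char* path, const std::vector<unsigned char>& data) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) {
        return false;
    }
    std::size_t put = std::fwrite(data.data(), 1, data.size(), f);
    // Close first, then check that everything was written.
    return std::fclose(f) == 0 && put == data.size();
}
```

Usage idea: write a few bytes that include 0x0D 0x0A and 0x1A with write_sample, then compare read_bytes(path, "rb", n) against read_bytes(path, "r", n). On Linux both modes return the same bytes, which matches the report that the same files work there; on Windows the "r" result is typically shorter or altered, while "rb" returns the data unchanged.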
