The bokmål mode uses many non-bokmal words #1
Comments
Hei! Thanks for the report -- this is definitely an ongoing concern :) The word list comes from Nasjonalbibliotekets ordbank, which contains every possible word -- including ones that see little (or no) actual usage. They are, however, technically correct, hehe. Personally, I see this as a good thing, as I discover words that are new to me -- and sometimes to my Norwegian friends, too! It makes me think about the construction of the language and how the word is valid, what its roots are, etc. I wouldn't be opposed to adding a "bokmål (simplified)" dictionary that was limited to, say, the top 1,000 words or something. However, I haven't found such a list (note: we need a sufficiently large corpus after the 5-letter restriction is applied). If you're able to find one, please let me know! Regarding #4081 (OIETE) - it actually is a word, and can be found referenced in UIB's dictionary. Specifically, it appears to be a slang interjection of sorts. Perhaps I should switch dictionaries, instead 😉
I think that the key to Wordle's success is that it has a manually curated list of possible answers. This is clearly a deliberate choice, since the list of words the interface will accept is much larger. Otherwise you get a lot of "scrabble words", the sort that are used more in word games like this than in their original contexts. Learning new words is all well and good, but if your clue is xOUNS you'd feel a little cheated if the answer was louns, bouns or touns rather than nouns.
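To make the two-list idea above concrete, here is a minimal sketch. It is not Wordle's or this app's actual code; the function names and the tiny word sets are hypothetical and only illustrate the split between a small curated answer list and a larger list of accepted guesses.

```python
import random

def pick_answer(curated_answers, rng=random):
    """Pick the target word from the small, hand-curated list only."""
    # Sort first so the choice is reproducible when a seeded rng is passed in.
    return rng.choice(sorted(curated_answers))

def is_valid_guess(guess, accepted_words):
    """Accept any guess found in the larger dictionary, obscure words included."""
    return guess.lower() in accepted_words

# Hypothetical data: every word can be guessed, but only "nouns" can be the answer.
accepted = {"nouns", "louns", "bouns", "touns"}
curated = {"nouns"}

answer = pick_answer(curated)
print(answer, is_valid_guess("LOUNS", accepted), is_valid_guess("zzzzz", accepted))
```

With this split, obscure words remain playable as guesses without ever being chosen as the solution.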
Wordle does have a smaller dictionary of well-known words that can be the target; it was hand-crafted:
We'd need a very bored native speaker to do this, which is a problem. I was thinking in the shower 🚿 about what a good way to do this might be. We could use Google Books Ngram to score each word based on its use in literature, but they don't have a Norwegian dataset. 😕
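A rough sketch of the frequency-scoring idea, assuming some plain-text Norwegian corpus were available (none is identified in this thread). The function name, parameters, and sample sentence are hypothetical; the point is only to show how counting occurrences could trim a large word list down to commonly used 5-letter words.

```python
import re
from collections import Counter

# Matches runs of lowercase letters, including the Norwegian æ, ø and å.
WORD_RE = re.compile(r"[a-zæøå]+")

def rank_by_frequency(candidates, corpus_text, word_length=5, top_n=1000):
    """Return up to top_n candidates of the given length, most frequent first."""
    counts = Counter(WORD_RE.findall(corpus_text.lower()))
    eligible = {w.lower() for w in candidates if len(w) == word_length}
    # Drop words that never occur in the corpus; break ties alphabetically.
    seen = [w for w in eligible if counts[w] > 0]
    seen.sort(key=lambda w: (-counts[w], w))
    return seen[:top_n]

# Toy stand-in for a real corpus (assumption: real input would be far larger).
corpus = "Huset står ved vannet. Huset er gammelt, og vannet er kaldt. Huset!"
print(rank_by_frequency(["huset", "oiete", "kaldt", "vannet"], corpus))
```

Words that never appear in the corpus drop out entirely, so the quality of the result depends on how representative the corpus is.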
Thanks for the lively debate, everyone! That's one part of the problem, @brackendawson -- we don't have a good dataset. However, neither did Wordle. It would be [relatively] straightforward for someone to round up a few native Norwegian speakers and ask them to go through the list and get it down to a set of "real" words. If someone does such a thing, please let me know and I will happily add a "bokmål (simplified)" (e.g.) option to the language picker in the app! This brings me back to my personal viewpoint: I like the uncommon/rare/obscure words. I didn't build this game to achieve mass-market popularity (exhibit A: I didn't buy ordle.no even though it was available when I launched this app). Instead, I built it to discover new (to me!) Norwegian words. Every time I get presented with a word that isn't in the dictionary, I like the process of searching for it on Google to find what it could mean, where it came from, and so forth. As long as I learn something, it's a success to me 😃
I love the game; I too have been using it daily to improve my vocab. However, after selecting bokmål it sometimes chooses words from dialects, such as game 4081 (oiete), which isn't even in the linked dictionary.
It's definitely useful to learn dialects, but they are above my level for now. I'm wondering whether bokmål mode is using as good a word list as it could.