by creamyhorror
601 days ago
Yep, 75% coverage is too low for significant comprehension. You normally need 95% for decent comprehension and 98% for comfortable reading. In Japanese (my target language), it seems you need roughly the most frequent 15,000 words (depending on how you define a word) for 98% coverage. At 12,000 words it becomes viable to read with some comprehension and semi-frequent dictionary lookups. Also, interestingly, you need about 2x as many words in Japanese as in English to reach 87% coverage: "It has been reported that 2,000 high-frequent English words cover 87% of tokens (Nation, 1990). In case of Japanese, 4,024 SUWs are required to cover 87% of tokens." (Text Readability and Word Distribution in Japanese, Satoshi Sato)
You might wanna check out this analysis I did last week: https://cij-analysis.streamlit.app/
I do a little bit of Japanese word coverage analysis in it, among other things.
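
For anyone wondering how coverage numbers like these are computed, here's a minimal sketch in Python. It assumes you already have a tokenized corpus as a list of word strings (tokenizing Japanese is its own problem, and the word definition you pick changes the counts). The function names are made up for illustration; this is not the method from the paper or from the linked app, just the basic cumulative-frequency idea.

```python
from collections import Counter
from typing import Iterable


def coverage_of_top_n(tokens: Iterable[str], n: int) -> float:
    """Fraction of all tokens accounted for by the n most frequent words."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tokens given")
    # most_common(n) returns (word, frequency) pairs, highest frequency first.
    return sum(freq for _, freq in counts.most_common(n)) / total


def words_needed_for_coverage(tokens: Iterable[str], target: float) -> int:
    """Smallest number of top-frequency words whose tokens reach `target` coverage."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tokens given")
    covered = 0
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered / total >= target:
            return rank
    # Only reached if target > 1.0; every distinct word is then "needed".
    return len(counts)
```

On a real corpus you'd call something like words_needed_for_coverage(tokens, 0.98) to see how large a vocabulary that level of coverage takes for that particular text.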