by joshdavham
612 days ago
> Can I really understand 75% of text if I have perfect recall of those 800 words?

This thing you're talking about is called 'word coverage'. It's the percentage of words you know in a given text. I've created lots of word coverage graphs in the past, and, as research has shown, you won't really be understanding much until you reach the high 90s in terms of word coverage. The famous figure for being able to read English texts extensively is a word coverage of around 98%. And while it depends on the text, to reach 98% you generally need to know around the top 5k words in a language. Funnily enough, when you understand 75% of the words in a text, it subjectively feels like you're understanding maybe 10% of what's going on.
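For anyone who wants to measure this on their own reading material, here is a rough sketch of the word coverage calculation described above. The function name `word_coverage`, the regex tokenizer, and the toy sentence are all my own assumptions for illustration; real studies define "word" more carefully (lemmas, word families, and so on), which changes the numbers.

```python
import re

def word_coverage(text, known_words):
    """Fraction of running words (tokens) in `text` that are in `known_words`."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t in known_words)
    return known / len(tokens)

# Toy example: 13 tokens, of which "mat" and "log" are unknown.
text = "the cat sat on the mat and the dog sat on the log"
known = {"the", "cat", "sat", "on", "and", "dog"}
print(f"{word_coverage(text, known):.0%}")  # 11 of 13 tokens known
```

Note that this counts tokens, not unique words, so a handful of very common words pushes the percentage up quickly even when most of the content words are missing.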
In Japanese (my target language), 98% coverage seems to require something like the most frequent 15,000 words, depending on how you define a word. At around 12,000 words it becomes viable to read with some comprehension and semi-frequent dictionary lookups.
Also, interestingly, you need about twice as many words in Japanese as in English to reach 87% coverage:
"It has been reported that 2,000 high-frequent English words cover 87% of tokens (Nation, 1990). In case of Japanese, 4,024 SUWs are required to cover 87% of tokens." (Text Readability and Word Distribution in Japanese, Satoshi Sato)
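Figures like "N words cover 87% of tokens" come from walking down a frequency list until the cumulative token count hits the target. A minimal sketch of that calculation, assuming the corpus is already tokenized (the `words_needed` name and the toy corpus are hypothetical; for Japanese you would need a morphological analyzer to produce the tokens, and the choice of unit changes the result):

```python
from collections import Counter

def words_needed(tokens, target):
    """Number of most-frequent word types needed to cover `target` of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    # Walk the frequency list from most to least common, accumulating coverage.
    for n, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / total >= target:
            return n
    return len(counts)

# Toy corpus: counts are a=4, b=3, c=2, d=1 (10 tokens in total).
tokens = "a a a a b b b c c d".split()
print(words_needed(tokens, 0.87))  # top 3 types cover 9 of 10 tokens
print(words_needed(tokens, 0.98))  # all 4 types are needed
```

Running this over the same corpus with two different tokenization schemes is a quick way to see how much the "definition of word" caveat matters.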