by wodenokoto
1755 days ago
What would be the correct way to assess the statistical significance of these frequencies? For example, if we assume that all English text is generated from a weighted distribution over words, and “the” has a weight of 3.5%, is a 4.3% occurrence rate even significant? (And what would the base occurrence rate even be?)
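One rough way to frame it, as a sketch rather than a definitive answer: under the "weighted distribution" assumption each word is an independent draw, so the count of “the” is binomial and a one-proportion z-test applies. The 3.5% baseline and 4.3% observed rate are the figures from the question; the sample sizes (1,000 and 100,000 words) and the function name `one_proportion_z_test` are made up for illustration. Real text is not independent draws, so this likely overstates significance.

```python
import math


def one_proportion_z_test(observed_count, total_words, baseline_rate):
    """Two-sided z-test of an observed word rate against a fixed baseline rate.

    Assumes every word is an independent draw (a unigram model), which is
    the simplification made in the question, not a property of real text.
    """
    observed_rate = observed_count / total_words
    # Standard error of the rate under the null hypothesis (baseline is true).
    standard_error = math.sqrt(baseline_rate * (1 - baseline_rate) / total_words)
    z = (observed_rate - baseline_rate) / standard_error
    # Two-sided p-value from the normal approximation to the binomial.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical sample sizes: the same 4.3% rate at two text lengths.
for total_words in (1_000, 100_000):
    observed_count = round(0.043 * total_words)
    z, p_value = one_proportion_z_test(observed_count, total_words, 0.035)
    print(f"n={total_words}: z={z:.2f}, p={p_value:.3g}")
```

The point of the two sample sizes is that the answer depends almost entirely on how much text was counted: the same 0.8 percentage point gap is unremarkable in a short text and overwhelming in a long one. The baseline rate itself would have to come from a reference corpus, which this sketch simply takes as given.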