| HN Mirror



	by wodenokoto 1755 days ago
	What would be the correct way to assess the statistical significance of these frequencies? For example, if we assume that all English text is generated from a weighted distribution over words and “the” accounts for 3.5% of it, is a 4.3% occurrence rate even significant? (And what would the base occurrence rate even be?)
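
	One possible way to frame it, as a rough sketch rather than a definitive answer: treat each word as an independent draw and run a one-sample z-test for a proportion against the assumed 3.5% baseline. The function name `proportion_z_test` and the sample sizes below are hypothetical, chosen only for illustration; the 3.5% and 4.3% figures are the ones from the question. Words in real text are not independent, so this model probably overstates significance.

```python
import math


def proportion_z_test(count, n, p0):
    """Two-sided one-sample z-test for a proportion (normal approximation).

    count: observed occurrences of the word
    n:     total number of words in the sample (hypothetical here)
    p0:    assumed baseline rate, e.g. 0.035 for "the"
    """
    p_hat = count / n
    # Standard error under the null hypothesis that the true rate is p0.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Same 4.3% observed rate at two assumed sample sizes.
z_small, p_small = proportion_z_test(count=43, n=1_000, p0=0.035)
z_large, p_large = proportion_z_test(count=4_300, n=100_000, p0=0.035)
print(f"n=1,000:   z={z_small:.2f}, p={p_small:.3g}")
print(f"n=100,000: z={z_large:.2f}, p={p_large:.3g}")
```

	The point of the two calls is that the answer depends heavily on how many words were counted: the same 4.3% rate is unremarkable in a short text and far outside chance variation in a long one. Where the baseline itself should come from is still an open question; a large reference corpus would be one option.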