by waqf 5593 days ago
He is right to normalize the results, but parent's point is that he is wrong to do that by modifying his data collection.

He should just collect as many commit messages as possible, then divide the profanity count for each language by that language's commit message count. That has lower standard error [and no more bias] than what he did.
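
To make that concrete, here is a rough Python sketch of the normalization. It assumes the "profanity count" is the number of commit messages containing at least one profanity, so each message can be treated as an independent yes/no trial and the usual binomial standard error applies. The function name `profanity_rate` and the counts are made up for illustration; they are not from the original analysis.

```python
import math

def profanity_rate(profane_count, commit_count):
    """Return (rate, standard_error) for one language.

    Assumes each commit message is an independent yes/no trial
    (profane or not), which is a simplification.
    """
    if commit_count <= 0:
        raise ValueError("commit_count must be positive")
    rate = profane_count / commit_count
    # Binomial standard error of a proportion: sqrt(p * (1 - p) / n).
    std_err = math.sqrt(rate * (1 - rate) / commit_count)
    return rate, std_err

# Hypothetical counts, invented for illustration only:
# language -> (profane messages, total messages collected).
counts = {
    "lang_a": (50, 1_000),
    "lang_b": (500, 10_000),
}

rates = {lang: profanity_rate(p, n) for lang, (p, n) in counts.items()}

for lang, (rate, std_err) in rates.items():
    print(f"{lang}: rate={rate:.4f} +/- {std_err:.4f}")
```

Both made-up languages have the same rate, but the one with ten times as many messages gets a standard error smaller by a factor of sqrt(10). So capping the sample per language throws away precision without buying anything in bias.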