Hacker News
by lawfulfalafel 5603 days ago
Dude, no. I think he was talking about how you can't tell how much the size of a language's user base is affecting the ranking. So, for example, only 1% of all projects could be in Java, but the swearing could be frequent enough to give it ~15% of all curse words.

> Note that I ripped an equal amount of commit messages per language so the results aren't based on how many projects there are per language.

All the languages are equally represented by commit count.

But his total number is 929857, which is not divisible by 8.
What a bummer... The percentages might be off by a fraction of a fraction of a percent...
Given that his process for ripping an "equal" number of commit messages per language was broken, I see no reason to believe that anything else even approaches validity. It's simple arithmetic; a grade schooler who notices that the last digit is 7 would realize something's off.
What about the process is broken? Did you read the code and find bugs? With a total commit count of 929857, being off by a single commit from a perfectly even split across the languages is insignificant.
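For what it's worth, the arithmetic is easy to check. A quick sketch in Python, assuming 8 languages as the parent comment does (that count comes from this thread, not from the article, and the variable names are just for illustration):

```python
total_commits = 929857   # total quoted in the thread
num_languages = 8        # assumed from the comment above, not verified against the article

# Split the total evenly and see what is left over.
per_language, remainder = divmod(total_commits, num_languages)
print(per_language, remainder)  # 116232 1

# Share of the whole dataset that the leftover commits represent.
leftover_share = remainder / total_commits
print(f"{leftover_share:.7%}")
```

So under that assumption the total is one commit over an even split, which is roughly one part in a million of the data.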
Or he had 929857 commits and then randomly sampled an equal number for each language. Thus, no division, etc.
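That kind of per-language sampling might look roughly like the sketch below. It's only an illustration: `sample_equal`, the `commits_by_language` mapping, and the toy data are all made up here, and nothing in the thread says the author's script worked this way.

```python
import random

def sample_equal(commits_by_language, seed=0):
    """Draw the same number of commit messages from every language."""
    rng = random.Random(seed)
    # The smallest language caps how many messages each one can contribute.
    n = min(len(msgs) for msgs in commits_by_language.values())
    return {lang: rng.sample(msgs, n) for lang, msgs in commits_by_language.items()}

# Toy data, purely hypothetical.
commits_by_language = {
    "java": ["fix build", "damn tests", "refactor", "bump version"],
    "ruby": ["add spec", "typo", "cleanup"],
}
sampled = sample_equal(commits_by_language)
print({lang: len(msgs) for lang, msgs in sampled.items()})  # {'java': 3, 'ruby': 3}
```

With a scheme like this the raw total never has to divide evenly by the number of languages, since the per-language sample size is fixed after the fact.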
Sorry, my bad.