by TekMol 3267 days ago
Statistics that don't state the sample size, or explain why the data is supposed to be significant rather than just random noise, are pretty much worthless.

And the wordlist ... simply impossible. One of the 50 most common words is "auditorium". And "the" and "a" are not even in the list.

2 comments

Also, the number of tweets is shown if you hover over the charts, and in the sidebar next to the charts, as with most graphs.
37,900 were collected in nine days.

In NLP, words like "the" and "a" are "stop words", which are typically excluded from word counts.
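
For anyone unfamiliar with the idea, here is a rough sketch of stop-word filtering before counting. It is only an illustration, not the site's actual code: the `STOP_WORDS` set, the `count_words` helper, and the sample sentences are all made up for this example, and real stop-word lists are much longer.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists have hundreds of entries).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def count_words(texts, stop_words=STOP_WORDS):
    """Count lowercase word tokens across texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        # Crude tokenizer: runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts


# Hypothetical sample input standing in for collected tweets.
sample = ["The auditorium is full", "A seat in the auditorium"]
counts = count_words(sample)
print(counts.most_common(3))
```

With filtering like this, "the" and "a" never reach the ranking, so their absence from a most-common-words list is expected rather than a sign of bad data.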