Hacker News
What I found When I analysis million followers of President Trump with nlp (insightninja.net)
14 points by plantpark 3060 days ago
3 comments

Interesting content, but I also second the spellcheck point, especially when you are posting about natural language. Also, pie charts are horrible for things with a ton of data points. The second graph kind of pulls it off because "en" is so large, but I can't make much out of the first graph. Third, while Five Thirty Eight is certainly well known, they definitely make mistakes, as was seen in their complete miss in predicting his presidency. They are no better than Rasmussen, which currently holds Trump at a split 49% approval rate; you may want to add them as another source to better balance your factual statements.

http://www.rasmussenreports.com/public_content/politics/poli...

"complete miss"? They put his probability of winning somewhere between a quarter and a half -- rather uniquely, among a field of predictors who on average put his probability of winning under 10%.

It sounds like because he won, you believe they should have given him a probability greater than 1/2. That represents a misunderstanding of what probability means.
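
To make that concrete, here is a minimal sketch, assuming a hypothetical forecast of 30% (picked from the "between a quarter and a half" range, not a figure taken from any actual model). It simulates many independent events that each have that probability and counts how often they happen. The function name `observed_frequency` is made up for illustration.

```python
import random


def observed_frequency(probability, trials, seed=0):
    """Simulate `trials` independent events and return the share that occurred."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(rng.random() < probability for _ in range(trials))
    return hits / trials


# Hypothetical forecast: the event is given a 30% chance.
freq = observed_frequency(0.3, 100_000)
print(f"Event occurred in {freq:.1%} of simulated runs")
```

An event given a 30% chance should still happen in roughly three runs out of ten, so a single occurrence does not by itself show the forecast was wrong.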

The only people who think Five Thirty Eight had a complete miss in predicting a Trump victory are those who do not have an understanding of probability, statistics, or predictive models.
Agreed, charts are supposed to make understanding the data easier. This may do the opposite.
I thought a pie chart would make the percentages clearer. Do you have any suggestions for the chart? Thanks!
What kind of chart do you think is better for such data? Looking forward to your advice!
Thanks, I will check the source for more details.
Why the need for machine learning for the second part? It seems like a complicated way to do what you could do with some simple database queries.
It's not just about simple word frequencies. Common words like "like", "need", "second", and "part" show up all over the whole dataset of documents, so they aren't very meaningful in a specific sentence. Googling "tf-idf" will show you more details about this.
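
A rough sketch of the tf-idf idea, not the code used in the article: it assumes plain whitespace tokenization, the basic tf * log(N / df) weighting, and three made-up example sentences. The function name `tf_idf` is just for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Made-up example documents, for illustration only.
docs = [
    "i like the second part",
    "i like the wall",
    "i like the economy",
]
scores = tf_idf(docs)
for doc, doc_scores in zip(docs, scores):
    top_term = max(doc_scores, key=doc_scores.get)
    print(f"{doc!r}: most distinctive term is {top_term!r}")
```

Words that appear in every document, such as "like" here, get a score of zero, while a word that appears in only one document stands out for that document.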
Ok, I looked that up. Again, isn't this something that Elasticsearch would do without needing to set up a machine learning system?
Good content, but run your text through spell-check before posting, and/or send it to someone to proofread.
Sorry for that, I will check it again.
I understood it without any significant issues.