Hacker News
by benjamincburns 4778 days ago
> Because algorithmic sentiment analysis would automatically classify any tweet containing 'hate words' as "negative," this project relied upon the HSU students to read the entirety of tweet and classify it as positive, neutral or negative based on a predefined rubric. Only those tweets that were identified by human readers as negative were used in this analysis.

I wonder how well a Bayesian classifier would work if this was used as a training set. If it worked relatively well, there's no reason why you couldn't create a live version of the map.

Something like http://aworldoftweets.frogdesign.com/ maybe?

1 comment

Not very well. Twitter sentiment is a difficult problem.

Consider using millions of training examples rather than thousands. That was the approach taken by the "distant supervision" Twitter sentiment technique: tweets containing positive emoticons were labeled as positive, tweets containing negative emoticons were labeled as negative, and the emoticons were stripped before training. That system got 80% accuracy.

http://cs.wmich.edu/~tllake/fileshare/TwitterDistantSupervis...
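
For anyone who wants to poke at the idea, here is a rough sketch of emoticon-based distant supervision feeding a small naive Bayes classifier. It is not the code from the linked paper. The emoticon lists, the tokenizer, the add-one smoothing, and every function and class name below are my own assumptions for illustration, and the four training tweets are made up.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed emoticon sets; the ones used in the linked work may differ.
POSITIVE = (":)", ":-)", ":D", "=)")
NEGATIVE = (":(", ":-(", "=(")


def distant_label(tweet):
    """Return (label, text_without_emoticons), or None if the tweet is unusable."""
    has_pos = any(e in tweet for e in POSITIVE)
    has_neg = any(e in tweet for e in NEGATIVE)
    if has_pos == has_neg:  # no emoticon at all, or mixed signals
        return None
    # Strip the emoticons so the classifier cannot simply memorize them.
    for e in POSITIVE + NEGATIVE:
        tweet = tweet.replace(e, " ")
    return ("pos" if has_pos else "neg", tweet)


def tokenize(text):
    """Very crude tokenizer (an assumption): lowercase alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of tweets
        self.vocab = set()

    def train(self, label, text):
        tokens = tokenize(text)
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def classify(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def train_from_tweets(tweets):
    """Label tweets by emoticon, skip unusable ones, and train a classifier."""
    model = NaiveBayes()
    for tweet in tweets:
        labeled = distant_label(tweet)
        if labeled is not None:
            model.train(*labeled)
    return model


# Made-up example tweets, for illustration only.
model = train_from_tweets([
    "I love this phone :)",
    "what a great day :D",
    "I hate waiting in line :(",
    "this traffic is awful :-(",
    "no emoticon here",
])
print(model.classify("I love this great day"))
print(model.classify("awful traffic"))
```

Note that this only gives positive vs. negative. The rubric quoted above also has a neutral class, which emoticons alone won't label for you.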