Nice insights from a technical standpoint, though I'm more interested in the machine learning aspect. Was it dictionary based? How does the system account for sarcasm or the billion+ meme/BuzzFeed posts?
I'll be posting a follow-up about the machine learning bit in the near future. It uses not just words but also phrases. For the meme / BuzzFeed posts, more weight is given to content you write vs. links / articles you post (and if you do share a link, we only take into account what you say, not the content of the BuzzFeed post itself).
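To make the "words plus phrases, weighted by who wrote it" idea concrete, here is a rough Python sketch. It is not the actual pipeline from the study. The function names, the `message` / `link_text` fields, the 1-to-3-word phrase length, and both weight values are assumptions picked purely for illustration.

```python
import re
from collections import Counter

# Hypothetical weights: the thread only says your own writing counts for more.
OWN_POST_WEIGHT = 1.0
LINK_COMMENT_WEIGHT = 0.5


def ngrams(text, max_n=3):
    """Return every word n-gram (n = 1..max_n) found in the text, lowercased."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def weighted_features(posts):
    """Count words and phrases across posts, down-weighting link shares.

    Each post is a dict with a 'message' (what the user typed) and an
    optional 'link_text' (the shared article), which is never counted.
    """
    counts = Counter()
    for post in posts:
        is_link_share = bool(post.get("link_text"))
        weight = LINK_COMMENT_WEIGHT if is_link_share else OWN_POST_WEIGHT
        for gram in ngrams(post.get("message", "")):
            counts[gram] += weight
    return counts


posts = [
    {"message": "So excited today"},
    {"message": "so true", "link_text": "10 cats you won't believe"},
]
features = weighted_features(posts)
```

In this toy input, "so" shows up in both posts, so it picks up the full weight from the status update plus the reduced weight from the link comment, while nothing from the shared article's text is counted at all.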
It doesn't really try to distinguish sarcasm. Depending on the sample size (ours used 75k people with ~750m words / phrases), it could conceivably detect sarcasm. Yeah, totally. /s (Maybe, but probably not)
The study itself is published at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783449/