Nice insights from a technical standpoint, though I'm more interested in the machine learning aspect. Was it dictionary-based? How does the system account for sarcasm or the billion+ meme/BuzzFeed posts?
I'll be posting a follow-up about the machine learning bit in the near future. It uses not just words but also phrases. For the meme / BuzzFeed posts, more weight is given to content you write than to links / articles you post (and if you do share a link, we only take into account what you say, not the content of the BuzzFeed post itself).
It doesn't really try to distinguish sarcasm. Depending on the sample size (ours used 75k people with ~750m words / phrases), it could conceivably detect sarcasm. Yeah, totally. /s (Maybe, but probably not)
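To make the "words and phrases, with your own writing weighted more than shared links" idea concrete, here is a rough sketch. It is not the actual system: the weight values, the `link_url` field, and the choice of unigrams plus bigrams are all assumptions for illustration. The only parts taken from the description above are that phrases count alongside words, that written content outweighs link shares, and that the linked article's own text is ignored.

```python
import re
from collections import Counter

# Hypothetical weights: the description only says written content counts for more.
WRITTEN_WEIGHT = 1.0
LINK_COMMENT_WEIGHT = 0.5


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, max_n=2):
    """Yield single words and short phrases (up to max_n words long)."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def weighted_features(posts):
    """Accumulate weighted word/phrase counts across a user's posts.

    Each post is a dict with a "text" key (what the user typed) and an
    optional "link_url" key (hypothetical field marking a link share).
    """
    features = Counter()
    for post in posts:
        weight = LINK_COMMENT_WEIGHT if post.get("link_url") else WRITTEN_WEIGHT
        # Only the user's own words are read; the linked article is never fetched.
        for gram in ngrams(tokenize(post["text"])):
            features[gram] += weight
    return features


posts = [
    {"text": "so happy today"},
    {"text": "so true", "link_url": "https://example.com/listicle"},
]
features = weighted_features(posts)
```

Here "so" shows up in both posts, but the link-share comment contributes only half as much as the status the user wrote outright. A real model would presumably feed counts like these into a classifier or regression rather than stop at the raw totals.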
Posts like this are what I like about Hacker News! It would be great if there were more examples like this, where organizations share their setups for real-world projects and presences.