How we scaled to generate personalities for 200 million people (vasir.net)
29 points by enoex1 4375 days ago
6 comments

Nice notes about managing cache invalidation! Interested in seeing where you guys go with your product.
Very interesting. I like the idea of putting everything on one server so you can just spin more up.
Nice insights from a technical standpoint, though I'm more interested in the machine learning aspect. Was it dictionary based? How does the system account for sarcasm or the billion+ meme/BuzzFeed posts?
I'll be posting a follow-up about the machine learning bit in the near future. It uses not just words, but also phrases. For the meme / BuzzFeed posts, more weight is given to content you write vs. links / articles you post (and if you do share a link, we only take into account what you say about it, not the content of the BuzzFeed post itself).
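
To make the weighting idea concrete, here is a rough sketch of how it could look. This is not the actual implementation: the weight values, the two-word phrase limit, and the names `AUTHORED_WEIGHT`, `LINK_COMMENT_WEIGHT`, `ngrams`, and `weighted_features` are all made up for illustration. The only things taken from the description above are that both words and phrases are counted, that text you write counts more than text attached to a shared link, and that the linked article's own content is ignored.

```python
from collections import Counter

# Hypothetical weights: the only stated fact is that authored text
# counts for more than text attached to a shared link.
AUTHORED_WEIGHT = 1.0
LINK_COMMENT_WEIGHT = 0.5


def ngrams(text, n_max=2):
    """Yield lowercase words and phrases (up to n_max words) from text."""
    tokens = text.lower().split()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def weighted_features(posts):
    """Build weighted word/phrase counts for one user.

    posts: list of dicts with 'text' (what the user wrote) and
    'is_link' (True if the post shares a link). The linked article's
    own content is never passed in, so it is never counted.
    """
    counts = Counter()
    for post in posts:
        weight = LINK_COMMENT_WEIGHT if post["is_link"] else AUTHORED_WEIGHT
        for gram in ngrams(post["text"]):
            counts[gram] += weight
    return counts
```

For example, calling `weighted_features` on one authored post and one link share gives full weight to the words and phrases in the first and half weight to the comment on the second. The resulting counts would then feed whatever model does the actual scoring.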

It doesn't really try to distinguish sarcasm. Depending on the sample size (ours used 75k people with ~750m words / phrases), it could conceivably detect sarcasm. Yeah, totally. /s (Maybe, but probably not)

The study itself is published at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783449/

Posts like this are what I like about Hacker News! It would be great if there were other examples like this where organizations share their setup for real-world projects and presences.
Great post - nice layout of the process used. Thanks for the in-depth look at what you did to make it work on such a large scale.
Will you be posting any statistics on users' personalities? (Anonymized, of course)

P.S. Your name is hazardous, bra