How we scaled to generate personalities for 200 million people

	How we scaled to generate personalities for 200 million people (vasir.net)
	29 points by enoex1 4375 days ago

6 comments

tulpa 4375 days ago

Nice notes about managing cache invalidation! Interested in seeing where you guys go with your product.

seanalair 4375 days ago

Very interesting. I like the idea of putting everything on one server so you can just spin more up.

tannerc 4375 days ago

Nice insights from a technical standpoint, though I'm more interested in the machine learning aspect. Was it dictionary-based? How does the system account for sarcasm or the billion+ meme/BuzzFeed posts?

enoex1 4375 days ago

I'll be posting a follow-up about the machine learning bit in the near future. It uses not just words but also phrases. For the meme / BuzzFeed posts, more weight is given to content you write than to links / articles you post (and if you do share a link, we only take into account what you say about it, not the content of the BuzzFeed post itself).

It doesn't really try to distinguish sarcasm. Depending on the sample size (ours used 75k people with ~750m words / phrases), it could conceivably detect sarcasm. Yeah, totally. /s (Maybe, but probably not)

The study itself is published at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783449/
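
A rough sketch of the weighting described above, for readers who want something concrete. This is not the actual pipeline: the function names, the post format (a dict with `text` and an optional `link`), the bigram cutoff, and both weight values are assumptions made up for illustration. The thread only says that words and phrases are used, that written content counts for more than shared links, and that the linked article itself is ignored.

```python
import re
from collections import Counter

# Hypothetical weights: the thread says authored content counts for more
# than link shares, but gives no actual numbers.
AUTHORED_WEIGHT = 1.0
LINK_COMMENT_WEIGHT = 0.5


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def words_and_phrases(text, max_n=2):
    """Return single words plus phrases of up to max_n consecutive words."""
    toks = tokens(text)
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(toks) - n + 1):
            feats.append(" ".join(toks[i:i + n]))
    return feats


def weighted_features(posts):
    """Sum weighted word/phrase counts over a user's posts.

    Only the user's own text is read. The linked article is never fetched,
    so its content cannot influence the counts.
    """
    counts = Counter()
    for post in posts:
        weight = LINK_COMMENT_WEIGHT if post.get("link") else AUTHORED_WEIGHT
        for feat in words_and_phrases(post.get("text", "")):
            counts[feat] += weight
    return counts


posts = [
    {"text": "so happy today"},
    {"text": "so true", "link": "http://example.com/article"},
]
features = weighted_features(posts)
```

Here "so" appears once in a written post and once in a comment attached to a link, so it ends up with a lower total than it would from two written posts. How the real system turns such counts into personality scores is not described in this thread.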

poezn 4375 days ago

Posts like this are what I like about Hacker News! It would be great if there were other examples like this where organizations share their setup for real-world projects and presences.

applecart 4375 days ago

Great post - nice layout of the process used. Thanks for the in-depth look at what you did to make it work on such a large scale.

popwarsweet 4375 days ago

Will you be posting any statistics on users' personalities? (Anonymized, of course)

P.s. Your name is hazardous bra
