by dinobones 1528 days ago
How FAANG actually builds their recommendation systems:

Millions of cores of compute, exabyte scale custom data stores. Good recommendations are expensive. If you try to build a similar system on AWS, you will spend a fortune.

Most recommender models just use co-occurrence as a seed, and this can actually work pretty well on its own. If you want to get fancy, build up a vectorized form of the document with something like an autoencoder, then use approximate nearest neighbors to find documents close by. 95% of the compute and storage is just spent on calculating co-occurrence, though.
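
To make the co-occurrence idea concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not how any particular company does it: the `sessions` data and the function names `build_cooccurrence` and `recommend` are made up for illustration, and a production system would do this counting as a distributed job over far more data.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Count how often each pair of items appears in the same session."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Deduplicate so an item repeated within one session counts once.
        for a, b in combinations(sorted(set(session)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, item, k=3):
    """Return up to k items that most often co-occur with `item`."""
    neighbours = counts.get(item, Counter())
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


# Hypothetical toy data: each inner list is one user's session.
sessions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c"],
    ["a", "d"],
]
counts = build_cooccurrence(sessions)
top = recommend(counts, "a", k=2)
print(top)
```

The pair counting is quadratic in session length, which hints at why most of the compute goes there once the item catalogue and the logs get large.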

1 comment

> Millions of cores of compute, exabyte scale custom data stores. Good recommendations are expensive. If you try to build a similar system on AWS, you will spend a fortune.

And then it will be gamed, and become as useless as every other recommendation system already going.

Also, 'millions of cores' is a ludicrously shitty, zero-clue answer. It's like asking how Eminem makes music, and saying 'millions of pills'. Like, yes, that's an input, but you're missing the entire method of creation, of converting the crude inputs into the outputs.

For my money - and, for what little it's worth, I work in this field – I think most of the impressive feats of data science attributed to 'machine learning' are really just a function of now having hardware capacity so insanely great that we're able to 'make the map the size of the territory', so to speak. These models are essentially overfitting machines, but that's OK when (a) it's an interpolation problem and (b) your model can just memorise the entire input space (and deal with any inaccuracies by regularisation, oversampling, tweaking parameters till you get the right answers on the validation set, then talking about how 'double descent' is a miracle of mathematics, etc).
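
A toy illustration of the "memorise the input space" point, as a sketch only: the "model" below is just a lookup table over densely sampled training points (a 1-nearest-neighbour predictor), with `sin` as a stand-in target. The names `fit_memoriser` and `predict` are made up for this example. It does fine when asked to interpolate inside the sampled range and falls over as soon as it has to extrapolate.

```python
import math


def fit_memoriser(xs, ys):
    """'Train' by storing every (x, y) pair: pure memorisation."""
    return list(zip(xs, ys))


def predict(table, x):
    """Predict by returning the stored y of the nearest stored x."""
    return min(table, key=lambda pair: abs(pair[0] - x))[1]


# Dense samples of sin(x) on [0, 2*pi]: the map is roughly the size of the territory.
n = 1000
train_x = [2 * math.pi * i / n for i in range(n + 1)]
table = fit_memoriser(train_x, [math.sin(x) for x in train_x])

# Interpolation: a query inside the sampled range.
inside_error = abs(predict(table, 1.2345) - math.sin(1.2345))

# Extrapolation: a query outside the sampled range.
outside_x = 2.5 * math.pi
outside_error = abs(predict(table, outside_x) - math.sin(outside_x))

print(f"inside error:  {inside_error:.5f}")
print(f"outside error: {outside_error:.5f}")
```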

Don't get me wrong, neural nets are obviously not rubbish. They are a very good method for non-convex, non-differentiable optimisation problems, especially interpolation. (And I'm grateful for the hype cycle that's let me buy up cheap TPUs from Google and hack on their instruction set to code up linear algebraic ops, but for way more efficient optimisation methods, and also in Rust, lol.) It's just a far more nuanced story than "this method we discovered and hyped up for a decade in the 80s suddenly became the key to AGI".