| HN Mirror

I view ANN search for retrieval as clear winner in retrieval methods for domains with very large number of items like O(10 million +) that you want to high amount searches. I've worked at a couple different major tech companies and I'd consider two tower models + ANN as a classic pair. One tower for request embedding called on every request. One tower for item embedding. Compute all item embeddings periodically and build an ANN index. The top dot product of the two with minor constraints can be done by just running request tower and then doing search in the index.

The speed up is really necessity as direct methods are just too expensive both dollar wise and time wise (lowish latency is goal).