| HN Mirror



	by lunarmony 297 days ago
	Researchers have discussed the limitations of vector-based retrieval from a rank perspective in various forms for a few years. It has also been shown that better alternatives exist: some low-rank approaches can theoretically approximate arbitrary high-rank distributions while still permitting efficient, MIPS-level inference (see e.g. Retrieval with Learned Similarities, https://arxiv.org/abs/2407.15462). Such solutions are already used in production at Meta and at LinkedIn.

1 comments

yorwba 297 days ago

I don't think Mixture of Logits from the paper you link circumvents the theoretical limitations pointed out here, since their dataset size mostly stays well below the limit.

In the end they still rely on Maximum Inner Product Search, just with several lookups for smaller partitions of the full embedding. The largest dataset is Books, where this paper suggests you'd need more than 512 embedding dimensions, and there MoL with 256-dimensional embeddings split into 8 parts of 32 each has an abysmal hit rate.

So that's hardly a demonstration that arbitrary high-rank distributions can be approximated well. MoL seems to approximate them better than other approaches, but all of them are clearly hampered by the small embedding size.
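
To make the "hampered by the small embedding size" point concrete, here is a toy sketch (not the bound from either paper, just an illustration under my own assumptions). With plain dot-product scoring, the items that can ever be a top-1 hit are limited by the embedding dimension: in 1 dimension only the items with the largest and smallest coordinate can win, no matter how many queries you try. The function names, sizes, and random Gaussian setup are all hypothetical.

```python
import random


def top1_items(query_embs, item_embs):
    """Return the set of item indices that are the top-1 dot-product hit for some query."""
    winners = set()
    for q in query_embs:
        scores = [sum(a * b for a, b in zip(q, x)) for x in item_embs]
        winners.add(max(range(len(scores)), key=scores.__getitem__))
    return winners


def max_distinct_winners(dim, n_items=5, n_queries=200, trials=50, seed=0):
    """Largest number of distinct top-1 items seen over several random trials."""
    rng = random.Random(seed)
    best = 0
    for _ in range(trials):
        items = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_items)]
        queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_queries)]
        best = max(best, len(top1_items(queries, items)))
    return best


if __name__ == "__main__":
    for dim in (1, 2, 4):
        print(dim, max_distinct_winners(dim))
```

In 1 dimension a positive query always picks the item with the largest coordinate and a negative query the item with the smallest, so at most 2 of the 5 items are ever retrievable as top-1. Raising the dimension lifts that ceiling, which is the same flavor of constraint as the dimension-vs-dataset-size limit discussed above.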

lunarmony 294 days ago

Mixture of Logits has actually already been deployed on 100M+ scale datasets at Meta and at LinkedIn (https://arxiv.org/abs/2306.04039, https://arxiv.org/abs/2407.13218, etc.). The crucial departure from traditional embedding and multi-embedding approaches is in learning a query- and item-dependent gating function, which enables MoL to become a universal high-rank approximator (assuming we care about recall@1) even when the input embeddings are low rank.
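
For readers who haven't seen it, a minimal sketch of the idea as described in this thread: score each low-dimensional component with an inner product, then mix the component logits with weights that depend on the particular query-item pair. This is a simplification under stated assumptions, not the papers' implementation: part k of the query is paired only with part k of the item, and the gate is a hypothetical linear map over the component logits followed by a softmax, standing in for a learned gating network. All names and parameters are illustrative.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mol_score(query_parts, item_parts, gate_weights, gate_bias):
    """Mixture-of-logits style similarity (simplified sketch).

    query_parts, item_parts: lists of P low-dimensional embeddings.
    gate_weights (P x P) and gate_bias (P): hypothetical gate parameters;
    a real system would learn these (and likely use a richer gate).
    Returns (score, mixture_weights).
    """
    # One inner-product logit per component pair.
    logits = [dot(q, x) for q, x in zip(query_parts, item_parts)]
    # The gate depends on this specific (query, item) pair via its logits.
    gate_logits = [dot(row, logits) + b for row, b in zip(gate_weights, gate_bias)]
    weights = softmax(gate_logits)
    score = sum(w * l for w, l in zip(weights, logits))
    return score, weights


if __name__ == "__main__":
    rng = random.Random(0)
    P, d = 8, 32  # 256 dimensions split into 8 parts of 32, as in the thread
    query = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(P)]
    item = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(P)]
    gate_w = [[rng.gauss(0, 0.1) for _ in range(P)] for _ in range(P)]
    gate_b = [0.0] * P
    print(mol_score(query, item, gate_w, gate_b))
```

Because the mixture weights change per query-item pair, the final score is no longer a single fixed inner product over the concatenated embedding, which is where the extra expressiveness is supposed to come from. Each component lookup is still an inner product, which is why MIPS-style retrieval remains usable for candidate generation.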