

by forrestp 392 days ago

My understanding is that your levers are roughly: better or more diverse embeddings, or computing more embeddings (embedding chunks, groups, etc.) and aggregating more cosine similarities / scores. More FLOPs = better search, with steeply diminishing returns.

ColBERT is a good, googleable example of using more embeddings.
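
To make "more embeddings + aggregating more similarities" concrete, here is a toy sketch of ColBERT-style late interaction (often called MaxSim): each query token vector is matched to its best document token vector, and the per-token maxima are summed. This is not ColBERT's actual code; `cosine` and `maxsim_score` are hypothetical names, and the 2-d vectors are made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def maxsim_score(query_vecs, doc_vecs):
    """For each query vector take its best match in the doc, then sum the maxima."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)

# Toy data: one vector per "token" (assumed values, illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]   # covers both query tokens
doc_b = [[1.0, 0.0]]               # covers only the first query token

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
```

The cost is one similarity per (query vector, doc vector) pair instead of one per document, which is the "more FLOPs" side of the trade.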

Search often ends up being a funnel of techniques: cheap and high-recall for phase 1, then ratchet up the FLOPs and precision in subsequent passes over the previous result set.
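
A minimal sketch of that funnel, assuming each stage is just a scoring function plus a "keep this many" cutoff. `funnel_search` and the two toy scorers are hypothetical stand-ins (word overlap for the cheap pass, a prefix check for the expensive one), not any particular library's API.

```python
import heapq

def funnel_search(query, docs, stages):
    """Run stages in order (cheap -> expensive), each one narrowing the previous result set.

    stages: list of (score_fn, keep) pairs, where score_fn(query, doc) returns a number.
    """
    candidates = list(docs)
    for score_fn, keep in stages:
        candidates = heapq.nlargest(keep, candidates, key=lambda d: score_fn(query, d))
    return candidates

# Toy scorers (assumptions for illustration only).
def cheap_score(query, doc):
    # Phase 1: word overlap, fast and recall-oriented.
    return len(set(query.split()) & set(doc.split()))

def expensive_score(query, doc):
    # Later phase: stand-in for a costlier, more precise model.
    return 1.0 if doc.startswith(query) else 0.0

docs = ["red fox jumps", "red car", "blue sky", "fox red"]
result = funnel_search("red fox", docs, [(cheap_score, 3), (expensive_score, 1)])
```

The expensive scorer only ever sees the 3 survivors of the cheap pass, so its cost scales with the shortlist rather than the corpus.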

1 comment

0101111101 392 days ago

Exactly! A neat property of Matryoshka embeddings is that you can compute a low-dimensional embedding similarity really fast and then refine afterwards.
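
Roughly, in code: a sketch assuming the embeddings were trained Matryoshka-style, so that the first `prefix_dims` dimensions are already a usable embedding on their own. `matryoshka_search` is a hypothetical helper and the 4-d vectors are made up.

```python
import math

def cosine(a, b):
    """Cosine similarity; normalizing here also handles truncated prefixes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def matryoshka_search(query, docs, prefix_dims, shortlist, top_k):
    """Return indices of the top_k docs: coarse pass on a prefix, refine on full vectors."""
    # Pass 1: score every doc using only the first prefix_dims dimensions.
    coarse = sorted(
        range(len(docs)),
        key=lambda i: cosine(query[:prefix_dims], docs[i][:prefix_dims]),
        reverse=True,
    )[:shortlist]
    # Pass 2: re-score just the shortlist with the full-dimension vectors.
    return sorted(coarse, key=lambda i: cosine(query, docs[i]), reverse=True)[:top_k]

# Toy data (assumed): docs 0 and 1 look identical in the first 2 dims,
# and only the full vectors tell them apart.
query = [1.0, 0.0, 1.0, 0.0]
docs = [
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
best = matryoshka_search(query, docs, prefix_dims=2, shortlist=2, top_k=1)
```

The first pass touches every document but only a fraction of each vector; the full-width comparison runs on the shortlist alone.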
