| HN Mirror


by brigadier132 880 days ago

This analysis is bad.

The embedding is generated once. Search is done whenever a user inputs a query. The cosine similarity is also not computed against a single embedding; if you are not using an index, it's computed against millions or billions of embeddings. So the actual conclusion is that once you have a billion embeddings, a single search operation costs as much as generating an embedding.

But then, you are not even taking into account the massive cost of keeping all of these embeddings in memory ready to be searched.
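To put rough numbers on the two points above, here is a minimal sketch of an unindexed (brute-force) cosine search plus a back-of-the-envelope memory estimate. The function names, the 1536-dimension figure, and the 4-bytes-per-value (float32) assumption are all illustrative and not taken from the analysis being discussed.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def brute_force_search(query, embeddings, k=3):
    """Score the query against every stored embedding (no index).

    Returns the k best (similarity, position) pairs, best first.
    Work grows linearly with the number of stored embeddings.
    """
    scored = ((cosine_similarity(query, e), i) for i, e in enumerate(embeddings))
    return heapq.nlargest(k, scored)


def index_memory_bytes(num_embeddings, dim, bytes_per_value=4):
    """Raw bytes needed to hold all embeddings in memory.

    bytes_per_value=4 assumes float32 storage (an assumption).
    """
    return num_embeddings * dim * bytes_per_value


def flops_per_search(num_embeddings, dim):
    """Approximate multiply-adds for one unindexed search,
    assuming vector norms are precomputed."""
    return 2 * num_embeddings * dim


if __name__ == "__main__":
    docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    print(brute_force_search([1.0, 0.0], docs, k=2))
    # Hypothetical scale: one billion embeddings of 1536 dimensions.
    print(index_memory_bytes(1_000_000_000, 1536) / 1e12, "TB")
    print(flops_per_search(1_000_000_000, 1536), "floating-point ops per query")
```

Under those assumed sizes, the raw vectors alone come to about 6 TB, and every query touches all of them unless an index narrows the candidates first.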

1 comments

_ea1k 880 days ago

I think the context was prototyping.


gdiamos 880 days ago

Prototyping is one scenario I have seen this in. Prototyping is iterative: you experiment with the chunk size, chunk content, data sources, data pipeline, etc. Every change means regenerating the embeddings.

Another one is where the data is sliced based on a key, e.g. user ID, the particular document being worked on right now, etc.
