| HN Mirror



	by krishadi 986 days ago
	Latency from embedding models is still going to be the bottleneck for performance, however fast the DB is. On top of that, the overhead of synthesising answers and summaries with an LLM is going to weigh you down.

1 comments

charcircuit 986 days ago

Embeddings can be precomputed. Imagine a related videos section on a video sharing site. Each video's embedding is relatively static.
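
A rough sketch of what that could look like, assuming the embeddings were produced offline by some model and stored next to each video. The video IDs, the 3-dimensional vectors, and the `related_videos` helper are all made up for illustration; a real site would use far larger vectors and a vector index rather than a linear scan.

```python
import math

# Hypothetical embeddings, computed once offline and stored per video.
# No embedding model runs at request time.
VIDEO_EMBEDDINGS = {
    "cats_compilation": [0.9, 0.1, 0.0],
    "kitten_tricks": [0.8, 0.2, 0.1],
    "rust_tutorial": [0.0, 0.1, 0.9],
    "go_tutorial": [0.1, 0.0, 0.8],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def related_videos(video_id, k=2):
    """Return the k videos whose stored embeddings are closest to video_id's."""
    target = VIDEO_EMBEDDINGS[video_id]
    scored = [
        (cosine(target, vec), other)
        for other, vec in VIDEO_EMBEDDINGS.items()
        if other != video_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


print(related_videos("cats_compilation", k=1))
```

The request path here is only a lookup plus similarity comparisons, so the speed of the store is what matters in this case.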


krishadi 985 days ago

If you are building a search engine or a QA bot, the embedding of the query still needs to be calculated at request time. The results do depend on the quality of the model, and if you are using a large one, it does take time.
