by qq99
398 days ago
I was wondering about this. I was hesitant to add embedding-based search to my app because I didn't want every search to block on a round trip to the embedding API provider during initial render. Granted, you can cache the embeddings for common searches. OTOH, I also don't want to render something without them, perform the embedding async, and then have to reify the results list once the embedding arrives. Seems hard to do that sensibly from a UX perspective. To render locally, you need access to the model, right? I just wonder how good those embeddings will be compared to those from OpenAI/Google/etc. in terms of semantic search. I do like the free/instant aspect though.
I've had particularly good experiences with nomic, bge, gte, and all-MiniLM-L6-v2. All are hundreds of MB (except all-MiniLM-L6-v2, which is like 87MB).