|
|
|
|
|
by ethanahte
1162 days ago
|
|
Hi, author here. I totally agree with you that, for large scale, you're going to need a vector database. My hope is more to help people avoid scenarios like the one in this comment: https://news.ycombinator.com/item?id=35552303 Tangentially, I really like the approach that haystack has taken, where they allow you to slot in whichever document store you want, and that document store can scale from in-memory, to sqlite, to postgres, to pinecone https://docs.haystack.deepset.ai/docs/document_store In terms of the one-time cost of indexing, you're totally right! Although, one thing to call out is that you will have to re-index every time you change your embedding model, such as for fine-tuning. I don't have a good handle on how prevalent this is, though. |
|