by jt_b
479 days ago
I thought about this some more and did some research, and found an indexing approach (https://github.com/jasonjmcghee/portable-hnsw) that builds an HNSW index, serializes it to parquet, and queries it from the browser. That opens up efficient query patterns over larger datasets for RAG projects where you may not have the resources to run an expensive vector database.

As others have mentioned in other threads, parquet isn't a great tool for the job here, but you could theoretically build a different file format that lends itself better to the problem of static file(s) representing a vector database.
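
To make the "different file format" idea concrete, here is a rough sketch of what a static vector file could look like. Everything in it is made up for illustration: the `VEC0` magic tag, the header layout, and the `write_index` / `read_vector` / `nearest` helpers are not from portable-hnsw or any real library. It also does a brute-force scan rather than HNSW. The point is only that fixed-width records let a reader compute the byte offset of any vector and fetch just that slice, which is the property you would want from a static file served over something like HTTP range requests.

```python
import os
import struct
import tempfile

MAGIC = b"VEC0"                   # hypothetical format tag
HEADER = struct.Struct("<4sII")   # magic, vector count, dimension (little-endian)


def write_index(path, vectors):
    """Write vectors as a header followed by fixed-width float32 rows."""
    dim = len(vectors[0])
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, len(vectors), dim))
        for v in vectors:
            if len(v) != dim:
                raise ValueError("all vectors must share one dimension")
            f.write(struct.pack(f"<{dim}f", *v))


def read_vector(f, dim, i):
    """Seek straight to row i; only dim * 4 bytes are read."""
    f.seek(HEADER.size + i * dim * 4)
    return struct.unpack(f"<{dim}f", f.read(dim * 4))


def nearest(path, query, k=1):
    """Return the row ids of the k closest vectors (squared L2, brute force)."""
    with open(path, "rb") as f:
        magic, count, dim = HEADER.unpack(f.read(HEADER.size))
        if magic != MAGIC:
            raise ValueError("not a VEC0 file")
        if len(query) != dim:
            raise ValueError("query dimension does not match the index")
        scored = []
        for i in range(count):
            v = read_vector(f, dim, i)
            dist = sum((a - b) ** 2 for a, b in zip(v, query))
            scored.append((dist, i))
    scored.sort()
    return [i for _, i in scored[:k]]


# Small demo: build an index in a temp directory and query it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.bin")
    write_index(path, [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
    result = nearest(path, [0.9, 1.2], k=2)
print(result)  # [1, 0]
```

A graph index would presumably add a second fixed-width section holding each node's neighbor ids, so a search touches only the rows along its path instead of scanning every vector, but that part is left out here.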