by avthar
593 days ago
Hey HN! Post co-author here, excited to share our new open-source PostgreSQL tool that re-imagines vector embeddings as database indexes. It's not literally an index, but it functions like one: it updates embeddings as source data gets added, deleted, or changed. Right now the system only supports OpenAI as an embedding provider, but we plan to add support for local and OSS models soon. Eager to hear your feedback and reactions. If you'd like to file an issue or, better yet, open a PR, you can do so here [1].
[1]: https://github.com/timescale/pgai
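To make the "functions like an index" idea concrete, here is a minimal in-memory Python sketch of embeddings that stay in sync with their source rows. It is only an illustration of the concept, not pgai's API or implementation: `SyncedEmbeddings`, `toy_embed`, and the method names are hypothetical, and `toy_embed` is a deterministic stand-in for a real embedding provider call.

```python
import hashlib


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding provider; deterministic, not semantic."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:4]]


class SyncedEmbeddings:
    """Keeps one embedding per source row, refreshed whenever the row changes."""

    def __init__(self, embed=toy_embed):
        self.embed = embed
        self.rows = {}        # row_id -> source text
        self.embeddings = {}  # row_id -> embedding vector

    def upsert(self, row_id, text):
        # Re-embed only when the row is new or its text actually changed.
        if row_id in self.embeddings and self.rows.get(row_id) == text:
            return
        self.rows[row_id] = text
        self.embeddings[row_id] = self.embed(text)

    def delete(self, row_id):
        # Removing the source row also removes its embedding.
        self.rows.pop(row_id, None)
        self.embeddings.pop(row_id, None)


store = SyncedEmbeddings()
store.upsert(1, "first version of a document")
store.upsert(1, "edited version of a document")  # embedding is recomputed
store.delete(1)                                   # embedding is dropped
```

The point of the sketch is that callers only ever touch the source rows; the embeddings follow along as a side effect, much as an index follows its table.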
One question - in the RAG projects we've done, most of the source data was scattered in various source systems, but wasn't necessarily imported into a single DB or Data Lake. For example, building an internal Q&A tool for a company that has knowledge stored in services like Zendesk, Google Drive, an internal company Wiki, etc.
In those cases, it made sense to not import the source documents, or only import metadata about them, and keep the embeddings in a dedicated Vector DB. This seems to me to be a fairly common use case - most enterprises have this kind of data scattered across various systems.
How do you envision this kind of use case working with this tool? I may have missed it, but you mention things like working with images. Is your assumption that everyone is storing all of that data in Postgres?