HN Mirror


by wanderingmind 982 days ago

Are there any good implementations of RAG within the PostgreSQL ecosystem? I have seen blog posts from Supabase [0] and TimescaleDB [1], but not a full-fledged project. Full text search in Postgres is already very good, and having semantic search within the same ecosystem is quite helpful, at least for simple use cases.

[0] https://supabase.com/docs/guides/database/extensions/pgvecto...

[1] https://www.timescale.com/blog/postgresql-as-a-vector-databa...

3 comments

losteric 982 days ago

Isn't RAG "just" dynamically injecting relevant text into a prompt? What more would one implement to achieve RAG, beyond using Postgres' built-in full text or kNN search?
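
To make that reading of RAG concrete, here is a minimal sketch, not a definitive implementation. The `chunks` table, its columns, and the helper names are assumptions for illustration only. The two SQL strings show what the retrieval step might look like with pgvector (`<=>` is its cosine distance operator) or with built-in full text search; `top_k` is a pure-Python stand-in for the kNN query so the example runs without a database.

```python
import math

# Hypothetical schema: chunks(id, body text, embedding vector).
# With pgvector, `<=>` is cosine distance, so the nearest chunks sort first.
KNN_SQL = "SELECT body FROM chunks ORDER BY embedding <=> %s::vector LIMIT %s"

# Built-in full text search alternative on the same hypothetical table.
FTS_SQL = (
    "SELECT body FROM chunks "
    "WHERE to_tsvector('english', body) @@ plainto_tsquery('english', %s) "
    "ORDER BY ts_rank(to_tsvector('english', body), "
    "plainto_tsquery('english', %s)) DESC LIMIT %s"
)


def cosine_distance(a, b):
    """1 - cosine similarity, the quantity pgvector's `<=>` computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k(query_vec, rows, k):
    """In-memory stand-in for KNN_SQL; rows are (body, embedding) pairs."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query_vec, row[1]))
    return [body for body, _ in ranked[:k]]


def build_prompt(question, passages):
    """The 'injection' step: paste the retrieved passages into the prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Under these assumptions, `build_prompt(question, top_k(query_vec, rows, 3))` is the whole pipeline; everything else (chunking, embedding, reranking) is tooling around it.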


wanderingmind 982 days ago

What I'm looking for is a neat Python library (or equivalent) that integrates end to end with, say, Postgres/pgvector using SQLAlchemy, enables parallel processing of a large number of documents, and provides interfaces for creating embeddings with OpenAI, Ollama, etc. fastRAG [0] from Intel looks close to what I'm envisioning, but it doesn't appear to integrate with the Postgres ecosystem yet, I guess.
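
A rough sketch of the shape such a library might take, using only the standard library. Every name here (`Embedder`, `chunk`, `toy_embedder`, `embed_documents`) is hypothetical, and the toy embedder is a placeholder rather than a real model; an OpenAI or Ollama client would be wrapped to fit the same callable shape, and the returned rows would then be written to Postgres.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

# Any callable mapping text to a vector; a real embedding client
# would be wrapped to match this (hypothetical) interface.
Embedder = Callable[[str], List[float]]


def chunk(text: str, size: int) -> List[str]:
    """Split text into fixed-size character chunks with no overlap."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def toy_embedder(text: str) -> List[float]:
    """Placeholder embedding: [length, vowel count]. Not a real model."""
    vowels = sum(c in "aeiou" for c in text.lower())
    return [float(len(text)), float(vowels)]


def embed_documents(
    docs: Iterable[Tuple[str, str]],
    embed: Embedder,
    size: int = 200,
    workers: int = 4,
) -> List[Tuple[str, str, List[float]]]:
    """Chunk each (doc_id, text) pair and embed the chunks in parallel.

    Returns (doc_id, chunk, vector) rows in input order, ready to insert
    into a table such as one with a pgvector column.
    """
    pieces = [(doc_id, c) for doc_id, text in docs for c in chunk(text, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so vectors line up with pieces.
        vectors = list(pool.map(embed, [c for _, c in pieces]))
    return [(doc_id, c, v) for (doc_id, c), v in zip(pieces, vectors)]
```

Threads suit this sketch because calls to a remote embedding API are I/O-bound; a locally hosted model might call for a process pool or batching instead.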

[0] https://github.com/IntelLabs/fastRAG


ddematheu 982 days ago

Through our platform (Neum AI) we support doing this with Postgres, but it is a cloud platform, not a Python library.

Curious what type of customization you are looking to add that makes you want a library?


wanderingmind 981 days ago

We need something we can orchestrate and control locally, and be able to make changes to if need be. A GUI-based interface is good for more mature workflows, but our workflows are constantly evolving and require tweaking that is hard to do through a GUI and web interface.


avthar 981 days ago

Timescale recently released Timescale Vector [0], which adds a scalable search index (DiskANN) and efficient time-based vector search on top of all the capabilities of pgvector and vanilla PostgreSQL. We plan to add the document processing and embedding creation capabilities you describe to our Python client library [1] next, but Timescale Vector integrates with LangChain and LlamaIndex today [2], both of which have document chunking and embedding creation capabilities. (I work on Timescale Vector.)

[0]: https://www.timescale.com/blog/how-we-made-postgresql-the-be...

[1]: https://github.com/timescale/python-vector

[2]: https://www.timescale.com/ai/#resources


antupis 982 days ago

Or, more generally, what are good vector DBs? I have tried LlamaIndex, Pinecone and Milvus, but they all kinda sucked in different ways.


ddematheu 982 days ago

What about them sucked?
