| HN Mirror


by minimaxir 970 days ago

If you want to experiment with vector stores, you can do that locally with something like faiss, which has good multiplatform support and sufficient tutorials: https://github.com/facebookresearch/faiss

Doing full retrieval-augmented generation (RAG) and getting LLMs to interpret the results takes more steps, but you get a lot of flexibility, and despite what AI influencers say, there's no standard best practice. When you query a vector DB, you get the most similar texts back (or an integer index in the case of faiss); you then feed those results to an LLM like a normal prompt, which can be optimized with prompt engineering.
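To make the retrieve-then-prompt flow concrete, here is a minimal sketch using only the Python standard library. It is not faiss and not a real embedding model: `embed` is a toy bag-of-words stand-in, `search` is a brute-force cosine ranking that returns integer indices the way a faiss search would, and the prompt wording in `build_prompt` is just one arbitrary choice. All function names are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query, doc_vectors, k=2):
    """Return the indices of the k most similar documents (brute force, no ANN)."""
    q = embed(query)
    ranked = sorted(range(len(doc_vectors)), key=lambda i: cosine(q, doc_vectors[i]), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, indices):
    """Map the returned indices back to texts and paste them into a normal prompt."""
    context = "\n".join(f"- {docs[i]}" for i in indices)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

docs = [
    "faiss is a library for similarity search over dense vectors",
    "LangChain wraps common LLM workflows",
    "An inverted index maps each token to the documents containing it",
]
doc_vectors = [embed(d) for d in docs]

query = "what is an inverted index"
top = search(query, doc_vectors, k=2)
prompt = build_prompt(query, docs, top)  # send this string to whatever LLM you use
```

Swapping in a real embedding model and a faiss index changes `embed` and `search` but leaves the index-to-text lookup and prompt assembly the same, which is where most of the prompt engineering happens.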

The codifier for the RAG workflow is LangChain, but its demo is substantially more complex and harder to use than even a homegrown implementation: https://minimaxir.com/2023/07/langchain-problem/

1 comment

marcyb5st 970 days ago

Also, if what you look up has no semantic meaning, like part numbers, you might be better off with an inverted index in addition to ANN lookups, especially if the embedding model was trained on a dataset that is not similar to what you use it for. That's a common situation right now with embedding models based on LLMs.
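A rough sketch of what combining the two lookups could look like, again in plain Python. The names here are all hypothetical: `ann_search` is any callable that returns document ids from your vector store, `fake_ann` is a hard-coded stand-in for it, and putting exact matches ahead of ANN hits is just one simple merge policy, not an established best practice.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase and keep alphanumeric runs, preserving hyphens inside part numbers."""
    return re.findall(r"[a-z0-9\-]+", text.lower())

def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def exact_lookup(index, query):
    """Return ids of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    return set.intersection(*(index.get(t, set()) for t in tokens))

def hybrid_search(query, index, ann_search, k=3):
    """Exact inverted-index matches first, then ANN results not already included."""
    merged = sorted(exact_lookup(index, query))
    for doc_id in ann_search(query, k):
        if doc_id not in merged:
            merged.append(doc_id)
    return merged[:k]

docs = [
    "Bracket part AX-4417-B, steel",
    "Bracket part AX-4471-B, aluminum",
    "Mounting screws, pack of 50",
]
index = build_inverted_index(docs)

def fake_ann(query, k):
    """Hypothetical ANN result: the embedding confuses the two similar part numbers."""
    return [1, 0, 2][:k]

results = hybrid_search("AX-4417-B", index, fake_ann, k=2)
```

Here the exact match on the part number puts document 0 first even though the pretend ANN lookup ranked the near-identical part number above it.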
