by jillesvangurp
632 days ago
You could do a lot by pre-calculating things for your embeddings. Why cache when you can pre-calculate? That brings into play a whole lot of things people commonly do as part of ETL.
I come from a traditional search background. It's quite obvious to me that RAG is a bit of a naive strategy if you limit it to just using vector search with some off-the-shelf embedding model. Vector search simply isn't that good. You need additional information retrieval strategies if you want to improve the context you provide to the LLM. That is effectively what they are doing here.
Microsoft published an interesting paper on graph RAG some time ago in which they combine RAG with vector search based on a conceptual graph that they construct from the indexed data using entity extraction. This allows them to pull in contextually relevant information for matching chunks.
I have a hunch that you could probably get quite far without doing any vector search at all. It would be a lot cheaper too. Simply use a traditional search engine and a tuned query. The trick, of course, is query tuning, which may not work that well for general-purpose use cases but could work for more specialized ones.
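To make the "traditional search engine and a tuned query" idea concrete, here is a minimal sketch of keyword retrieval with BM25 scoring in plain Python. It is an illustration only: the `tokenize` and `bm25_rank` helpers, the sample documents, and the `k1`/`b` defaults are assumptions made for this example, not anything from the comment above, and a real search engine would add analyzers, field boosts, and an inverted index on top.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Hypothetical analyzer: lowercase and split on non-alphanumerics.
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score.

    Assumes `docs` is a non-empty list of non-empty strings.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents get less credit per hit.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

docs = [
    "Graph RAG builds a concept graph using entity extraction.",
    "Vector search ranks chunks by embedding similarity.",
    "Sushi restaurants open for dinner this weekend.",
]
ranking = bm25_rank("entity extraction graph", docs)
top_chunk = docs[ranking[0][0]]  # candidate context to hand to the LLM
```

The "query tuning" part would live in how the query string is built and in the `k1` and `b` parameters; documents that share no terms with the query score zero, which is exactly the gap that synonyms, stemming, or a vector index would have to fill.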
|
For question answering, vector/semantic search is clearly a better fit in my mind, and I can see how the contextual models can enable and bolster that. However, because I’ve implemented and used so many keyword-based systems, that just doesn’t seem to be how my brain works.
An example I’m thinking of is finding a sushi restaurant near me with availability this weekend around dinner time. I’d love to be able to search for this as I’ve written it. How I would actually search for it is to search for "sushi restaurant", sort by distance, and hope the application does a proper job of surfacing time filtering.
Conversely, this is mostly how I would build this system, perhaps with a layer to determine user intention to pull out restaurant type, location sorting, and time filtering.
I could see using semantic search for filtering the restaurants down to those related to sushi, but do we then drop back into traditional search for filtering and sorting? Do we utilize function calling to have the LLM parameterize our search query?
As stated, perhaps I’m not thinking of these the right way because of my experiences with existing systems, which I find seem to give me better results when well built.
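A rough sketch of the "intent layer plus traditional filtering and sorting" design described above might look like this. Everything here is hypothetical: `parse_intent` is a keyword-matching stand-in for whatever an LLM function call would return, and the `Restaurant` and `SearchParams` shapes, slot names, and sample data are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class Restaurant:
    name: str
    cuisine: str
    distance_km: float
    open_slots: Set[str] = field(default_factory=set)  # e.g. {"sat-dinner"}

@dataclass
class SearchParams:
    cuisine: Optional[str] = None
    slots: Set[str] = field(default_factory=set)
    sort_by_distance: bool = False

def parse_intent(text):
    """Stand-in for an LLM function call: map free text to SearchParams."""
    words = text.lower()
    params = SearchParams()
    for cuisine in ("sushi", "pizza", "thai"):
        if cuisine in words:
            params.cuisine = cuisine
    if "weekend" in words and "dinner" in words:
        params.slots = {"sat-dinner", "sun-dinner"}
    params.sort_by_distance = "near me" in words
    return params

def search(restaurants, params):
    """Traditional structured search: exact filters first, then a sort."""
    hits = [
        r for r in restaurants
        if (params.cuisine is None or r.cuisine == params.cuisine)
        and (not params.slots or r.open_slots & params.slots)
    ]
    if params.sort_by_distance:
        hits.sort(key=lambda r: r.distance_km)
    return hits

restaurants = [
    Restaurant("Sushi A", "sushi", 2.0, {"sat-dinner"}),
    Restaurant("Sushi B", "sushi", 0.5, {"fri-dinner"}),
    Restaurant("Pizza C", "pizza", 0.1, {"sat-dinner"}),
    Restaurant("Sushi D", "sushi", 1.2, {"sun-dinner"}),
]
query = "a sushi restaurant near me with availability this weekend around dinner time"
results = search(restaurants, parse_intent(query))
names = [r.name for r in results]
```

In this arrangement the language model (or any other intent layer) only fills in parameters; the filtering and sorting stay deterministic, which is where well-built keyword systems tend to be strong.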