by Xenoamorphous
752 days ago
Why is there so much buzz about RAG? Isn’t it basically a traditional search (keyword-based, vector-based, or a combination of both; embeddings have been around for years) where you take the top N results (usually not even full docs, but chunks, due to context length limitations) and pass them to an LLM to regurgitate a response (hopefully without hallucinations), instead of simply listing the results right away? I think some implementations also ask the LLM to rewrite the user query to “capture the user intent”. What am I missing here? What makes it so useful?
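For concreteness, the pipeline described above can be sketched roughly as follows. This is only a toy illustration, not how any particular RAG product works: the scoring is a crude keyword count standing in for a real keyword or embedding search, the names (`chunk`, `score`, `retrieve`, `build_prompt`) and the sample documents are made up for the example, and the final LLM call is left out, so the sketch stops at the prompt that would be sent.

```python
import re
from collections import Counter


def chunk(text, size=40):
    """Split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, passage):
    """Crude keyword score: total occurrences of the query's terms in the passage."""
    counts = Counter(tokenize(passage))
    return sum(counts[term] for term in set(tokenize(query)))


def retrieve(query, docs, top_n=3):
    """Return up to top_n (score, doc_name, chunk) triples across all documents."""
    scored = [
        (score(query, piece), name, piece)
        for name, text in docs.items()
        for piece in chunk(text)
    ]
    matches = [item for item in scored if item[0] > 0]
    return sorted(matches, key=lambda item: item[0], reverse=True)[:top_n]


def build_prompt(query, hits):
    """Assemble the text that would be handed to an LLM (the call itself is omitted)."""
    context = "\n".join(f"[{name}] {text}" for _, name, text in hits)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical sample documents, invented for this sketch.
docs = {
    "policy_a.txt": "Employees must file expense reports within thirty days.",
    "policy_b.txt": "The capital reserve requirement for regional banks is reviewed quarterly.",
}
query = "capital reserve requirement"
hits = retrieve(query, docs)
prompt = build_prompt(query, hits)
print(prompt)
```

The only step that differs from a plain search is the last one: instead of listing `hits`, they are packed into a prompt for the model to answer from.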
One example is in finance: you have a lot of 45-page PDFs lying around and you're pretty sure one of them has the Reg or the info you need. You aren't sure which, so you open them one by one, search for a word, jump through a bunch of the results, and decide it's not this PDF. You do that until you find the "one". There are a nontrivial number of executive-level jobs that pretty much do this for half of their work week.
RAG purports to let you search one time.