	by typest 1025 days ago
	It seems to me that RAG is really search, and search is generally a hard problem without an easy one-size-fits-all solution. E.g., as people push retrieval further and further in the context of LLM generation, they're going to go further down the rabbit hole of how to build a good search system. Is everyone currently reinventing search from first principles?

5 comments

zby 1024 days ago

I am convinced that we should teach LLMs to use search as a tool instead of creating a special kind of search that is useful for LLMs. We already have a lot of search systems, and LLMs can in theory use all kinds of text interfaces; the only problem is the limited context that LLMs can consume. But that is quite orthogonal to what kind of index we use for the search. In fact, for humans it is also useful that search returns limited chunks - we already have that with the 'snippets' that, for example, Google shows - we just need to tweak it a bit so that there are maybe two kinds of snippets: shorter ones, as they are now, and longer ones.
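To make the two-kinds-of-snippets idea concrete, here is a minimal sketch. It assumes a snippet is just a character window around the first keyword match; the function names and the two widths (40 and 200) are made up for illustration and are not taken from any real search engine.

```python
def make_snippet(text: str, term: str, width: int) -> str:
    """Return up to `width` characters of context on each side of the first match."""
    pos = text.lower().find(term.lower())
    if pos == -1:
        # No match: fall back to the beginning of the document.
        return text[: 2 * width].strip()
    start = max(0, pos - width)
    end = min(len(text), pos + len(term) + width)
    # Mark truncation so the reader (human or LLM) knows text was cut.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end].strip() + suffix


def short_and_long(text: str, term: str) -> dict[str, str]:
    # Widths are arbitrary illustrative values (an assumption).
    return {
        "short": make_snippet(text, term, width=40),
        "long": make_snippet(text, term, width=200),
    }
```

An LLM with a small context budget could read the short snippets first and request the long one only for the hits that look promising.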

You can use LLMs to do semantic search on top of keyword search - by telling the LLM to come up with a good search query that includes all the synonyms. But if vector search over embeddings really gives better results than keyword search - then we should start using it in all the other search tools used by humans.
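A rough sketch of that query-expansion step, under some assumptions: `synonyms_for` stands in for the LLM call (here it is a hard-coded lookup table, not a real model), and the `AND`/`OR` syntax is a generic boolean form that a given keyword engine may or may not accept as written.

```python
from typing import Callable


def expand_query(terms: list[str], synonyms_for: Callable[[str], list[str]]) -> str:
    """Build a boolean keyword query where each term is OR-ed with its synonyms."""
    groups = []
    for term in terms:
        # Keep the original term first, then add synonyms, skipping duplicates.
        variants = [term]
        for candidate in synonyms_for(term):
            if candidate.lower() not in (v.lower() for v in variants):
                variants.append(candidate)
        # Quote multi-word variants so they are treated as phrases.
        quoted = [f'"{v}"' if " " in v else v for v in variants]
        if len(quoted) > 1:
            groups.append("(" + " OR ".join(quoted) + ")")
        else:
            groups.append(quoted[0])
    return " AND ".join(groups)


def fake_llm_synonyms(term: str) -> list[str]:
    # Hypothetical stand-in for asking an LLM for synonyms.
    table = {"car": ["automobile", "vehicle"], "repair": ["fix", "Repair"]}
    return table.get(term, [])


query = expand_query(["car", "repair", "manual"], fake_llm_synonyms)
print(query)  # (car OR automobile OR vehicle) AND (repair OR fix) AND manual
```

The existing keyword index stays untouched; only the query changes.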

LLMs are the more general tool - so adjusting them to the more restricted search technology should be easier and quicker than doing it the other way around.

By the way - this prompted me to create my Opinionated RAG wiki: https://github.com/zby/answerbot/wiki

isaacfung 1025 days ago

Depends on what you mean by search. Do you consider all Question Answering as search?

Some questions require multi-hop reasoning or have to be decomposed into simpler subproblems. When you google a question, often the answer is not trivially included in the retrieved text and you have to process it (filter irrelevant information, resolve conflicting information, extrapolate to cases not covered, align the same entities referred to by two different names, etc.), formulate an answer to the original question, and maybe even predict the user's intent based on their history to personalize the result or deliver it in the format they like (markdown, json, csv, etc.).
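As a toy illustration of multi-hop lookup with entity alignment, the sketch below chains two lookups over a hand-written fact table. Everything in it is made up (the entities, the `FACTS` and `ALIASES` tables, the function names); a real system would have to extract these facts from retrieved text, which is the hard part.

```python
from typing import Optional

# Made-up facts: (entity, relation) -> value.
FACTS = {
    ("Example Novel", "author"): "A. Writer",
    ("A. Writer", "birthplace"): "Exampleville",
}

# Made-up aliases: two names that refer to the same entity.
ALIASES = {"The Example Novel": "Example Novel"}


def lookup(entity: str, relation: str) -> Optional[str]:
    # Align alternative names to one canonical entity before searching.
    canonical = ALIASES.get(entity, entity)
    return FACTS.get((canonical, relation))


def answer_multi_hop(start: str, relations: list[str]) -> Optional[str]:
    """Follow a chain of relations, feeding each answer into the next lookup."""
    current: Optional[str] = start
    for relation in relations:
        current = lookup(current, relation)
        if current is None:
            return None  # A hop failed; no single retrieved fact covers it.
    return current


# "Where was the author of The Example Novel born?" decomposed into two hops.
print(answer_multi_hop("The Example Novel", ["author", "birthplace"]))  # Exampleville
```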

Researchers have developed many different techniques to solve the related problems. But as LLMs are getting hyped, many people try to tell you LLM+vector store is all you need.

fkyoureadthedoc 1024 days ago

We're using a product from our existing enterprise search vendor, which they pitch as NLP search. Not convinced it's better than the one we already had, considering we have to use an intermediate step of having the LLM turn the user's junk input into a keyword search query, but it's definitely more expensive...

mrfox321 1025 days ago

Your intuition on search being implemented is correct.

It's still TBD on whether these new generations of language models will democratize search on bespoke corpuses.

There's going to be a lot of arbitrary alchemy and tribal knowledge...

ddematheu 1025 days ago

To some degree. The amount of data that will be brought into search solutions will be enormous, so it seems like a good time to try to reimagine what that process might look like.

antupis 1025 days ago

Also, this is search for LLMs, not for humans, so the optimal solution will be different. Even among models, it is not hard to imagine that Mistral 7B will need different results than GPT-4, which is rumored to have about 1.76 trillion parameters.

zby 1024 days ago

I think this is premature optimisation. LLMs are the general tool here - in principle we should try first to adjust LLMs to search instead of doing it the other way around.

But really I think that LLMs should use search as just one of their tools - just like humans do. I would call it Tool Augmented Generation. They should also be able to reason through many hops. A good system would answer the question _What is the 10th Fibonacci number?_ by looking up the definition in Wikipedia, writing code for computing the sequence, testing and debugging it, and executing it to compute the 10th number.
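For the Fibonacci example, the code such a system ends up writing and running could be as small as this sketch. It assumes the convention F(1) = F(2) = 1; the name `fibonacci` and the check against the first few terms are illustrative stand-ins for the "testing and debugging" step.

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number, assuming F(1) = F(2) = 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    a, b = 1, 1
    # Each iteration advances the pair (F(k), F(k+1)) by one step.
    for _ in range(n - 2):
        a, b = b, a + b
    return b if n > 1 else a


# "Testing" step: compare against the first terms from the looked-up definition.
assert [fibonacci(i) for i in range(1, 8)] == [1, 1, 2, 3, 5, 8, 13]

# "Executing" step: compute the answer to the original question.
print(fibonacci(10))  # 55
```

With the other common convention, F(0) = 0 and F(1) = 1, the 10th number is also 55, so the answer does not depend on which definition the lookup returns.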
