Hacker News
by softwaredoug 888 days ago
When I started working in search 10+ years ago, people would build a beautiful UI, and then, only on shipping, realize the search results were trash + irrelevant. They imagined a search system like Elasticsearch was basically Google. When in reality, Elasticsearch is just a bit of infrastructure. A framework, not a solution.

There's a similar thing happening with RAG, where people think building the chat interaction is the hard part. The hard part is extracting + searching to get relevant context. A lot of founders I talk to suddenly realize this at the last minute, right before shipping, similar to search back in the day. It's harder than just throwing chunks in a vector DB. It potentially involves a lot of different backend data sources, and is in many ways harder than a standard search relevance problem (which is itself hard enough).

8 comments

Yep, we're doing RAG-ish search and ranking across many context types and modalities. You definitely can't just use a vector DB and do some chunking/search. There's a wide variety of search-like ranking, clustering, etc., plus domain-specific work for relevance, and it's very hard to measure and prove improvements.

It's going to just evolve into recreating the various search and ranking processes of old just on top of a bit more semantic understanding with some smarter NLG layered in :). It won't be just LLMs, we'll have intent classification, named entity recognition, a personalization layer, reranking, all that fun stuff again.
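
On the "very hard to measure and prove improvements" point above: a minimal sketch of two standard offline retrieval metrics, recall@k and mean reciprocal rank, assuming you have a small labelled set of queries with known relevant doc ids. The function names and the `runs` format are made up for illustration and don't come from any particular library.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant doc ids that appear in the top k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs, k=3):
    """Average both metrics over runs, a list of (ranked_ids, relevant_ids)
    pairs with one pair per labelled query (hypothetical format)."""
    if not runs:
        return {"recall@k": 0.0, "mrr": 0.0}
    n = len(runs)
    return {
        "recall@k": sum(recall_at_k(r, rel, k) for r, rel in runs) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
    }
```

Running `evaluate` on the same labelled queries before and after a retrieval change gives at least a rough way to show whether the change helped, though it says nothing about the quality of the generated answer itself.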

Especially considering the additional logic that some queries require. Stacked questions, comparative questions, recommendations, questions that assume information found in previous statements / questions.

It becomes a very frustrating experience matching the inherent chaos of a conversation.

Yeah, and to do it well you have to focus on a subset of tasks. Then find a way to gracefully reject anything you can't retrieve well.

In many ways it makes the chat more Siri-like than ChatGPT-like. Which may not be what users actually expect.
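
A rough sketch of what "gracefully reject anything you can't retrieve well" could look like: keep only hits above a similarity threshold, and refuse to answer when nothing clears it. The `Hit` type, the `min_score` cutoff of 0.75, and the assumption that scores are similarities in [0, 1] are all made up for illustration; a real cutoff would need tuning against labelled queries.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    score: float  # assumed: similarity in [0, 1], higher is better


def select_context(hits, min_score=0.75, top_k=3):
    """Return up to top_k hits scoring at least min_score, best first."""
    good = sorted(
        (h for h in hits if h.score >= min_score),
        key=lambda h: h.score,
        reverse=True,
    )
    return good[:top_k]


def answer_or_reject(hits, min_score=0.75, top_k=3):
    """Return ('answer', doc_ids), or ('reject', []) if no hit is good enough."""
    context = select_context(hits, min_score, top_k)
    if not context:
        # Nothing retrieved well: decline instead of passing weak context on.
        return ("reject", [])
    return ("answer", [h.doc_id for h in context])
```

For example, `answer_or_reject([Hit("a", 0.9), Hit("b", 0.4)])` keeps only `"a"`, while a list containing just the 0.4 hit is rejected.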

Great observation. I've seen it often in tech, across the board. It's no better than, or maybe one step up from, the 'idea guy' who 'just' needs someone to build his idea. Hand-waving or a complete lack of awareness about the part that actually holds the value (the hard part).
I spent 8 months telling people this before I got laid off while the CEO continues to chase LLM money with no new ideas or even the talent to solve the problem.

They spent so much time on the UI and basically left the actual search to the last minute, and it was a hilarious failure on launch.

Very good points. Have you seen any examples of systems (or projects) that successfully combine multiple backend data sources, including databases, that perform better than the single backend alone? This seems like an important enough question that it ought to have been documented somewhere.
Hmm, RAG is not "the chat interaction", that's GPT or any other "brain" you choose.

Last week I finished building my 3rd RAG stack for legal document retrieval. Almost-vanilla RAG got me 90-95% of the way. Only drawback is cost, still 10x-100x above the ideal price point; but that will only improve in the future.

True. Pure vectorstores seem limited and kind of overrated. Combining many sources of data is challenging but the right thing to do.
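
One common way to combine ranked results from several sources (say a vector index and a keyword index) is reciprocal rank fusion. That's my choice of technique for illustration, not something named in this thread. A minimal sketch, assuming each source returns doc ids best-first and using the conventional smoothing constant k=60:

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of doc ids into one best-first list.

    Each doc scores the sum of 1 / (k + rank) over the lists it appears in,
    so docs ranked well by several sources rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

It only uses ranks, not raw scores, which sidesteps the problem that scores from different backends usually aren't comparable.
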
This is a great comment. Good search is really hard. RAG is much harder. At least with search the user can pick the best result manually or refine their search. With RAG you pass the top K to the LLM and assume they're good results. The assumption is that it's "semantic search" with vectors so it will just work... wrong.