by jumploops 750 days ago
This is true for traditional full-text document search as well.

When most people mention RAG, they’re using a vector store to surface results that are semantically similar to the user’s query (the retrieval part). They then pass these results to an LLM for summary (the generation part).
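To make the two parts concrete, here is a minimal sketch of the retrieval step and the prompt handed to the generation step. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words counter standing in for a real embedding model (so it matches on shared words, not true semantics), `DOCS` is a made-up corpus, and the LLM call itself is left out.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model (assumption): term counts per word.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, docs, k=2):
    # Retrieval part: rank documents by similarity to the query, keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    # Generation part: this string is what would be sent to an LLM.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical corpus, for illustration only.
DOCS = [
    "Vector stores index embeddings for similarity search.",
    "Full-text search uses an inverted index over terms.",
    "Bananas are rich in potassium.",
]

query = "how does an inverted index work"
prompt = build_prompt(query, retrieve(query, DOCS, k=1))
```

Swapping `embed` for a real model changes the quality of the ranking, but the shape of the pipeline stays the same.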

In practice, the problems with RAG are similar to the traditional problems of search: indices, latency, and correctness.

1 comments

* Indices

Doesn't vector search solve a lot of these problems? These AI vector spaces seem like a really easy win here, and they're reasonably lightweight compared to a full LLM.

* Latency

I don't want to call this a solved problem, but it is one that scales horizontally very easily, and a lot of existing tech can take advantage of that.

* Correctness

The LLM tooling doesn't necessarily need to make things worse here, although a poorly designed system definitely could. AI can do a first pass at fact checking, even though I suspect we'll need humans in the loop for a long while.

---

I think that vector spaces at least bring some big advantages for indexing here, such as being able to search for more abstract concepts.

* Indices

> Doesn't vector search solve a lot of these problems? These AI vector spaces seem like a really easy win here, and they're reasonably lightweight compared to a full LLM.

Yes and no. What do you vectorize? The whole document? The whole page? The whole paragraph? How you split your data, and then index into it, is still problem-space dependent.
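As a rough illustration of that choice, the sketch below splits the same text at three granularities. The `chunk` helper, the level names, and the sample `DOC` are all hypothetical; page-level splitting is left out because plain text carries no page boundaries, and the sentence splitter is a naive regex rather than a real tokenizer.

```python
import re

def chunk(document, level):
    # Split a document into the units that would each get their own vector.
    if level == "document":
        return [document.strip()]
    # Paragraphs are assumed to be separated by a blank line.
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    if level == "paragraph":
        return paragraphs
    if level == "sentence":
        sentences = []
        for p in paragraphs:
            # Naive split: break after ., ! or ? followed by whitespace.
            sentences.extend(s for s in re.split(r"(?<=[.!?])\s+", p) if s)
        return sentences
    raise ValueError(f"unknown level: {level}")

# Made-up sample text, for illustration only.
DOC = (
    "RAG retrieves passages. It then asks an LLM to summarize them.\n\n"
    "Chunk size changes what a match means."
)

counts = {level: len(chunk(DOC, level)) for level in ("document", "paragraph", "sentence")}
```

Smaller chunks give more precise matches but less surrounding context per hit; larger chunks do the opposite. Which trade-off is right depends on the data and the queries.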

* Latency

> I don't want to call this a solved problem, but it is one that scales horizontally very easily, and a lot of existing tech can take advantage of that.

Any time you add steps, you increase latency. This is similar to traditional search, where you need to, e.g., fetch relevant data and then score it by some user-specific metric. Every lookup adds latency. The same is true for RAG.
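A small sketch of how this adds up when the steps run one after another. The stage names and the `time.sleep` calls are made-up stand-ins for real embedding, lookup, and scoring work; only the pattern of timing each step is the point.

```python
import time

def run_pipeline(stages, payload):
    # Run stages in order; record how long each one took, in seconds.
    timings = {}
    for name, stage in stages:
        start = time.perf_counter()
        payload = stage(payload)
        timings[name] = time.perf_counter() - start
    return payload, timings

# Hypothetical stages; the sleeps simulate network or compute time.
def embed_query(q):
    time.sleep(0.01)
    return q.lower()

def vector_lookup(q):
    time.sleep(0.02)
    return [f"passage about {q}"]

def rerank_for_user(passages):
    time.sleep(0.01)
    return sorted(passages)

STAGES = [
    ("embed", embed_query),
    ("lookup", vector_lookup),
    ("rerank", rerank_for_user),
]

result, timings = run_pipeline(STAGES, "RAG")
total = sum(timings.values())
```

Scaling horizontally raises throughput, but for a single query the sequential stages still sum, so `total` grows with every step added.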

* Correctness

> The LLM tooling doesn't necessarily need to make things worse here, although a poorly designed system definitely could. AI can do a first pass at fact checking, even though I suspect we'll need humans in the loop for a long while.

Again, this comes back to how you index your data and what results are returned, similar to traditional search. This is problem-space dependent. Plus, we haven't solved LLM hallucinations -- there are strategies to mitigate them, but no clear-cut solution.