Hacker News
by ru552 752 days ago
> What makes it so useful?

One example is in finance: you have a lot of 45-page PDFs lying around, and you're pretty sure one of them has the reg or info you need. You aren't sure which, so you open them one by one, search for a word, jump through a bunch of the results, and decide it's not this PDF. You do that until you find the "one". There are a non-trivial number of executive-level jobs that pretty much do this for half of their work week.

RAG purports to let you search one time.

3 comments

This is true for traditional full-text document search as well.

When most people mention RAG, they’re using a vector store to surface results that are semantically similar to the user’s query (the retrieval part). They then pass these results to an LLM for summary (the generation part).
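As a rough illustration of the two parts described above, here is a minimal sketch of retrieval followed by prompt assembly. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a learned embedding model (so it matches on shared words, not true semantics), the in-memory list stands in for a vector store, and `build_prompt` only prepares the text that would be handed to an LLM; no model is called.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. A real system would use a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Retrieval step: rank chunks by similarity to the query, keep the top k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, retrieved):
    """Generation step (input only): the text that would be sent to an LLM."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical chunks, as if extracted from a pile of PDFs.
chunks = [
    "Quarterly revenue grew in the retail segment.",
    "The regulation covering mergers and acquisitions requires a filing before closing.",
    "Office relocation schedule for the finance team.",
]
query = "Which regulation covers mergers and acquisitions?"
top = retrieve(query, chunks, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

With a real embedding model, the ranking would also catch paraphrases that share no words with the query, which this toy version cannot do.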

In practice, the problems with RAG are similar to the traditional problems of search: indices, latency, and correctness.

* Indices

Doesn't vector search solve a lot of these problems? These AI vector spaces seem like a really easy win here, and they're reasonably lightweight compared to a full LLM.

* Latency

I don't want to call this a solved problem, but it is one that scales horizontally very easily and that a lot of existing tech is able to take advantage of easily.

* Correctness

The LLM tooling doesn't necessarily need to make things worse here, although if poorly designed it definitely could. AI can do a first pass at fact checking, even though I suspect we'll need humans in the loop for a long while.

---

I think that vector spaces at least bring some big advantages for indexing here, such as being able to search for more abstract concepts.

* Indices

> Doesn't vector search solve a lot of these problems? These AI vector spaces seem like a really easy win here, and they're reasonably lightweight compared to a full LLM.

Yes and no. What do you vectorize? The whole document? The whole page? The whole paragraph? How you split your data, and then index into it, is still problem-space dependent.
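To make the granularity question concrete, here is a small sketch that splits the same text at document, page, and paragraph level. The conventions are assumptions for illustration only: pages are taken to be separated by a form feed character and paragraphs by a blank line, which may not match what a given PDF extractor produces.

```python
def split_pages(document):
    """Split on form feeds (assumed page separator), dropping empty pages."""
    return [p.strip() for p in document.split("\f") if p.strip()]


def split_paragraphs(document):
    """Split each page on blank lines (assumed paragraph separator)."""
    paragraphs = []
    for page in split_pages(document):
        paragraphs.extend(p.strip() for p in page.split("\n\n") if p.strip())
    return paragraphs


def chunk(document, granularity):
    """Return the units that would each get their own vector."""
    if granularity == "document":
        return [document.strip()]
    if granularity == "page":
        return split_pages(document)
    if granularity == "paragraph":
        return split_paragraphs(document)
    raise ValueError(f"unknown granularity: {granularity!r}")


# Hypothetical two-page document with two paragraphs per page.
doc = "Intro to the filing.\n\nScope of the rule.\fReporting thresholds.\n\nPenalties."
counts = {g: len(chunk(doc, g)) for g in ("document", "page", "paragraph")}
print(counts)  # {'document': 1, 'page': 2, 'paragraph': 4}
```

The same text yields one, two, or four vectors depending on the choice, and each choice changes what a query can match and how much context comes back with a hit.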

* Latency

> I don't want to call this a solved problem, but it is one that scales horizontally very easily and that a lot of existing tech is able to take advantage of easily.

Any time you add steps, you increase latency. This is similar to traditional search, where you might, e.g., fetch the relevant data and then score it by some user-specific metric. Every lookup adds latency. The same is true for RAG.
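A quick way to see the point about added steps is to time each stage of a pipeline separately. The sketch below is illustrative only: the three stage functions are hypothetical stand-ins that sleep for fixed durations rather than doing real work, and the durations are made up.

```python
import time


def timed_pipeline(query, stages):
    """Run stages in order, feeding each output to the next.

    Returns the final value and a dict of per-stage wall-clock seconds.
    """
    timings = {}
    value = query
    for name, fn in stages:
        start = time.perf_counter()
        value = fn(value)
        timings[name] = time.perf_counter() - start
    return value, timings


# Hypothetical stand-ins for real stages; sleep durations are invented.
def embed_query(q):
    time.sleep(0.01)
    return q.lower()


def vector_lookup(q):
    time.sleep(0.02)
    return [f"chunk matching {q!r}"]


def generate(chunks):
    time.sleep(0.03)
    return f"summary of {len(chunks)} chunk(s)"


answer, timings = timed_pipeline(
    "Which reg applies?",
    [("embed", embed_query), ("retrieve", vector_lookup), ("generate", generate)],
)
total = sum(timings.values())
for name, seconds in timings.items():
    print(f"{name:>8}: {seconds * 1000:.1f} ms")
print(f"   total: {total * 1000:.1f} ms")
```

Because the stages run sequentially, end-to-end latency is the sum of the stage latencies, so every extra lookup or re-ranking step shows up directly in the total.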

* Correctness

> The LLM tooling doesn't necessarily need to make things worse here, although if poorly designed it definitely could. AI can do a first pass at fact checking, even though I suspect we'll need humans in the loop for a long while.

Again, this comes back to how you index your data and what results are returned, similar to traditional search. This is problem-space dependent. Plus, we haven't solved LLM hallucinations -- there are strategies to mitigate them, but no clear-cut solution.

Any tips on effectively getting financial data out of PDFs into a RAG system (especially data contained in tables)? And locally, not via a proprietary cloud PDF parsing thingy. That's the current nut I'm trying to crack.

https://github.com/VikParuchuri/marker is solid, but slow and needs GPU(s) to be practical.

You might find my library useful: https://github.com/Filimoa/open-parse

I’m probably missing the point: doesn’t https://pdfgrep.org solve this problem?

What if they don’t remember the regulation code?

“What is the regulation that covers M&A of companies in the pharmaceutical industry?”

It seems much easier to get that response from an LLM than by searching for words with grep.

I built a web version with WASM at https://pdfgrep.com a few years ago, in case it’s helpful to anyone.