by spmurrayzzz 774 days ago
> I mention this because RAG is perfect for these kinds of use cases, where you really can't afford the hallucination - where you need its information to be based on specific cases - specific information.

I think it's worth cautioning here that even with attempted grounding via RAG, the model can still hallucinate. RAG can and does reduce hallucination somewhat, but fundamentally the model is still autoregressively predicting tokens and sampling from a distribution. So it's going to predict incorrectly some of the time, even if it's less likely to do so.
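To make the sampling point concrete, here is a minimal sketch in plain Python. The logits are made-up numbers, not output from any real model, and `sample_error_rate` is a hypothetical helper. The only assumption is that grounding shifts probability mass toward the correct token without driving the alternatives to zero.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_error_rate(logits, correct_index, trials=10000, seed=0):
    """Sample a token `trials` times and return the fraction of wrong picks."""
    rng = random.Random(seed)
    probs = softmax(logits)
    indices = range(len(probs))
    wrong = 0
    for _ in range(trials):
        choice = rng.choices(indices, weights=probs, k=1)[0]
        if choice != correct_index:
            wrong += 1
    return wrong / trials


# Hypothetical logits for three candidate tokens; index 0 is the correct one.
ungrounded_logits = [2.0, 1.5, 1.0]  # no retrieved context
grounded_logits = [5.0, 1.5, 1.0]    # assumed effect of retrieved context

ungrounded_rate = sample_error_rate(ungrounded_logits, correct_index=0)
grounded_rate = sample_error_rate(grounded_logits, correct_index=0)

print(f"error rate without grounding: {ungrounded_rate:.3f}")
print(f"error rate with grounding:    {grounded_rate:.3f}")
```

Under these assumed numbers the grounded error rate is much lower but still not zero, which is the whole caveat: a smaller tail is still a tail.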

I think it's certainly a worthwhile engineering effort to address the myriad issues involved, and I'd never say this is an impossible task, but for now I continue to urge caution when I see the happy path socialized to the degree it is.


Sure, everything has some margin of error, even conventional tech. I could say "at the end of the day it's just SQL queries, so there's some chance of a mistake" or "at the end of the day a human could read it wrong". No tech is completely foolproof, even writing.

RAG/LLMs are a clear improvement over the baseline though. People will unfairly judge LLMs even when they provide more accuracy and better results, even if they save lives, simply because they can't meet the impossible demands of neo-luddites. People want to cast it as "an evil force", and I blame OpenAI and the news for this narrative.

This take reminds me of some of the (weaker) arguments against blockchain when it was popular. For some, because a blockchain could not be guaranteed to prevent every conceivable exploit and hack, it was therefore useless hype. They ignored the decentralization utility, threw out the peer-to-peer ledger concept, threw out the consensus protocols, etc. How could something like git have been invented in such a political, anti-tech environment? Git would have been shut down by the masses, and otherwise smart people would have labeled it a scary evil force. Thankfully peer-to-peer was very cool back then, and so git is useful tech that we get to use.

I'm seeing the same thing with LLMs, all people are focused on is: Prove to me AI isn't evil - people can see a valuable use case in a demo but it doesn't matter, I think like blockchain some are beyond convincing. They just aren't into technology anymore.

> I'm seeing the same thing with LLMs, all people are focused on is: Prove to me AI isn't evil - people can see a valuable use case in a demo but it doesn't matter, I think like blockchain some are beyond convincing. They just aren't into technology anymore.

You might be shadowboxing a bit with a point I didn't make (or maybe your comment was intentionally orthogonal to what I raised, not sure). I work with this technology every day in a professional, commercial context. Not just LLMs, but many other ML/DL implementations that run the gamut of downstream tasks, from anomaly detection to time series forecasting. I think it's useful enough to be building real things with it to improve the way my business functions. In the course of building those inference and training stacks from scratch, I've also seen how spectacularly they can fail and how often.

I don't think AI is evil. I think autoregressive token prediction is stochastic enough to be considered unreliable in its current state. That doesn't mean I am going to stop building things with it; it just means that I've seen these systems implode regularly enough, even with grounding via RAG, that I tend to push caution first and foremost (as I did in my original message).

Sorry, straw-manning in internet comments is so bad that I shouldn't have even gone there with the crypto analogy. I couldn't help it because I see parallels with regard to the general reception.

I agree with what you said here 100%.

Working with it daily I can't help but be slightly more optimistic though. I see LLMs as being a major component of future apps. You have servers, databases, game engines, and now there's this generative token thing you can use for... quite a lot - without an internet connection no less. It will only get better.

IME, the fact that RAG isolates specific document data in a db and is based on regular database querying solves the accuracy problem you get with regular LLMs, but yeah, ofc there could still be some errors, like with anything.
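For anyone unfamiliar with the shape of this, here is a rough sketch of the retrieval half using only Python's built-in `sqlite3`. The table layout, the `LIKE` keyword match, and the helper names `build_store`, `retrieve`, and `build_prompt` are all illustrative assumptions. Real setups often use vector similarity search instead, and the sample documents are invented.

```python
import sqlite3


def build_store(docs):
    """Load documents into an in-memory SQLite table (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT INTO docs (body) VALUES (?)", [(d,) for d in docs])
    conn.commit()
    return conn


def retrieve(conn, keyword, limit=2):
    """Plain database query: a keyword match stands in for real retrieval."""
    rows = conn.execute(
        "SELECT body FROM docs WHERE body LIKE ? ORDER BY id LIMIT ?",
        (f"%{keyword}%", limit),
    ).fetchall()
    return [row[0] for row in rows]


def build_prompt(question, passages):
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Invented example documents.
conn = build_store([
    "Policy 12: refunds are issued within 30 days of purchase.",
    "Policy 19: shipping is free on orders over 50 dollars.",
])
passages = retrieve(conn, "refunds")
prompt = build_prompt("How long do refunds take?", passages)
print(prompt)
```

The query itself is deterministic: the same keyword returns the same rows every time. The prompt still goes to a model that samples its answer, so that is where the remaining errors come from.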