Hacker News
by Gormo 565 days ago
Of course the LLM is bullshitting the user. That's precisely its purpose: LLMs are tools that generate comprehensible-sounding language based on probability models that describe which words/tokens tend to be found in proximity to each other. An LLM doesn't actually know anything by reference to verifiable, external facts.

Sure, LLMs can be used as fancy search engines that index documents and then answer questions by referring to them, but even there, the probabilistic nature of the underlying model can still result in mistakes.

3 comments

Models do know things. Facts are encoded in their parameters. Look at some of the interpretability research to see that. They aren't just Markov chains.
Nope. They don't know any specific facts. The training data produces a probability matrix that reflects what words are likely to be found in relation to other words, allowing it to generate novel combinations of words that are coherent and understandable. But there is no mechanism involved for determining whether those novel expressions are actually factual representations of reality.
Again, read the papers. They absolutely do know facts, and that can be seen in the activations. Your description is oversimplified. It's easy to get models to emit statistically improbable but correct sequences of words. They are not just looking at what words are near each other; that alone doesn't lead to the kind of output LLMs are capable of.
Exactly. People forget that we did make systems that were just Markov chains long before LLMs, like the famous Usenet poster "Mark V. Shaney" (created by Rob Pike of Plan 9 and Go fame) that was trained on Usenet posts in the 1980s. You didn't need deep learning or any sort of neural nets for that. It could come up with sentences that sometimes made some sort of sense, but that was it. The oversimplified way LLMs are sometimes explained makes it sound like they are no different from Mark V. Shaney, but they obviously are.

https://en.wikipedia.org/wiki/Mark_V._Shaney
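
For anyone who hasn't seen one, this is roughly what a word-level Markov chain generator of that kind boils down to. It's a minimal sketch, not the original Mark V. Shaney program: the function names (build_chain, generate), the toy corpus, and the choice of a two-word prefix are all my own assumptions for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each `order`-word prefix to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        # Duplicates are kept on purpose: frequent followers get sampled more often.
        chain[prefix].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Random-walk the chain: pick a start prefix, then keep sampling followers."""
    rng = random.Random(seed)
    prefix = rng.choice(sorted(chain))
    out = list(prefix)
    for _ in range(length):
        followers = chain.get(prefix)
        if not followers:
            # Dead end: this prefix only occurred at the very end of the corpus.
            break
        out.append(rng.choice(followers))
        prefix = tuple(out[-len(prefix):])
    return " ".join(out)


# Hypothetical toy corpus; a real run would use a large pile of posts.
corpus = (
    "the cat sat on the mat and the dog sat on the rug "
    "and the cat saw the dog on the mat"
)
chain = build_chain(corpus, order=2)
print(generate(chain, length=12, seed=0))
```

The whole "model" is a lookup table from the last two words to the words that followed them in the corpus. It can only ever emit word triples it has literally seen, which is why the output is locally plausible and globally aimless.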

Yeah I get that, but at the same time we have AI hype men talking out of both sides of their mouth:

> This model is revolutionary, it knows everything, can answer anything with perfect accuracy!

“It’s fed me bullshit numerous times”

> OF COURSE it’s bullshitting you, don’t you know how LLMs work?

Like how am I supposed to take any of this tech seriously when the LLM is always answering questions as if it had the utmost confidence in what it is spitting out?

Hilariously, that really does basically define “bullshitting”.
Bullshit in the Frankfurtian sense.

There is a recent paper that explains it: https://link.springer.com/article/10.1007/s10676-024-09775-5