by quickthrower2 1068 days ago
This is assuming LLMs are intelligent and can think, "hey, I am dumb, I'll look that up".

What they are literally doing is guessing the next word, one word at a time, but doing it really, really well, and producing statistically average output over a very large number of inputs.

There is no distinction between understanding "the" vs "a" and telling me 1+1=3. It is all token generation.
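
To make the "guessing the next word, a word at a time" picture concrete, here is a rough sketch using a bigram count table. This is only a toy: real LLMs are neural networks conditioned on long contexts, not lookup tables, and the corpus and the names `train_bigram` and `generate` are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for every word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_words=5):
    """Greedily append the most frequent follower, one word at a time."""
    out = [start]
    for _ in range(max_words - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # no word was ever seen after this one
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

# Made-up corpus: the "right" answer wins only because it is more frequent.
corpus = [
    "one plus one is two",
    "one plus one is two",
    "one plus one is three",
]
counts = train_bigram(corpus)
print(generate(counts, "is", max_words=2))
```

In this toy, "two" follows "is" only because it appears more often in the training text. Nothing in the table checks arithmetic, which is the distinction the comment above is drawing.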

3 comments

What they are doing depends entirely on what decoding algorithm you use. An LLM is mostly a token probability function, but it's not just that - a transformer model is capable of learning anything. Tokens are the interface, not necessarily the implementation.
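
A minimal sketch of the decoding point, assuming a made-up vocabulary and made-up logits (`TOY_LOGITS`, `next_token_probs`, `greedy_decode` and `sample_decode` are hypothetical names, not any library's API). The same token probability function gives different behavior depending on how you pick from it:

```python
import math
import random

# Hypothetical logits for the context "1+1="; the numbers are invented.
TOY_LOGITS = {"2": 2.0, "3": 1.0, "11": 0.5}

def next_token_probs(logits, temperature=1.0):
    """Softmax over logits, scaled by temperature."""
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def greedy_decode(logits):
    """Always pick the single most probable token."""
    probs = next_token_probs(logits)
    return max(probs, key=probs.get)

def sample_decode(logits, temperature, rng):
    """Draw a token at random in proportion to its probability."""
    probs = next_token_probs(logits, temperature)
    tokens = list(probs)
    weights = [probs[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)
print(greedy_decode(TOY_LOGITS))
print([sample_decode(TOY_LOGITS, 1.0, rng) for _ in range(10)])
```

Greedy decoding always returns the top token, while sampling will sometimes emit the lower-probability ones, and raising the temperature makes that more likely. The underlying function is identical in both cases.
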
A transformer can only memorize; it doesn't learn to do things.

As far as that concerns us here: LLMs will never learn to fact-check anything. They'll blindly regurgitate the facts they have been "taught", but never consider or evaluate "the paper cited for this fact on wikipedia is a bunch of bullshit".

Any attempt to use them to produce "facts" is ultimately just folly, in the same way Google's attempt to do so with its search engine index is.

> [LLMs] never consider or evaluate "the paper cited for this fact on wikipedia is a bunch of bullshit".

Nor do people, though! This is setting the bar way too high.

The whole point of having edited reference sources like "encyclopedias" is so that we can rely on the expertise of the editors in lieu of having to develop the expertise ourselves[1].

No, an LLM that simply knows a priori (via prompt hacking) which sources are trustworthy would be absolutely comparable to the way an educated-but-non-expert human approaches sources.

[1] Which is a chicken and egg problem anyway. Everyone starts with edited reference sources as tutorial material. Quite frankly everyone starts learning with wikipedia.

> This is setting the bar way too high.

No. If these things are claimed to be sources of truth, then the bar needs to be that high.

It is precisely because people don't fact-check that the bar has to be so high.

> If these things are claimed to be sources of truth

That's a strawman, though. No service, nor human, "claims to be a source of truth" in the kind of profound sense you seem to be using. It stops, everywhere, at "Wikipedia (or whatever) said it and I trust it".

The only way to get access to deeper expertise is to (1) BE an expert and (2) engage in a discussion with another.

No, a transformer is a universal function approximator and is capable of learning to do anything to some degree of accuracy.

GPT doesn't do math correctly but it also doesn't just memorize it.

It seems to me that LLMs are basically an algorithmic encoding of Occam's Razor. The issue seems to be that what is most probable does not always correspond to what happens, or what makes the most sense to an embodied person.

What is most probable is not always what is most correct or most accurate.

Isn't this a serious simplification? Tokens are just the medium