by qazpot 1180 days ago
I think you are making a major assumption that LLMs are deterministic; in fact, they are exactly the opposite. They are probabilistic systems.

- They do not transpile Rough English to Deterministic English. In fact, they do not do any transpiling at all.

- LLMs learn the probabilities of words in the dataset for all contexts from that dataset (a context is an ordered sequence of words). This is called training.

- Once training is done, LLMs can generate text given a prompt and the probabilities that they have learned. The analogy of LLMs being auto-complete on steroids is very apt.

- Whether the text generated by LLMs is factual or not is purely coincidental.
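
To make the training and generation bullets above concrete, here is a toy sketch in Python. It is not taken from NanoGPT: it is a bigram model where the "context" is just the previous word, and the names `train_bigram` and `generate` are made up for illustration. Real LLMs use neural networks over much longer contexts, so treat this only as the auto-complete idea in miniature.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """'Training': count which word follows each context word, then normalize."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    # Turn raw counts into conditional probabilities P(next word | previous word).
    return {
        prev: {w: c / sum(followers.values()) for w, c in followers.items()}
        for prev, followers in counts.items()
    }

def generate(model, prompt, n_words, rng):
    """'Generation': repeatedly sample the next word from the learned probabilities."""
    out = prompt.split()
    for _ in range(n_words):
        dist = model.get(out[-1])
        if not dist:
            break  # this context never had a follower in the training text
        candidates = list(dist)
        weights = [dist[w] for w in candidates]
        out.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigram(corpus)
print(model["the"])                                   # learned distribution after "the"
print(generate(model, "the", 5, random.Random(0)))    # seeded, so repeatable
```

Nothing in the sampling loop checks whether the output is true; it only follows the word statistics of the corpus.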

I would highly recommend watching and working through the code (if possible) of NanoGPT by Andrej Karpathy. https://www.youtube.com/watch?v=kCc8FmEb1nY

A lot of things which seem like magic about LLMs will get demystified.

Now one can argue that LLMs are showing human-like reasoning/intelligence/sentience as an emergent behavior. This is hard to argue against because all these terms are extremely hard to define.

IMO, the only emergent behavior that LLMs are showing is that the output they generate looks like it might have been generated by a human, which should not be surprising given that LLMs like ChatGPT have been trained on a large amount of human-written text available on the internet.


On your first point you couldn't be more wrong. LLMs are deterministic. They run on a deterministic machine, and that forces them to be so. Also, probabilistic does not mean the absence of determinism. For example, for every non-deterministic/probabilistic finite automaton there is an equivalent deterministic one.
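
For the automaton example, the standard way to get a deterministic machine from a nondeterministic one is the subset construction. Below is a minimal Python sketch of it, assuming an NFA without epsilon moves; the function names and the example automaton are made up for illustration. It only covers the nondeterministic case and says nothing about probabilistic automata.

```python
from itertools import chain

def nfa_to_dfa(nfa, start, accepting, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states.

    `nfa` maps (state, symbol) -> set of next states (no epsilon moves).
    """
    start_set = frozenset([start])
    dfa = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        current = todo.pop()
        for sym in alphabet:
            # Union of everything reachable from any state in `current` on `sym`.
            nxt = frozenset(chain.from_iterable(nfa.get((s, sym), ()) for s in current))
            dfa[(current, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & accepting}
    return dfa, start_set, dfa_accepting

def dfa_accepts(dfa, start, accepting, word):
    """Run the DFA: exactly one next state per symbol, so no guessing is involved."""
    state = start
    for sym in word:
        state = dfa[(state, sym)]
    return state in accepting

# Example NFA: binary strings whose second-to-last symbol is '1'.
nfa = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q2"},
    ("q1", "1"): {"q2"},
}
dfa, dfa_start, dfa_accepting = nfa_to_dfa(nfa, "q0", {"q2"}, "01")
print(dfa_accepts(dfa, dfa_start, dfa_accepting, "0110"))
print(dfa_accepts(dfa, dfa_start, dfa_accepting, "0101"))
```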
You are arguing metaphysics but this conversation is about computer science.

In computer science, pseudorandomness is considered non-deterministic. Determinism is a function of the inputs, not the machine state.

https://en.m.wikipedia.org/wiki/Deterministic_algorithm#:~:t....

LLMs are fully deterministic in that sense: same input, same outputs.

Because full determinism is not always desirable, the researchers have implemented an explicit "temperature" parameter that you can use to inject randomness into the outputs. If you set that to 0.0 you will always receive the same output for the same input and model version.
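
Here is a rough Python sketch of what a temperature parameter typically does at the sampling step. It is an illustration, not any vendor's actual implementation: `sample_token` and the example scores are made up, and treating temperature 0.0 as "pick the highest score" is an assumption about the convention. It also says nothing about floating-point or batching effects in a real serving stack.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores (logits)."""
    if temperature == 0.0:
        # Greedy decoding: always the highest-scoring token, no randomness used.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax over temperature-scaled scores (max subtracted for numerical stability).
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]   # hypothetical scores for three candidate tokens
rng = random.Random()      # unseeded on purpose
print([sample_token(logits, 0.0, rng) for _ in range(5)])  # same index every time
print([sample_token(logits, 1.0, rng) for _ in range(5)])  # may vary between runs
```

Higher temperatures flatten the distribution, so lower-scoring tokens get picked more often; lower temperatures concentrate it on the top token.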

LLMs can be implemented to be formally deterministic but if you ask them to solve a specific problem instance you have not seen before, you cannot generally guarantee they will do so reliably. So you're correct in a pedantic sense but I think GP's perspective is more useful if you are problem solving.
It's not pedantry. The parent commenter is simply stating something incorrect, and it's very misleading at a conceptual level for people who won't think about it too much. And you can guarantee they will do so reliably unless you parameterize the input with some "truly" (yeah, I know) random input.
> - Whether the text generated by LLMs is factual or not is purely coincidental.

No, because the probability of a word on the internet being factual is not coincidental. Factuality compresses the corpus; the truth is generally the simplest explanation for a set of observations. (The collected text of the internet is a set of observations about reality.)

> IMO, the only emergent behavior that LLMs are showing is the output they generate looks like it might have been generated by a human

The whole point of the Turing Test is to stop people from asking "yes, it acts indistinguishable from a human but is it human?" "Generating output that looks human" is in fact the entirety of AGI.

- If factuality were just a matter of simplicity, it wouldn't be so incredibly difficult to determine.

- The corpus of written language is full of ambiguity and contradictory statements.

- A lie makes its way halfway around the world before the truth can even get its pants on.

- What's thought true today will not be thought true tomorrow. This happens sometimes in the direction of veracity. Sometimes, the other way around.

- Factuality and consensus are not the same.

> it wouldn't be so incredibly difficult to determine.

Never said it was easy. :P

> - The corpus of written language is full of ambiguity and contradictory statements.

Right, but the truth is the one set of information that logically cannot be contradictory. That gives it an advantage in terms of compression.

The rest is correct, but just means that the learning algorithm has a harder time discovering truth, not that it's impossible.

> > it wouldn't be so incredibly difficult to determine.

> Never said it was easy.

My point is that this process of determination happens over time and in a non-linear fashion, so the corpus contains tons of noise around any given truth statement.

> > The corpus of written language is full of ambiguity and contradictory statements.

> Right, but the truth is the one set of information that logically cannot be contradictory.

Please refer to Gödel's incompleteness theorems.

The disconcerting part is wondering if our brains act a lot like LLMs, in many ways.

Sometimes, I can almost catch my brain generating the next word from what I’ve said so far, or listening to what someone else is saying.

They probably do in some ways.