| HN Mirror



	by Terr_ 784 days ago
	My rule of thumb is to take every single LLM prompt and just imagine that it's prefixed with: "Computer, focus on generating output that resembles the words people in the past used after they were given the following words..."

3 comments

inopinatus 784 days ago

Correct. You must perceive them as plausibility engines. The unstated hypothesis is that plausibility of output may converge towards correctness of output with increasing scale and sophistication. This hypothesis remains very far from proven.


__loam 784 days ago

I don't think it's that hard to understand what the hell is going on with LLMs under the hood. Ultimately it's a weighted sample of the training data. It has a relationship with reality insofar as one exists within the training data. RLHF makes it easier to believe something crazy is happening because the output is being weighted towards something that's believable to us.


dTal 784 days ago

Depending on what you mean by "weighted sample", that's either trivially true (the network is of course a function of its training data and nothing else) or trivially false (the network generalizes over the training data and has not memorized it). It is not a good intuition pump for why an LLM can hold up one end of a conversation, or follow novel instructions - it is not reading from a script, nor regurgitating chopped up pieces of text like a Markov chain. It is doing something very clever in a way that is not obvious.

>It has a relationship with reality insofar as one exists within the training data

This is true of anything that learns.
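For contrast, here is a minimal sketch of the kind of Markov chain being ruled out above: a bigram model that can only stitch together word pairs it has already seen. The corpus and function names are made up for illustration; nothing here is meant to describe how an LLM works.

```python
import random
from collections import defaultdict

def build_bigram_table(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table

def generate(table, start, length, rng):
    """Walk the table, picking a random observed follower at each step."""
    output = [start]
    for _ in range(length - 1):
        followers = table.get(output[-1])
        if not followers:  # dead end: this word was never followed by anything
            break
        output.append(rng.choice(followers))
    return output

# Hypothetical toy corpus, chosen only for illustration.
corpus = "the man rows the boat and the goat rides the boat across the river"
table = build_bigram_table(corpus)
words = generate(table, "the", 8, random.Random(0))
print(" ".join(words))

# Every adjacent pair in the output already occurs somewhere in the corpus.
seen_pairs = set(zip(corpus.split(), corpus.split()[1:]))
print(all(pair in seen_pairs for pair in zip(words, words[1:])))  # True
```

By construction this model can never emit a word pair that is absent from its corpus, which is the sense in which it "regurgitates chopped up pieces of text".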


dartos 783 days ago

> this is true of anything that learns

Sure, but most things that learn have actual reality as a training set. LLMs have human curated data, which isn’t and can’t be perfectly representative of reality.


__loam 783 days ago

Couldn't have said it better myself.


bamboozled 784 days ago

I also think you get the best results when you think about using them this way; any other way of using them seems to end in disappointment.


jmprspret 784 days ago

That is accurate to what they do. I think others need to imagine this as well. Far too many nontechnical people seem to treat them as some kind of Oracle.


panarky 784 days ago

Your understanding of how LLMs work is overly simplistic and incomplete.

Yes, doing probabilistic next-word prediction plays a role in how LLMs generate text output, but that's not the whole story.

LLMs "understand" (to a degree): They develop complex internal representations of concepts they've been trained on. This isn't just about word association; they develop an understanding of the relationships between objects, actions, and ideas.

They can reason, not just mimic: LLMs can perform logical reasoning, using their internal knowledge base to solve problems or answer questions. This might involve following multi-step instructions, drawing inferences from information provided, or adapting to new prompts in a way that requires a degree of abstract thinking.

Beyond simple probabilities: Yes, LLMs do consider the probability of certain word sequences, but their output is far more sophisticated than just picking the most likely next word. They weigh context, concepts, relationships, nuance, logic, and even the unstated but inferred purpose of the user when generating responses.


richardw 784 days ago

I know that's the messaging, but the real link to reality is very tenuous. This was a great example from the last couple days:

https://twitter.com/colin_fraser/status/1785132544482226679

I just tried a similar question now with ChatGPT4:

"If a man and a goat are on one side of a river, what is the minimum amount of trips required to get the man and goat to the other side in a boat. Assume the boat can hold at most one animal and one human."

ChatGPT: 3 trips

That is very much closer to "trying to predict next word from examples" than "billion-dollar model with internal reasoning".
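For reference, the intended answer to the puzzle as worded is 1 trip: the man and the goat cross together. A minimal breadth-first search over the puzzle's states confirms this. The state encoding and the function name `min_trips` are my own illustration, and the sketch assumes only the man can row.

```python
from collections import deque

def min_trips(start=(0, 0), goal=(1, 1)):
    """Fewest boat crossings to move from start to goal.

    A state is (man_bank, goat_bank), where 0 is the starting bank and
    1 is the far bank. Assumption: the boat only moves with the man in it.
    """
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (man, goat), trips = queue.popleft()
        if (man, goat) == goal:
            return trips
        # The man can always cross alone.
        next_states = [(1 - man, goat)]
        # The goat can ride along only if it is on the same bank as the man.
        if goat == man:
            next_states.append((1 - man, 1 - goat))
        for state in next_states:
            if state not in seen:
                seen.add(state)
                queue.append((state, trips + 1))
    return None  # goal unreachable

print(min_trips())  # 1
```

There is no wolf or cabbage in this version, so nothing forces the back-and-forth trips of the classic puzzle.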


jack_pp 784 days ago

That sounds surprisingly close to how a toddler might reason. The only difference is that the toddler can eventually see the flaw in their reasoning if you press them long enough, while the LLM doesn't have the architecture for learning in real time yet.


richardw 784 days ago

I was repeatedly amazed at how smart my toddler was. You just feel the general intelligence.

She's a bit older now (5) but e.g. a few days ago I was talking about cleaning the whole house. She said "you didn't clean the WHOLE house, look there's something you didn't clean".


xcv123 784 days ago

The LLM does figure it out if you ask further questions in the same chat. Here's GPT-3.5 https://chat.openai.com/share/a8669390-8eb0-46c2-b804-3aafc3...


godelski 784 days ago

If you spoil it with your followup questions... which doesn't help because the point of these is that they're controlled experiments where you do know what the right answer and logic are. You can't test when you don't.


xcv123 784 days ago

It's not spoiling anything. It's just an observation of the limits of current LLMs.

I tried a few chain of thought prompts for the original question and GPT-3.5 was sometimes (randomly) able to find the correct answer on the first attempt for this one

https://chat.openai.com/share/c144ba23-2f78-4cc8-a1c5-ca3106...
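One plausible reason the same prompt gives different answers from run to run is that chat interfaces typically sample from the model's next-token distribution instead of always taking the top choice. The sketch below shows that mechanism in isolation. The candidate tokens and scores are invented for illustration and do not come from any real model.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(tokens, logits):
    """Always return the single highest-scoring token."""
    best = max(range(len(tokens)), key=lambda i: logits[i])
    return tokens[best]

def sample_pick(tokens, logits, temperature, rng):
    """Draw one token at random, weighted by its softmax probability."""
    return rng.choices(tokens, weights=softmax(logits, temperature), k=1)[0]

# Hypothetical candidates for the number of trips, with made-up scores.
tokens = ["3", "1", "7"]
logits = [2.0, 1.5, -1.0]

print(greedy_pick(tokens, logits))  # always "3"

rng = random.Random(0)
counts = {token: 0 for token in tokens}
for _ in range(1000):
    counts[sample_pick(tokens, logits, 1.0, rng)] += 1
print(counts)
```

With these invented scores, greedy decoding never returns "1", while sampling at temperature 1.0 returns it a bit over a third of the time in expectation. That is consistent with getting the right answer "sometimes (randomly)" without the underlying scores changing at all.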


richardw 784 days ago

I tried "are you sure", which often triggers some reasoning, and it was pretty confident. I'm trying not to give it the answer, but run it as if I didn't have any special knowledge. GPT + human > GPT. I mean, we're treating these things like another kind of intelligence, not a hammer.

GPT4: https://chat.openai.com/share/1beb5391-d321-4515-945e-38233f...


sam0x17 784 days ago

Another really oddly effective one is offering it a $500 tip


Jensson 784 days ago

> their output is far more sophisticated than just picking the most likely next word

Picking the most likely next word is an extremely sophisticated algorithm, if you could do that you would almost sound human. Like an LLM!

However, focusing on just picking the most likely next word does cause some issues, such as being biased towards words that were posted instead of those that were too boring to post. So there are still things such an algorithm can't understand, like real-world probabilities, since those are so unevenly represented in what gets posted.


SrslyJosh 784 days ago

That's a lot of anthropomorphizing, but I don't see anything to back up your claims.


retrac 784 days ago

I think it's still more in the realm of philosophy. But I do have an argument that NNs demonstrate abstract, generalized learning: the transfer learning effect.

Neural networks pre-trained on data for a completely different task, learn new tasks much faster. With a GPT-like transformer, you can feed it PCM audio samples encoded as uuencoded text, or paintings encoded in the same way, and it learns how to translate English <-> Russian when later trained on that, much faster than from a completely randomized model. There's something common to those seemingly disparate tasks that is learned. "Abstraction" may be the right word for this.


rdlecler1 783 days ago

Abstraction seems too generous of an interpretation.

A more parsimonious hypothesis is that random networks start out broken, structurally incapable of computation because the structure has parts where information stops flowing or signal gain is so low at certain choke points that its presence is like a random coin flip.

Training the network to compute ANYTHING fixes this flow problem, making subsequent training easier, without introducing any kind of abstraction.


xcv123 784 days ago

Neural Networks have been studied for a long time. We learned this in 1990.

https://news.ycombinator.com/item?id=40230764


godelski 784 days ago

> They can reason, not just mimic

I'm still waiting for this. Here are some recent tweets from people trying variations of the river crossing puzzle. All LLMs seem to fail pretty badly.

LLMs are impressive enough as they are (compression systems with human language interface), you don't need to hype them up to something they're not.

https://twitter.com/jeremyphoward/status/1783712611126964627

https://twitter.com/WaltonStevenj/status/1785145923771011215

https://twitter.com/colin_fraser/status/1785132544482226679


seizethecheese 784 days ago

I feel like this is the midwit meme response and the idiot and genius are just saying “it’s next token probabilistic generation”


xcv123 784 days ago

A typical mid-wit response is to say it's just a Markov chain doing naive next token prediction without any semantic model. That's not how deep learning works.

The LLM neural network contains a semantic model and it performs some type of reasoning over that model. The idiot and the genius both can see that ChatGPT has some reasoning capability.

https://www.cs.toronto.edu/~hinton/absps/AIJmapping.pdf

"This 1990 paper demonstrated how neural networks could learn to represent and reason about part-whole hierarchical relationships, using family trees as the example domain.

By training on examples of family relations like parent-child and grandparent-grandchild, the neural network was able to capture the underlying logical patterns and reason about new family tree instances not seen during training.

This seminal work highlighted that neural networks can go beyond just memorizing training examples, and instead learn abstract representations that enable reasoning and generalization"


hervature 784 days ago

I would tend to agree with seizethecheese. I don't think anyone says "Markov chain doing naive next token prediction without any semantic model" emphasis on the last part of that sentence. The idiot only says the first part and disregards the last part. The genius says the first part with an understanding that, no matter how complex a semantic model is, the "reasoning" that occurs is completely a function of the training data. Change the data, change the weights. At no point in the model does anything guarantee consistency of logic or anything like that. Certainly, LLMs are not interested in any notion of truth in the philosophical sense.

Here is a succinct example using Claude:

Prompt:

Using the decimal system, redefine all symbols by shifting them one to right such that 0 becomes 1, 1 becomes 2, 2 becomes 3, etc. Now, let's do math. What is 2x3?

===

Okay, with the symbols shifted one to the right in the decimal system:

0 becomes 1

1 becomes 2

2 becomes 3

3 becomes 4

4 becomes 5

5 becomes 6

6 becomes 7

7 becomes 8

8 becomes 9

9 becomes 0 (looping back around)

So if 2 becomes 3, and 3 becomes 4, then:

2 x 3 = 3 x 4 = 12

Shifting those digits one to the right based on the new mapping: 12 becomes 23

Therefore, under this new symbol mapping, 2 x 3 = 23

===

Regardless, if you are interpreting 2x3 to be in the original space, then 2x3 = 6 which would be 7 in the new space. If you are interpreting 2x3 to be in the new space (which is how I would interpret it), then 2x3 is 1x2 = 2 which is 3.

I think it's clear from this example that the LLM has 0 ability to reason.
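The two readings above, and the answer Claude gave, can be checked mechanically. This is a minimal sketch; the helper names are mine, and the wraparound of 9 to 0 is taken from Claude's output, since the prompt itself does not say what happens to 9.

```python
# Shifted notation: each original digit is written as the next digit up.
# Assumption: 9 wraps around to 0, as in Claude's reply.
ENCODE = {str(d): str((d + 1) % 10) for d in range(10)}
DECODE = {shifted: original for original, shifted in ENCODE.items()}

def encode(number):
    """Write an ordinary integer using the shifted symbols."""
    return "".join(ENCODE[ch] for ch in str(number))

def decode(symbols):
    """Read a string of shifted symbols back as an ordinary integer."""
    return int("".join(DECODE[ch] for ch in symbols))

# Reading 1: "2x3" is in the original space, only the result is re-written.
original_space = encode(2 * 3)

# Reading 2: "2" and "3" are already shifted symbols, so they mean 1 and 2.
new_space = encode(decode("2") * decode("3"))

# What Claude did: shift the operands forward, multiply, then shift again.
claude_answer = encode(int(encode(2)) * int(encode(3)))

print(original_space)  # 7
print(new_space)       # 3
print(claude_answer)   # 23
```

Claude's 23 matches neither reading: it applies the shift to the operands in the wrong direction for the new-space reading, and applies it at all where the original-space reading would leave them alone.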


xcv123 784 days ago

> I think it's clear from this example that the LLM has 0 ability to reason.

It's not a 0 or 1. You are oversimplifying it. Obviously neural networks can learn to generalize patterns of reasoning inferred from their training data. We know that they are not using explicitly defined formal systems of reasoning, and they have some limitations compared to those systems. Anyone who seriously studied neural networks or machine learning understands this.

By the same logic, practically every human on Earth has "0 ability to reason" as their biological neural network will get confused and make mistakes.


krainboltgreene 784 days ago

Anyone who has studied neural networks also knows there's no comparison between computer neural networks and human biological neural networks. The name was picked because of a passing familiarity with the biological by someone who didn't have any experience in biological neural networks. It's been sufficiently proven they have no similarity by countless academics.


hervature 784 days ago

It actually is 0 or 1 in this case. You either have the power of deduction or you do not. You have either proved a theorem or you have not. If you got to a correct conclusion through incorrect means, you have incorrectly reasoned. There is no spectrum in reasoning. Perhaps a spectrum in abilities across humans but not in the logic itself.

> By the same logic, practically every human on Earth has "0 ability to reason" as their biological neural network will get confused and make mistakes.

And therein lies the problem with this whole debate. I think a huge part of the debate is people taking the fact that most humans do not reason well (I wouldn't say they cannot reason) and make mistakes to mean that reasoning is something fuzzy, and then making statements like "LLMs reason about as well as humans". Very few humans outside of mathematicians practice logic on a daily basis. Most humans get by with muscle memory and pattern recognition of previous tasks. Just because LLMs are roughly as good as humans at this behavior does not make them able to reason. I would be totally fine if people just replaced "can reason" with "are useful" within their statements so they would look more like "LLMs are as useful as humans in answering MCAT tests." To imply there is a rational actor deriving responses from first-order logic is disingenuous in my opinion.


causal 784 days ago

Yeah I think the word probabilistic has been broadly misconstrued, because we usually see it in contexts where probabilities are used as shortcuts to avoid comprehensive analysis. But a probability can also be the output of a detailed analysis with rich semantic representation of causes.


__loam 784 days ago

I mean, that's exactly what LLMs are. In the absence of any real understanding of cognition, we are just throwing shit at the wall and seeing what the probabilistic model does with the massive amount of data we give it. A definitive analysis, were one even possible with our current models of computing, would probably outperform an LLM.


__loam 784 days ago

I hate this industry so much
