


	by emmender 1073 days ago
	It failed all the logic puzzles with slight tweaks, including the "stupid Monty Hall" (with transparent doors). It BSs with confidence. AGI is not knocking at the door.


freediver 1073 days ago

Can you share a few of those?


emmender 1073 days ago

Prove that there are no non-negative numbers less than 3.

It bullshits an answer with confidence (all LLMs do this).

Stupid Monty Hall:

Suppose you're on a game show, and you're given the choice of three transparent doors...

Stupid river crossing:

A farmer with a wolf, a goat, and a koala must cross a river by boat....

Basically, these LLMs have ingested canned solutions and can't reason with newly defined concepts. Give them anything "out-of-the-box" and they BS canned answers, like the rote student. The BS is particularly distasteful because of the confidence projected in the answer...

So, they are great for looking up commonly understood "in-the-box" narratives, but are poor at reasoning where there is some novelty. This is what we can expect from a probabilistic "deep" autocompleting machine, unlike a child, who can learn ideas and metaphors from a few examples and anomalies.


paxys 1073 days ago

You are expecting these models to do something that not even their creators claim they can do. Of course they will fail at it.


emmender 1073 days ago

Disagree. Their creators are hyping these things to no end to get their next rounds of funding.


famouswaffles 1072 days ago

Change the terms so it doesn't look up the puzzles in its memory, and GPT-4 can answer some of these. Reasoning is fine.


emmender 1072 days ago

How can you say reasoning is fine when it fails at basic logic?

We need to coax it with the right prompts for it to come up with an answer, so basically it can't reason.

Looks like you have an incentive to ignore what you see.


famouswaffles 1072 days ago

Seeing a problem you've seen many times and have memorized, and plowing through it without "concentrating" enough to see the subtle differences, is a failure mode that occurs in humans as well. We don't say "humans can't reason" just because this happens, so it makes little sense to say the same for LLMs. The important bit is that it can solve the problem if nudged away from the memorized answer, same as people.


emmender 1072 days ago

Humans are fundamentally wired to be irrational. Our perceptual/cognitive apparatus is deeply flawed, as umpteen studies show, so this is a given.

But we also discovered a way to think/model that seems to work amazingly well: the scientific method, or reasoning. This language is not natural to the way humans operate at all. It is a struggle for most of us to think in that manner. That's why math/science is difficult for most of us, and why these were discovered only in the last 2000 years.

LLMs cannot yet represent conceptual relationships deterministically/symbolically. At some point in the future, perhaps they can, but the current generation has a long way to go.
