|
|
|
|
|
by crimsoneer
126 days ago
|
|
I mean, the flipside is that we have been tricking humans with this sort of thing for generations. We've all seen a hundred variations on
"A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?" or "If 5 machines take 5 minutes to make 5 widgets, how long do 100 machines take to make 100 widgets?" or even the whole "the father was the surgeon" story. If you don't recognise the problem and actively engage your "system 2 brain", it's very easy to just leap to the obvious (but wrong) answer. That doesn't mean you're not intelligent and can't work it out if someone points out the problem. It's just the heuristics you've been trained to adopt betray you here, and that's really not so different a problem to what's tricking these llms. |
|
It may trigger a particularly ambiguous path in the model's token weights, or whatever the technical explanation for this behavior is, which can certainly be addressed in future versions, but what it does is expose the fact that there's no real intelligence here. For all its "thinking" and "reasoning", the tool is incapable of arriving at the logically correct answer, unless it was specifically trained for that scenario, or happens to arrive at it by chance. This is not how intelligence works in living beings. Humans don't need to be trained at specific cognitive tasks in order to perform well at them, and our performance is not random.
But I'm sure this is "moving the goalposts", right?
[1]: https://news.ycombinator.com/item?id=47060374