Hacker News
by stubybubs 940 days ago
I gave it the three lightbulbs in a closet riddle.

https://puzzles.nigelcoldwell.co.uk/seven.htm

The key complication is "once you've opened the door, you may no longer touch a switch." It gets this. There are many examples of it written out on the web. But when I give it a variation and say "you can open the door to look at the bulbs and use the switches all you want," it is absolutely unable to understand this. To a human it's simple: look at the bulbs and flick the switches. It kept giving me answers about using a special lens to examine the bulbs or using something to detect heat. I explained it in many ways and tried several times. I was paying for GPT-4 at the time as well.

I would not consider this thinking. It's unable to make this simple abstraction from its training data. I think GPT-4 looks better than GPT-3 simply because it's got more data, but we're reaching diminishing returns on that, as has been stated.

1 comments

GPT-4 on platform.openai.com says this on the first try:

Switch on the first switch and leave it on for a few minutes. Then, switch it off and switch on the second switch. Leave the third switch off. Now, walk into the room.

The bulb that is on corresponds to the second switch. The bulb that is off and still warm corresponds to the first switch because it had time to heat up. The bulb that is off and cool corresponds to the third switch, the one you never turned on.

GPT-4-0314:

1. Turn on the first switch and leave it on for about 5 minutes.
2. After 5 minutes, turn off the first switch and turn on the second switch.
3. Open the door and enter the room.

Now observe the lights:

- The bulb that is on is connected to the second switch (which is currently on).
- The bulb that is off but warm to the touch is connected to the first switch (it was on long enough to heat up the bulb).
- The bulb that is off and cool to the touch is connected to the third switch (it was never turned on).

----

But it's also trained on the internet. The GPT-4 'Sparks of AGI' paper had a logic puzzle that it most likely never encountered in the training data, and it could solve it.

Also, I encourage you to ask these types of logic puzzles to randos on the street. They're not easy to solve.

My question to you would be: What would convince you that it actually can 'think' logically?

I think your comment misunderstands the comment you're responding to.

The point is that LLMs can solve the puzzle when the constraints are unchanged (as you said, there are loads of examples of people asking and answering variations of this puzzle on the internet), but when you change the constraints slightly ("you can open the door to look at the bulbs and use the switches all you want") the model is unable to break out of the mold and keeps giving complicated answers. A human would understand that under the new constraints you could simply flip each switch and observe the changes in turn.
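
To make the "flip each switch and observe" approach concrete, here is a minimal Python sketch of the relaxed variant. The `Closet` class, its random wiring, and the function names are all made up for illustration; nothing here comes from the original puzzle page. It only shows that, with the door open, no heat trick is needed.

```python
import random


def identify_bulbs(flip_switch, observe_bulbs, n=3):
    """Map each switch to its bulb by toggling one switch at a time
    while watching the bulbs (door open, switches freely usable)."""
    mapping = {}
    for switch in range(n):
        before = observe_bulbs()
        flip_switch(switch)
        after = observe_bulbs()
        # Exactly one bulb changes state: the one wired to this switch.
        changed = [b for b in range(n) if before[b] != after[b]]
        mapping[switch] = changed[0]
    return mapping


class Closet:
    """Hypothetical simulation: a hidden random wiring of switches to bulbs."""

    def __init__(self, n=3, seed=None):
        rng = random.Random(seed)
        self.wiring = list(range(n))  # wiring[switch] -> bulb index
        rng.shuffle(self.wiring)
        self.bulbs = [False] * n      # all bulbs start off

    def flip_switch(self, switch):
        bulb = self.wiring[switch]
        self.bulbs[bulb] = not self.bulbs[bulb]

    def observe_bulbs(self):
        return list(self.bulbs)


closet = Closet(seed=0)
found = identify_bulbs(closet.flip_switch, closet.observe_bulbs)
# The recovered mapping matches the hidden wiring.
assert found == dict(enumerate(closet.wiring))
```

Three flips and three looks are enough for any wiring, which is why the elaborate answers about lenses and heat detection miss the point of the changed constraint.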

A similar example that language models used to get stuck on is this: "Which is heavier, a pound of feathers or two pounds of bricks?"