| HN Mirror



	by evrimfeyyaz 1292 days ago
	I think it's not getting that the friend who has two tokens is the same as the one you owe a token to. When you change the question to make it clearer, it responds correctly. "My friend has two tokens and I have six, but I owe one to the same friend. There is a toy that costs 7 tokens. Can my friend and I afford to buy it?"

2 comments

bambax 1292 days ago

Also, it does not respond correctly in my experience.

Q My friend has 2 tokens and I have 6, but I owe one to the same friend. There is a toy that costs 7 tokens. Can my friend and I afford to buy it?

A No, your friend and you cannot afford to buy the toy that costs 7 tokens. Even though your friend has two tokens and you have six tokens, you owe one token to your friend, so you only have five tokens available to spend. When you combine your friend's two tokens with your five tokens, you only have a total of 7 tokens, which is not enough to buy the toy that costs 7 tokens. In this case, your friend and you do not have enough money to buy the toy, even though you have a total of seven tokens when you combine your individual amounts. It is important to remember that when combining the amount of money two or more people have, you should add their individual amounts together, but also take into account any debts or obligations that need to be paid. In this case, your friend and you do not have enough money to buy the toy because you owe one token to your friend.

The key sentence is still beyond absurd: "When you combine your friend's two tokens with your five tokens, you only have a total of 7 tokens, which is not enough to buy the toy that costs 7 tokens."

(It may be that it simply wants to be consistent: since it got it wrong once, it sticks with the same answer for the rest of the session...?)


pulvinar 1292 days ago

After I explain to it the mistakes it made, it seems to come around:

I think I made the mistakes in the first place because I wasn't paying close enough attention to the details of the question. I was not considering that I owed a token to my friend, and I was not thinking about the fact that seven tokens could be enough to buy a toy that costs seven tokens.

I’ve made mistakes like this too, where I get fixated on a particular pat solution, without considering the details of the new problem. In the AI case it’s probably memorized a bunch of solutions that override the details of this particular question.


layer8 1292 days ago

It’s easy to make ChatGPT admit to a mistake and provide an explanation for its mistake, even if it didn’t actually make a mistake. It still just follows the “what response would sound plausible here” route, without actually understanding that it made (or didn’t make) a mistake. Often enough, if you return to the original problem statement, it will equally return to its incorrect logic.


pulvinar 1292 days ago

Yeah, that's what I see too.

I'm trying to teach it to properly count beats in lines of music. I can get it to be correct by teaching it to split the line in half and count each half separately, but even when explicitly told to use that method it fails again.


bambax 1292 days ago

But it doesn't matter. That's why I said 2 and 6 and not 2 and 5. With 2 and 5, it matters which friend I owe the extra token to. With 2 and 6, even after I have honored my debt, there are still enough tokens left to buy the toy.
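
To make the arithmetic explicit, here is a minimal sketch in Python. The function name `can_afford` and its parameters are made up for illustration. It assumes that repaying a debt to the friend you are pooling with only moves tokens between the two of you, so the pooled total is unchanged, while a debt to someone else leaves the pool.

```python
def can_afford(mine, friends, debt, price, creditor_is_friend):
    """Return True if the pooled tokens cover the price after the debt is settled.

    All names here are hypothetical; this only restates the puzzle's arithmetic.
    """
    if creditor_is_friend:
        # Paying the friend back moves tokens from me to them,
        # so our combined total stays the same.
        pooled = mine + friends
    else:
        # Paying a third party removes tokens from the pool.
        pooled = mine - debt + friends
    # Having exactly the price is enough: 7 tokens buy a 7-token toy.
    return pooled >= price


# 2 and 6: affordable no matter who the creditor is.
print(can_afford(6, 2, 1, 7, creditor_is_friend=True))
print(can_afford(6, 2, 1, 7, creditor_is_friend=False))

# 2 and 5: affordable only if the creditor is the same friend.
print(can_afford(5, 2, 1, 7, creditor_is_friend=True))
print(can_afford(5, 2, 1, 7, creditor_is_friend=False))
```

The comparison is `>=`, not `>`, which is exactly the step the quoted answer gets wrong when it says 7 tokens are not enough for a toy that costs 7.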


treewalking 1292 days ago

This conversation is good evidence that ChatGPT does have a more or less average ability to reason. We’ve got humans making similar mistakes with 1 digit numbers.

(Or… are they human after all?)


bambax 1292 days ago

I've never had anyone say to me that 7 < 7.

Sometimes you can indeed trick some people with obfuscated reasoning that has a cleverly hidden division by zero, but when they arrive at the conclusion that x < x, or x != x, they stop and accept that they got it wrong somewhere, somehow. They don't insist that they're right and that the statement is perfectly reasonable and logical.
