Hacker News
by bcherny 1045 days ago
GPT4:

> What is 5*7?

> 5 multiplied by 7 equals 35.

> Why?

> Multiplication is a process of adding a number to itself a certain number of times. When you multiply 5 by 7, it's the same as saying "add 5 to itself 7 times." If you add up seven 5s (5 + 5 + 5 + 5 + 5 + 5 + 5), you get 35. That's why 5 multiplied by 7 equals 35.

> (7 “why?”s later…)

> The ultimate "why" behind many fundamental rules and concepts, including those in mathematics, can often circle back to human needs and the desire to understand and interact effectively with our world.

> As humans, we've developed systems and structures, such as mathematics, to organize, interpret, and predict our surroundings. These systems help us in practical tasks like trading goods, constructing buildings, and planning time, but also in abstract reasoning and problem solving. The rules within these systems, such as the operation of multiplication in mathematics, exist because they are useful, they work consistently, and they can be universally understood and applied.

> So, why does 5 multiplied by 7 equal 35, and why does multiplication work the way it does? Because humans, over many generations, found that this system was a powerful and reliable tool for interacting with the world and with each other.


That could still be explained by that example having been part of the training set so it knows how to give the right answer. But there must be more going on than that:

The square root of 232444232 is approximately 15229.

(which is wrong, it is approximately 15246.12)

The sum of 2341347345 and 234823542354 is 237165889699.

Which is the right answer.

So there may be some special casing happening there.
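For anyone who wants to check the two quoted results without trusting either the model or mental arithmetic, here is a minimal Python sketch. The variable names are mine, not from the thread, and it only uses the standard library.

```python
import math

# Numbers exactly as quoted in the comment above.
n = 232444232
quoted_root = 15229

a, b = 2341347345, 234823542354
quoted_sum = 237165889699

# Exact integer floor of the square root, plus a floating-point value.
root_floor = math.isqrt(n)
root = math.sqrt(n)

# Python integers are arbitrary precision, so this sum is exact.
exact_sum = a + b

print("floor(sqrt(n)) =", root_floor)
print("sqrt(n) ~", round(root, 2))
print("quoted root is off by", round(root - quoted_root, 2))
print("exact sum =", exact_sum)
print("quoted sum - exact sum =", quoted_sum - exact_sum)
```

Run as written, the integer root comes out as 15246, which agrees with the 15246.12 figure. The exact sum comes out as 237164889699, so the sum as quoted above differs in the millions digit. That may just be a transcription slip in the comment, but it is worth knowing before drawing conclusions from it.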

I mean, I don’t know the square root of 232444232 off the top of my head either.
So you'd either work it out and check that it was right, or you'd tell the person asking that you didn't know. You wouldn't just make up a plausible answer and confidently state it. If you did that frequently, people would stop listening to you.
What does the word "approximately" mean in that answer from GPT-4, and what precision is needed in the answer?

And no, I'd grab a damned calculator and let the specialized tool do the work. It turns out that if you turn on plugin mode, GPT-4 can use the same tools and get an exact answer.

That number isn’t correct to any precision, and the fact that it wasn’t rounded to the nearest 10, 100, etc. means that no human would recognize it as an approximation to anything but the nearest whole number (which it wasn’t).

>calculator

Well, in this case you don’t have access to a calculator because there was no plugin. And you likely wouldn’t just make up a number like ChatGPT did.

This perfectly illustrates the danger of ChatGPT. Whether or not it has access to specialized tools, it will confidently make up an answer to any question, in a way that most humans won’t.

One difference is that you are aware that you can't do it and state so. Our current LLMs will just give whatever result they think it should be. It might be correct, it might be off by a bit, or it might be completely wrong, and there's no way for the user to tell apart from double-checking with some non-LLM source, which kinda defeats the purpose of asking the LLM in the first place.
if you've had a high school education presumably you could work it out

it can't

I can. Newton's method is pretty easy to do in your head, but with larger numbers you need to be very careful not to mess it up. But on paper it's trivial.
In other words, you need an external tool (paper) which GPT doesn’t have.
No, I actually can, just not for arbitrarily large numbers. And 'paper' is just temp storage, not a tool. A tool in this case would be a slide rule, a calculator, or an abacus. Long division requires pen and paper too if you want to 'show your work'. Likewise, I would have to show on paper how I did the square root of, say, 47515. You could simply choose not to believe me if I spat out the answer, but if I showed you step by step on paper and you followed every step, you would conclude either that it is correct or that I had made a mistake.

That's why I think it is significant that ChatGPT gets the addition spot on but gives a wrong answer to the square root problem. I can do better than that off the top of my head, and I do not have access to the same computational resources that it has.

Yes, GPT has access to memory.
Actually 15229 is a decent approximation. It’s a better approximation than the one I did off the top of my head.
You need more practice :)

Simple trick: divide by 100, 10000 or 1000000, use Newton's method on the integers, then multiply by 10, 100 or 1000 and add a 'fudge factor' based on how large the fraction was...

It's cheating but it can get you pretty close; I'd peg it at 15240 using that trick. If you just want to do the closest squares, you can average between 15 (225, too low) and 16 (256, too high), so you'd guess 15500, which is much too high, but one more iteration of Newton's method gets you closer than what ChatGPT gives. You can already see that because 225 is much closer than 256, and that puts you closer to 15250 than 15500. And 15250 is actually not a bad guess at all.
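To make the "one more iteration" claim concrete, here is a small Python sketch of a single Newton step starting from the 15500 guess. The helper name `newton_step` is just illustrative, and floating point is used for convenience rather than the by-hand integer arithmetic described above.

```python
# Number from the thread and the answer ChatGPT gave for its square root.
n = 232444232
gpt_answer = 15229

def newton_step(x, n):
    """One Newton iteration for sqrt(n): average x with n / x."""
    return (x + n / x) / 2

# Starting guess from averaging 15000 and 16000.
first_guess = 15500.0
refined = newton_step(first_guess, n)

# Reference value, used only to measure the errors.
true_root = n ** 0.5

print("after one step:", round(refined, 2))
print("error after one step:", round(abs(refined - true_root), 2))
print("error of quoted answer:", round(abs(gpt_answer - true_root), 2))
```

A single step from 15500 lands near 15248.2, roughly 2 away from the true root, while 15229 is roughly 17 away. Each further step roughly doubles the number of correct digits, which is why the method is practical by hand.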

And if ChatGPT said “I don’t know the actual answer but my best guess is 15229”, that would be a reasonable and potentially useful answer.

The fact that it gives you a number that isn’t rounded to the nearest tens, hundreds, or thousands place means that it doesn’t look like an approximation to any reasonable person, which makes it a terrible answer.

My younger brother used to have this problem. If you asked him a question like “how long until you get here”, he’d say “17 minutes”. What he really meant was “around 20 minutes”, but everyone thought he must know the exact time. Like he’d done the drive many times, or he was looking at his GPS.

> So there may be some special casing happening there.

100%. Maths was a notable weakness of earlier GPT versions, so ChatGPT-4 has a layer to direct mathematical queries to an evaluator.