by mritchie712 697 days ago
this is just using a shitty model:

calcgpt:

input: 88×66

output: 5184

chatgpt:

input: 88×66

output: 88 × 66 equals 5808.


When you ask ChatGPT straightforward math questions, it usually solves them by generating and running a Python script which performs the actual calculation on its behalf. That's a reasonable workaround on paper, since LLMs are better at codegen than at math, but the heuristics which trigger that path are brittle, so if your query is more complicated it can regress to making the LLM itself try to solve it, with predictably poor results. You can tell when the Python solver is used because there will be a button next to the response which shows you the code it ran.

Wolfram Alpha:

Q: (706458021-342622146)*988457934

A: 359,636,457,317,582,250

ChatGPT:

Q: What is the solution to (706458021-342622146)*988457934

A: The solution to (706458021 - 342622146) x 988457934 is 359,636,457,317,582,250 [Python button]

ChatGPT:

Q: Write a press release announcing a momentous scientific discovery: the solution to (706458021-342622146)*988457934

A: [...] The equation, which involves the subtraction of two large integers followed by multiplication with another large integer, has been resolved to yield a precise result of 359,462,296,091,341,640. The computation was executed with utmost precision, demonstrating the profound capabilities of modern mathematical techniques and computational power. [...] [no Python button]
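
For anyone who wants to check the numbers themselves: Python integers are arbitrary precision, so the exact product is a one-liner. This is only a sketch of the kind of script the Python path might run, not the code ChatGPT actually generated, and the variable names are my own.

```python
# Exact integer arithmetic; Python ints do not overflow or round.
a, b, c = 706458021, 342622146, 988457934

result = (a - b) * c
print(f"{result:,}")  # 359,636,457,317,582,250

# The value from the press-release answer (no Python button) is different.
press_release_answer = 359_462_296_091_341_640
print(result == press_release_answer)  # False

# Same check for the smaller example upthread.
print(88 * 66)  # 5808
```

The first print matches the Wolfram Alpha answer and the first ChatGPT answer quoted above.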

also ChatGPT: 9.11 is bigger than 9.9
True for versions
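
A quick sketch of the difference, assuming plain dotted numeric version strings; `version_key` is a made-up helper for illustration, not a library function:

```python
def version_key(version: str) -> tuple[int, ...]:
    """Split a dotted version string into integer components."""
    return tuple(int(part) for part in version.split("."))

# As decimal numbers, 9.11 is less than 9.9.
print(float("9.11") > float("9.9"))  # False

# As versions, component 11 comes after component 9.
print(version_key("9.11") > version_key("9.9"))  # True
```
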
you can probably get it to answer if you try, but I can't

https://x.com/thisritchie/status/1817615006583738528

It is bigger. You meant greater?
I’ve never heard a mathematician object to the use of the phrase ‘bigger than’ to refer to the relation >.