by schneems 56 days ago
They are bad at math. But they are good at writing code, and as an optimization some providers have the model secretly write code to answer the problem, run it, and give you the answer without telling you what it did in the middle.
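To make the "write code, run it, return only the answer" pattern concrete, here is a minimal sketch. It is not any provider's actual implementation: `fake_model_write_code` is a hypothetical, hard-coded stand-in for a model, and `run_generated_code` and `answer` are illustrative names for a harness that only accepts plain arithmetic.

```python
import ast
import operator

# Arithmetic operators the restricted evaluator is willing to run.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def run_generated_code(expr: str):
    """Evaluate a pure-arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expr, mode="eval").body)


def fake_model_write_code(question: str) -> str:
    """Hypothetical stand-in for an LLM that turns a question into code."""
    # A real model would generate this text; it is hard-coded for illustration.
    canned = {"What is 123456789 times 987654321?": "123456789 * 987654321"}
    return canned[question]


def answer(question: str) -> str:
    code = fake_model_write_code(question)  # step 1: the "model" writes code
    result = run_generated_code(code)       # step 2: the harness runs it
    return f"The answer is {result}."       # step 3: the user sees only the result


print(answer("What is 123456789 times 987654321?"))
```

The point of the sketch is that the multiplication is done by the interpreter, not by the model, and the generated code never appears in the reply.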
3 comments

Someone should tell the mathematicians that if they use a calculator or a whiteboard or, heaven forbid, a computer, they are "bad at math".
1) That's not related to the chain of thought I was replying to. Someone asked about the "bad at math" claim and pointed out "but it seems good to me", so I added some color on why that might be the case. Your retort seems to imply I'm arguing that because something uses tools for a job, it cannot be good at the thing it's using a tool for. I'm not arguing that.

2) If you have something to say, just say it. Don't put words in my mouth and then argue with a thing I didn't say.

Right, but your narrative was incorrect and based on faulty premises, which you haven't acknowledged. That's fine, except you're still pressing the argument.

Can you please present a reasonable maths problem that I can bounce off GPT so we can see it fail? I can give you many hundreds of relatively complex problems, none of which have appeared in a textbook, that GPT has not only solved, but critiqued my own crappy solutions for. I'm only asking you for one counterpoint.

> your narrative was incorrect and based on faulty premises

I am referring to specific, documented behavior of LLMs. Google it.

Google any plausibly reasonable math problem, and even the terrible LLM that powers the Google search page will almost certainly solve it correctly for you.

I don't need to reconstruct my argument axiomatically from folk beliefs.

You seem to have misunderstood my comment. I'm happy to accept the fault for poor communication. But you're making it hard. You're signaling that any clarifications on my behalf will be treated as further arguments instead of some sort of shared desire to hear one another. I don't care to continue.
What would I do to demonstrate that they are bad at math? If by "maths" we mean things like working out a double integral for a joint probability problem, or anything simpler than that, GPT5 has been flawless.
Search the topic. It is historically documented. It might no longer be true though.

A way to test might be running an open model locally and directly (without a harness), so you can be sure it's not going through a translation layer. I think these days models might have this tool-call behavior built in, but back in the day it was treated more like a magic trick. Without it, the model behaved on simple math much as it does on "how many r's are in strawberry".

It is wildly not true.

The request is for some reasonable math problem a model like GPT or Claude will fail at. I'm not going to set up a local model or some harness for it; I'm just going to copy/paste it into ChatGPT and watch it solve it.

Propose a problem, if you think I'm wrong about this. Seems simple.

> wildly not true

Source? Did you search anything like I suggested or no?

My argument: you can take basically any undergraduate collegiate math problem, right now, and it's likely that even the dumb LLM on the Google search page will solve it, and nearly certain that frontier models will.

Your argument: "it is possible to Google for people claiming LLMs can't do math".

Are they bad at math? Or are they bad at arithmetic?
If you don't know much math, it's easy to confuse the two.
Neither.