Hacker News
by tptacek 56 days ago
What would I do to demonstrate that they are bad at math? If by "maths" we mean things like working out a double integral for a joint probability problem, or anything simpler than that, GPT5 has been flawless.

Search the topic. It is historically documented. It might no longer be true though.

A way to test might be running an open model locally and directly (without a harness), where you can be sure it's not going through a translation layer. I think these days models might have this tool-call behavior built in, but back in the day it was treated more like a magic trick. Without it, a model behaved similarly to the "how many r's are in strawberry" case on simple math.

It is wildly not true.

The request is for some reasonable math problem a model like GPT or Claude will fail at. I'm not going to set up a local model or some harness for it; I'm just going to copy/paste it into ChatGPT and watch it solve it.

Propose a problem, if you think I'm wrong about this. Seems simple.
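
For anyone who does want to propose one, a way to keep it honest is to pick a problem with a known closed form and check the model's answer numerically. A minimal sketch, assuming a made-up joint density f(x, y) = x + y on the unit square and asking for P(X + Y <= 1); the density, the function names, and the grid size are illustrative choices, not anything specific to GPT or Claude:

```python
# Sketch: numerically check a double integral from a joint probability problem.
# Assumed (hypothetical) problem: f(x, y) = x + y on [0, 1] x [0, 1],
# find P(X + Y <= 1) = integral over x in [0, 1], y in [0, 1 - x] of f(x, y).

def joint_density(x, y):
    """Illustrative joint density; it integrates to 1 over the unit square."""
    return x + y


def prob_sum_at_most_one(n=400):
    """Midpoint-rule estimate of P(X + Y <= 1) using an n-by-n grid."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        # For this x, y runs from 0 to 1 - x; split that range into n cells.
        hy = (1.0 - x) / n
        for j in range(n):
            y = (j + 0.5) * hy
            total += joint_density(x, y) * hx * hy
    return total


if __name__ == "__main__":
    # Closed form by hand: the inner integral is (1 - x^2) / 2, so the result is 1/3.
    print(prob_sum_at_most_one(), 1.0 / 3.0)
```

Paste the same problem into the chat window and compare its answer against both the hand derivation and the numeric estimate; if the model's answer disagrees with both, that is the counterexample being asked for.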

> wildly not true

Source? Did you search anything like I suggested or no?

My argument: you can take basically any undergraduate collegiate math problem, right now, and it's likely that even the dumb LLM on the Google search page will solve it, and nearly certain that frontier models will.

Your argument: "it is possible to Google for people claiming LLMs can't do math".