by razorbeamz 11 days ago
LLMs are not the ideal tool for this job, because LLMs cannot do math or count.
4 comments

Most human programmers are also fantastically bad at math.
True but irrelevant.

This 8-track duplication puzzle is a problem of math.

LLMs beat humans at generating code (and fixing broken code) and then letting a CPU execute it.
> LLMs cannot do math

This is plainly not true anymore

No, they fundamentally cannot do math. They are next token predictors, not calculators.
Why can't a next token predictor do math? Humans aren't calculators either, but we can do math.

If you want proof just look at the benchmarks. Modern frontier models can get basically perfect accuracy on American Invitational Mathematics Examination tests: https://matharena.ai/?comp=aime--aime_2026

If you want an explanation of how they do math, we've found geometric calculators inside their neural networks: https://www.goodfire.ai/research/a-geometric-calculator#

But LLMs can write code that can do math and count. Tool use, more broadly, has proven to be a very powerful way to let LLMs do what they're good at (handling the fuzzy and imprecise nuances of natural language, which includes scooping up a lot of context) and delegate the things they're not good at to external tools, some of which they can write on the spot.
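To make the delegation idea concrete, here is a rough sketch of the kind of "calculator tool" a model could hand arithmetic and counting to. The names `calc` and `count_char` are made up for illustration and don't correspond to any particular vendor's tool API; the point is only that the exact work happens in ordinary code, not in the token predictor.

```python
import ast
import operator

# Hypothetical tool: the model emits an arithmetic expression as text,
# and plain code evaluates it exactly. Only integer + - * // % are allowed.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}


def calc(expr: str) -> int:
    """Evaluate a small integer arithmetic expression without using eval()."""

    def ev(node):
        # Integer literals (bool is excluded because it subclasses int).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        # Binary operators from the whitelist above.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        # Unary minus, e.g. "-3".
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        raise ValueError("unsupported expression")

    return ev(ast.parse(expr, mode="eval").body)


def count_char(text: str, ch: str) -> int:
    """Hypothetical counting tool: exact occurrences of ch in text."""
    return text.count(ch)
```

With something like this available, the model only has to produce the string `"1234 * 5678"` or the arguments `("strawberry", "r")`; the arithmetic and the counting are done deterministically.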

If you think about it, we humans do that all the time too.

I'm crap at 4-digit multiplication in my head, but I have no problem doing that with pencil and paper.
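For what it's worth, the pencil-and-paper method is just a short mechanical procedure. A minimal sketch of schoolbook long multiplication, assuming non-negative integers (the function name `long_multiply` is mine, purely for illustration):

```python
def long_multiply(a: int, b: int) -> int:
    """Schoolbook long multiplication on decimal digits (non-negative ints)."""
    if a < 0 or b < 0:
        raise ValueError("this sketch only handles non-negative integers")

    # Least significant digit first, the way you work on paper.
    xs = [int(d) for d in reversed(str(a))]
    ys = [int(d) for d in reversed(str(b))]

    # One cell per digit position of the result.
    cells = [0] * (len(xs) + len(ys))

    for i, x in enumerate(xs):
        carry = 0
        for j, y in enumerate(ys):
            total = cells[i + j] + x * y + carry
            cells[i + j] = total % 10   # digit that stays in this column
            carry = total // 10         # digit carried to the next column
        cells[i + len(ys)] += carry     # leftover carry for this row

    return int("".join(str(d) for d in reversed(cells)))
```

Each step only needs single-digit products and a carry, which is why the external scratchpad makes the task easy even when doing it "in your head" is not.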

> But LLMs can write code that can do math and count.

They cannot, however, execute that code. They can feed that code into an external program they've been given access to, but they can't execute it themselves.
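Here is roughly what "feeding the code into an external program" can look like in practice. This is only a sketch: `run_generated` is a hypothetical helper, the string passed to it stands in for whatever the model wrote, and a real setup would sandbox this far more carefully than a bare subprocess with a timeout.

```python
import subprocess
import sys


def run_generated(code: str, timeout: float = 5.0) -> str:
    """Run model-written Python in a separate interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # a fresh Python process, not the model
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,                    # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.strip()
```

For example, `run_generated("print(1234 * 5678)")` hands the multiplication to the interpreter; the model's only job was to write the one-line program and read back the result.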

You presumably have no problem moving around in a car that you only control indirectly via a steering wheel, an accelerator and a brake pedal, without ever actually powering the wheels yourself.