For the same reason you don't run "4+6" on a calculator.
An external tool call has overhead. It requires a round trip into the tool, and it requires the LLM to be running in agentic autoregression - it can't happen during prefill.
Which means that having native arithmetic capabilities is useful. Forward-pass arithmetic is the LLM version of quick mental math.
An LLM can read "#define SILLY_TIME_CONST (3*20*60*60*1000)" and have "SILLY_TIME_CONST is 60 h expressed as 216000000 ms" already cached by the end of the line, before it even emits its first token.
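For anyone who wants to check the arithmetic in that macro, here is a minimal sketch in Python. The variable names are made up for illustration and are not from any real codebase:

```python
# Evaluate the body of the example macro: 3 * 20 h, written out in milliseconds.
silly_time_const_ms = 3 * 20 * 60 * 60 * 1000

# Convert back to hours: 1000 ms per second, 3600 s per hour.
MS_PER_HOUR = 60 * 60 * 1000
hours = silly_time_const_ms // MS_PER_HOUR

print(silly_time_const_ms)  # 216000000
print(hours)                # 60
```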
This is closer to how an LLM handles math internally - an LLM version of drilled tables being used for mental arithmetic, "as humans do".
When humans stall on these tasks, they reach for pen and paper, a slide rule, a calculator, etc.
Mathematica is overkill for arithmetic. It's also licensed and can cost a bit extra.
If an LLM were to reach for a light, cheap arithmetic tool, something like bc would be a good first stop - a CLI tool whose language supports arbitrary-precision numbers and interactive execution of statements.
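As a rough sketch of what that kind of offloading buys you, here is the same idea in Python rather than bc itself (an assumption made purely for illustration). Python's built-in integers are arbitrary precision, and the standard `decimal` module gives configurable precision for division. Note that `prec` counts significant digits, which is not quite the same thing as bc's `scale`:

```python
from decimal import Decimal, getcontext

# Built-in ints never overflow, so large exact results come for free.
big = 2 ** 100
print(big)  # 1267650600228229401496703205376

# For non-integer results, pick a working precision explicitly.
getcontext().prec = 30  # 30 significant digits
ratio = Decimal(1) / Decimal(7)
print(ratio)
```

The result is deterministic and exact to the chosen precision, which is the whole point of handing the calculation to a tool.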
What's interesting is that on one hand LLM pumps are claiming a path to AGI, while on the other hand they are duct-taping in deterministic plugins for the specific prompt types they find it better to offload.
In X years, is it just going to be a thin OS-like layer where the majority of the work is handled by other "programs"?
That doesn't seem very persuasive. The one example of a non-A GI we have, humans, does the same thing. We've been offloading arithmetic for at least 4000 years.
I was thinking the same thing. Why not call into a dedicated math tool?
But I don't either, and I have some intuition about numbers that I would probably not have if I always relied on calculators.
Would the same sort of thing apply to LLMs? I'm probably anthropomorphising here...