| HN Mirror


	by iammjm 54 days ago
	Why doesn’t it just call tools such as Mathematica for such operations?

5 comments

ACCount37 54 days ago

For the same reason you don't run "4+6" on a calculator.

An external tool call has overhead. It requires a round trip to the tool, and it requires the LLM to run in an agentic autoregressive loop, so it can't happen during prefill.

Which means that having native arithmetic capabilities is useful. Forward-pass arithmetic is the LLM version of quick mental math.

An LLM can read "#define SILLY_TIME_CONST (3*20*60*60*1000)" and have "SILLY_TIME_CONST is 60 h expressed as 216000000 ms" already cached by the end of the line, before it even emits its first token.
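
For reference, the arithmetic in that #define can be checked in a few lines. This is a minimal Python sketch; the variable names are made up for illustration, and only the expression itself comes from the comment above.

```python
# Checks the arithmetic in the quoted #define.
# All names here are illustrative, not from any real codebase.
MS_PER_SECOND = 1000
SECONDS_PER_HOUR = 60 * 60

# Same expression as the macro body: 3 * 20 * 60 * 60 * 1000
silly_time_const_ms = 3 * 20 * 60 * 60 * 1000

# Convert milliseconds back to hours to confirm the "60 h" reading.
hours = silly_time_const_ms / (SECONDS_PER_HOUR * MS_PER_SECOND)

print(silly_time_const_ms, hours)  # prints: 216000000 60.0
```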

defrost 54 days ago

This is more how an LLM thinks about math internally - an LLM version of drilled tables being used for mental arithmetic "as humans do".

When humans stall on these tasks, they reach for pen and paper, a slide rule, a calculator, etc.

Mathematica is overkill for arithmetic; in addition, it's licenced and can cost a bit extra.

If an LLM were to reach for a light, cheap arithmetic tool, something like bc would be a good first stop: a CLI tool with a language that supports arbitrary-precision numbers and interactive execution of statements.

https://en.wikipedia.org/wiki/Bc_(programming_language)
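
As a rough illustration of what a light, cheap arithmetic tool could look like, here is a minimal Python sketch of an exact evaluator in the spirit of bc. It is not bc and not any vendor's actual calculator tool: the function name `evaluate` and the use of `fractions.Fraction` for exact results are assumptions made for this example, and it only handles +, -, *, / over integer literals.

```python
import ast
import operator
from fractions import Fraction

# Hypothetical helper for illustration only; not a real tool-calling API.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(expression: str) -> Fraction:
    """Evaluate +, -, *, / over integer literals with exact rational results."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Accept plain integer literals only (bool is excluded on purpose).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return Fraction(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Names, calls, attribute access, etc. are rejected rather than run.
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval"))
```

For example, `evaluate("327 * 48")` returns `Fraction(15696, 1)`, and `evaluate("1/3 + 1/6")` returns `Fraction(1, 2)` with no floating-point rounding. Walking the parsed syntax tree instead of calling `eval` keeps the tool limited to arithmetic.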

jampekka 54 days ago

They do. I asked ChatGPT for 327 x 48 and it used the "ChatGPT Instruments" calculator.

Previously it used to run Python scripts, and may still do for more complex calculations.

steveBK123 54 days ago

What's interesting is that on one hand LLM pumps are claiming a path to AGI... while on the other hand, they are duct-taping in deterministic plugins for specific prompt types they find it better to offload...

In X years, is it just going to be a thin OS-like layer where a majority of work is being handled by other "programs"?

beernet 54 days ago

> while on the other hand, they are duct-taping in deterministic plugins for specific prompt types they find it better to offload

So, in essence, just like human beings?

BobbyTables2 53 days ago

How credible would Claude be if it couldn’t answer “1+2=3?”

Worse, this is really human beings trying to pretend that their AI is AGI.

steveBK123 54 days ago

My point is: what makes this terribly different than Alexa skills?

grey-area 54 days ago

For this category of problems, no, very unlike human beings.

steveBK123 54 days ago

Right.. plumbing in specific plugins for specific prompt forms feels like an expert system, rather than some general purpose intelligence.

Also, big picture, it's hard to see it as some sort of self-improving intelligence if humans are hand-crafting these paths and tools for it.

BobbyTables2 53 days ago

Exactly, an expert system marketed to nonexperts…

tzs 54 days ago

That doesn't seem very persuasive. The one example of a non-A GI we have, humans, does the same thing. We've been offloading arithmetic for at least 4000 years.

BobbyTables2 53 days ago

Sure but we don’t pretend otherwise…

singpolyma3 54 days ago

> In X years is it just going to be a thin OS-like layer where a majority of work is being handled by other "programs"

That is my hopeful ideal

steveBK123 54 days ago

In which case it’s just a neat extension of search

ragebol 53 days ago

I was thinking the same thing. Why not call into a dedicated math tool?

But I don't either, and I have some intuition about numbers that I would probably not have if I always relied on calculators. Would the same sort of thing apply to LLMs? I'm probably anthropomorphising here...

breezybottom 54 days ago

ChatGPT does, and has since 2023