Regardless of emergence, in the context of "putting safety at the frontier" I would expect Claude 3 to be augmented with very basic tools like calculators to minimize such trivial hallucinations. I say this as someone rooting for Anthropic.
An "LLM crawler app" is needed, in the sense that you should be able to shift tokenized workloads between executors in a BGP-routing sort of way...
Least-cost routing of prompt responses, especially if time-to-respond is not as important as precision...
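A rough sketch of the least-cost routing idea, purely illustrative: the backend names, prices, latencies, and precision scores are all made up, and `route` is a hypothetical helper, not any provider's API. It picks the cheapest backend that clears a precision floor, with latency as an optional constraint.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Backend:
    name: str                  # hypothetical executor name
    cost_per_1k_tokens: float  # made-up price
    latency_s: float           # made-up typical response time
    precision: float           # made-up quality score in [0, 1]


def route(backends: Sequence[Backend], est_tokens: int,
          min_precision: float,
          max_latency_s: Optional[float] = None) -> Backend:
    """Return the cheapest backend meeting the precision floor
    and, if given, the latency cap."""
    eligible = [
        b for b in backends
        if b.precision >= min_precision
        and (max_latency_s is None or b.latency_s <= max_latency_s)
    ]
    if not eligible:
        raise ValueError("no backend satisfies the constraints")
    # Cost of this workload on each eligible backend; take the minimum.
    return min(eligible, key=lambda b: b.cost_per_1k_tokens * est_tokens / 1000)


# Illustrative numbers only.
BACKENDS = [
    Backend("small-fast", 0.25, 0.5, 0.70),
    Backend("medium", 1.00, 2.0, 0.85),
    Backend("large-slow", 5.00, 8.0, 0.95),
]

if __name__ == "__main__":
    print(route(BACKENDS, est_tokens=2000, min_precision=0.8).name)
```

Leaving `max_latency_s` unset covers the case where time-to-respond matters less than precision: the router then trades only cost against the precision floor.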
Also, is there a time-series ability in any LLM (meaning "show me this [thing] based on this [input], but continually updated as I firehose the crap out of it")?
--
What if you could get execution estimates for a prompt?
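One way such an estimate could look, as a back-of-the-envelope sketch: it assumes a crude "about 4 characters per token" heuristic instead of a real tokenizer, and the prices and throughput are caller-supplied guesses. `estimate_execution` is a hypothetical name.

```python
def estimate_execution(prompt: str, max_output_tokens: int,
                       price_in_per_1k: float, price_out_per_1k: float,
                       tokens_per_second: float) -> dict:
    """Rough upper-bound cost and time estimate for one prompt."""
    # Assumption: ~4 characters per token; a real tokenizer would differ.
    input_tokens = max(1, len(prompt) // 4)
    cost = (input_tokens / 1000 * price_in_per_1k
            + max_output_tokens / 1000 * price_out_per_1k)
    # Assumes generation time dominates and throughput is constant.
    seconds = max_output_tokens / tokens_per_second
    return {"input_tokens": input_tokens, "cost": cost, "seconds": seconds}
```

An estimate like this could feed the routing decision above it in the thread: get a quote from each executor, then pick.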
What a joke of a response. No one is asking for emergent calculation ability, just that the model give the correct answer. LLM tools (functions, etc.) are old news at this point.