Hacker News
by HappMacDonald 303 days ago
Have you ever seen what these arbitrary length whole numbers look like once they are tokenized? They don't break down to one-digit-per-token, and the same long number has no guarantee of breaking down into tokens the same way every time it is encountered.
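To make that concrete, here is a toy sketch, not any real model's tokenizer. `toy_tokenize` and its `max_chunk` parameter are made up for illustration; the only assumption is that digits get grouped greedily into chunks of up to three. Even under that simple rule, the same run of digits splits differently depending on what comes right before it.

```python
def toy_tokenize(text, max_chunk=3):
    """Hypothetical tokenizer: greedily group consecutive digits into
    chunks of up to max_chunk; every other character is its own token."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isdigit():
            j = i
            # Extend the chunk while we still see digits and have room.
            while j < len(text) and j - i < max_chunk and text[j].isdigit():
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            tokens.append(text[i])
            i += 1
    return tokens


print(toy_tokenize("1234567"))   # ['123', '456', '7']
print(toy_tokenize("91234567"))  # ['912', '345', '67']
```

The digits 1234567 appear in both inputs, but a single leading 9 shifts every chunk boundary, so no token lines up with a single digit position.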

But the algorithms they teach humans in school to do long-hand arithmetic (which are liable to be the only algorithms demonstrated in the training data) require a single unique numeral for every digit.
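For reference, the schoolbook addition algorithm looks roughly like this when written out. This is a minimal sketch with a made-up function name, `long_add`, and it assumes each number is already a list of single digits, most significant first. Every step indexes exactly one digit, which is the dependency described above.

```python
def long_add(a_digits, b_digits):
    """Schoolbook addition on lists of single digits (most significant first)."""
    result = []
    carry = 0
    i, j = len(a_digits) - 1, len(b_digits) - 1
    # Walk from the rightmost column to the left, carrying as we go.
    while i >= 0 or j >= 0 or carry:
        da = a_digits[i] if i >= 0 else 0
        db = b_digits[j] if j >= 0 else 0
        total = da + db + carry
        result.append(total % 10)  # digit written under the column
        carry = total // 10        # digit carried to the next column
        i -= 1
        j -= 1
    result.reverse()
    return result


print(long_add([9, 9, 7], [5, 8]))  # [1, 0, 5, 5], i.e. 997 + 58 = 1055
```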

This is the same source as the problem of counting "R"'s in "Strawberry".

2 comments

That was the initial thinking of everyone I explained this to, and it was also my own speculation. But when you look at its reasoning where it makes the mistake, it correctly extracts the digits from the input tokens. As I said in another comment, most of the mistakes here happen when it recopies the answer it calculated from the summation table. You can avoid the tokenization issue when it extracts the answer by making it output the answer as an array of digits; it will still fail at simply recopying the correct digits.
I recently saw someone post what they claimed was a leaked system prompt for GPT5 (I can't confirm the authenticity of the claim, but regardless of the truth of the matter, the point I'm making stands alone to some degree).

A portion of the system prompt was specifically instructing the LLM that math problems are, essentially, "special", and that there is zero tolerance for approximation or imprecision with these queries.

To some degree I get the issue here. Most queries are full of imprecision and generalization, and the same type of question may even get a different output if asked in a different context, but when it comes to math problems, we have absolutely zero tolerance for that. To us this is obvious, but looking from the outside, it is a bit odd that we are so loose and sloppy with, well, basically everything we do, but then we put certain characters in a math format and we are hyper-obsessed with ultra precision.

The actual system prompt section for this was funny though. It essentially said "you suck at math, you have a long history of sucking at math in all contexts, never attempt to do it yourself, always use the calculation tools you are provided."

o/~ Mathematics keeps your intellect intact / many answers should be carefully exact

But for daily application, use a close approximation, round it off.. o/~

> But the algorithms they teach humans in school to do long-hand arithmetic (which are liable to be the only algorithms demonstrated in the training data) require a single unique numeral for every digit.

But humans don't see single digits, we learn to parse noisy visual data into single digits and then use those single digits to do the math.

It is much easier for these models to understand what the number is based on the tokens and parse that than it is for a visual model to do it based on an image, so getting those tokens streamed straight into its system makes its problem to solve much much simpler than what humans do. We weren't born able to read numbers, we learn that.