by infogulch 757 days ago
The other day I was wondering if LLMs are bad at maths because they don't have readily apparent access to the concept of "columns". Apparently the answer is yes.

Vertical alignment across lines is pretty important for humans learning operations on digits, but the way we encode lines with a \n separator doesn't really help. In a recent Code Bullet video, GPT really struggled with any kind of vertical alignment task. I wonder if it would do better on a fixed 80-column width...
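To make the fixed-width idea concrete, here is a minimal Python sketch. The function name `column_layout` and the exact layout are my own assumptions for illustration, not anything from the video or from a real model pipeline. It pads every row to the same width, so a digit's offset within its row is its column, and rows joined with \n sit exactly width + 1 characters apart.

```python
def column_layout(a: int, b: int, width: int = 80) -> list[str]:
    """Lay out a + b as right-aligned rows that all have the same length.

    Hypothetical helper: with every row padded to `width`, the character at
    offset i in each row belongs to the same column.
    """
    top, bottom, total = str(a), "+ " + str(b), str(a + b)
    # The rule is as wide as the widest row so it spans every digit column.
    rule = "-" * max(len(top), len(bottom), len(total))
    rows = [top, bottom, rule, total]
    if any(len(row) > width for row in rows):
        raise ValueError("width is too small for these operands")
    # Right-justify so units sit in the last column, tens in the one before, etc.
    return [row.rjust(width) for row in rows]


if __name__ == "__main__":
    print("\n".join(column_layout(123, 4567, width=12)))
```

Without the padding, the distance between a digit and the one "below" it depends on the length of each line, which is the part that a \n-separated encoding hides.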

2 comments

Isn't it more that they don't have ready access to the much-more-fundamental concept of decimal numbers?

My understanding was that they tokenized them into chunks and tried to learn associations between the chunks, the same as if one was breaking apart English words.

So "2+2=4" isn't being treated that differently from "all's well that ends well." This might lead to a kind of Benny's Rules [0] situation, where sufficient brute-force can make a collection of overfitted non-arithmetic rules appear to work.

[0] https://blog.mathed.net/2011/07/rysk-erlwangers-bennys-conce...
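A toy illustration of the chunking point above, in Python. This is not any real tokenizer: `toy_chunks` and its fixed three-digit cut are assumptions made up for the example, and real BPE vocabularies split numbers in their own ways. It only shows why chunk boundaries need not line up with place value.

```python
import re


def toy_chunks(text: str, max_digits: int = 3) -> list[str]:
    """Split text into pieces, cutting digit runs left to right.

    Hypothetical stand-in for a tokenizer: each run of digits is cut into
    chunks of at most `max_digits` characters, everything else is kept whole.
    """
    pieces: list[str] = []
    for piece in re.findall(r"\d+|\D+", text):
        if piece.isdigit():
            pieces.extend(
                piece[i:i + max_digits] for i in range(0, len(piece), max_digits)
            )
        else:
            pieces.append(piece)
    return pieces


if __name__ == "__main__":
    print(toy_chunks("1234567+89"))
    print(toy_chunks("1234"), toy_chunks("12345"))
```

In this toy scheme "1234" and "12345" both start with the chunk "123", yet that chunk stands for different magnitudes in each, so a model would have to infer place value from what follows rather than from the chunk itself.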

The current-gen LLMs tokenize numbers digit by digit, unlike earlier LLMs.
They don't. Which you can easily check with any of the dozen web apps currently implementing the GPT-4o tokenizer.
No, it doesn't help. Bloomberg tried this and it didn't seem to make much difference.
If someone else is interested in the Bloomberg tokenizer:

https://medium.com/generative-ai-insights-for-business-leade...

Fascinating article!
It looks like the math-notation formatting didn't survive. For that you might want to see a PDF, e.g.: https://people.wou.edu/~girodm/library/benny.pdf
Wouldn't presenting numbers in reverse order, with the least significant digit on the left and the most significant on the right, help with the reasoning?
They do that in the paper
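For anyone wondering why reversed order might help, here is a small Python sketch of addition over least-significant-digit-first strings. `add_reversed` is a hypothetical helper written for this comment, not code from the paper. The point is that each output digit depends only on digits already seen plus a carry, so the result can be produced strictly left to right.

```python
def add_reversed(a_rev: str, b_rev: str) -> str:
    """Add two non-negative integers written least-significant digit first.

    Hypothetical helper: returns the sum in the same reversed order, emitting
    one digit per step with a single carry value kept between steps.
    """
    out: list[str] = []
    carry = 0
    for i in range(max(len(a_rev), len(b_rev))):
        da = int(a_rev[i]) if i < len(a_rev) else 0
        db = int(b_rev[i]) if i < len(b_rev) else 0
        carry, digit = divmod(da + db + carry, 10)
        out.append(str(digit))
    if carry:
        out.append(str(carry))
    return "".join(out)


if __name__ == "__main__":
    a, b = 123, 4567
    result_rev = add_reversed(str(a)[::-1], str(b)[::-1])
    print(result_rev, "->", result_rev[::-1])
```

With the usual most-significant-first order, the first digit of the answer can depend on a carry that starts at the far end of the number, which is awkward for something that writes its output one token at a time.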