|
|
|
|
|
by pedrovhb
809 days ago
|
|
My intuition is that a significant challenge for LLMs' ability to do arithmetics has to do with tokenization. For instance, `1654+73225` as per the OpenAI tokenizer tool breaks down into `165•4•+•732•25`, meaning the LLM is incapable of considering digits individually; that is, "165" is a single "word" and its relationship to "4" and in fact each other token representing a numerical value has to be learned. It can't do simple carry operations (or other arithmetic abstractions humans have access to) in the vast majority of cases because its internal representation of text is not designed for this. Arithmetic is easy to do in base 10 or 2 or 16, but it's a whole lot harder in base ~100k where 99% of the "digits" are words like "cat" or "///////////". Compare that to understanding arbitrary base64-encoded strings; that's much harder for humans to do without tools. Tokenization still isn't _the_ greatest fit for it, but it's a lot more tractable, and LLMs can do it no problem. Even understanding ASCII art is impressive, given they have no innate idea of what any letter looks like, and they "see" fragments of each letter on each line. So I'm not sure if I agree or disagree with you here. I'd say LLMs in fact have very impressive capabilities to learn logical structures. Whether grammar is the problem isn't clear to me, but their internal representation format obviously and enormously influences how much harder seemingly trivial tasks become. Perhaps some efforts in hand-tuning vocabularies could improve performance in some tasks, perhaps something different altogether is necessary, but I don't think it's an impossible hurdle to overcome. |
|
The tokens are just the input - the internal representation can be totally different (and that format isn't tokens).