by DougBTX
1214 days ago
For LLMs like ChatGPT, one issue with doing arithmetic is that the input is tokenised, so the model doesn't "see" the individual digits in numbers. That makes it harder for it to learn addition, multiplication, etc. The tokenizer page at https://platform.openai.com/tokenizer shows what the inputs to the model might look like. For example, the text "123456789" is tokenised as "123", "45", "67", "89", and the actual input to the model would be the token IDs [10163, 2231, 3134, 4531], whereas the text "1234" is tokenised as "12", "34" with IDs [1065, 2682]. Note that the digit "3" ends up inside the token "123" in one case and "34" in the other. Learning how these tokens relate in terms of individual digits is pretty hard, because the model never gets to see the individual digits.

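To make the tokenisation point concrete, here is a rough sketch of a byte-pair-style tokeniser in Python. The merge order (`MERGES`) is hypothetical, picked only so that the toy reproduces the two splits quoted above; the IDs are the ones from the comment. It is not the real tokenizer, whose merge table is far larger and learned from data.

```python
# Toy byte-pair-style tokeniser. MERGES is a made-up merge order chosen
# only to reproduce the two example splits; it is not the real merge table.
MERGES = [("1", "2"), ("4", "5"), ("3", "4"), ("12", "3"), ("6", "7"), ("8", "9")]
RANK = {pair: rank for rank, pair in enumerate(MERGES)}

# Token IDs as quoted in the comment (only these six tokens are covered).
TOKEN_IDS = {"123": 10163, "45": 2231, "67": 3134, "89": 4531, "12": 1065, "34": 2682}


def tokenise(text):
    """Repeatedly merge the adjacent pair with the lowest merge rank."""
    parts = list(text)  # start from single characters
    while True:
        candidates = [
            (RANK[(a, b)], i)
            for i, (a, b) in enumerate(zip(parts, parts[1:]))
            if (a, b) in RANK
        ]
        if not candidates:
            return parts
        _, i = min(candidates)
        parts[i:i + 2] = [parts[i] + parts[i + 1]]


def encode(text):
    """Map tokens to IDs; raises KeyError for tokens outside the toy table."""
    return [TOKEN_IDS[token] for token in tokenise(text)]


for text in ("123456789", "1234"):
    print(text, tokenise(text), encode(text))
```

The model only ever receives the ID lists, so nothing in its input says that 10163 and 2682 both contain the digit "3".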
I see it as analogous to asking a human why they don't just "learn all the answers to simple arithmetic involving integers below 10,000": you possibly could, but it would be a huge waste of time when you can instead learn the algorithm directly. Of course, LLMs are inherently a layer on top of an existing system that already solves those problems quite well, so it would be somewhat silly there too.
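As a rough illustration of "learning the algorithm directly", here is a sketch of schoolbook addition over digit strings in Python. The function name `add_digit_strings` is made up for this example. For scale, a lookup table covering just addition for every pair of integers below 10,000 would need 10,000 × 10,000 = 100,000,000 entries, while the algorithm is a few lines and works for any length.

```python
def add_digit_strings(a: str, b: str) -> str:
    """Add two non-negative integers given as decimal digit strings."""
    digits = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    # Walk from the least significant digit, carrying as you would on paper.
    while i >= 0 or j >= 0 or carry:
        da = int(a[i]) if i >= 0 else 0
        db = int(b[j]) if j >= 0 else 0
        carry, digit = divmod(da + db + carry, 10)
        digits.append(str(digit))
        i -= 1
        j -= 1
    return "".join(reversed(digits)) or "0"


print(add_digit_strings("9999", "1"))
print(add_digit_strings("123456789", "1234"))
```

Note that this only works because the code gets to index individual digits, which is exactly what a tokenised input hides from the model.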