|
|
|
|
|
by famouswaffles
757 days ago
|
|
>What is the point of this work? 99% on 100-digit arithmetic means there's a 0% chance anyone will ever use a Transformer as an ALU or anything of the kind. We already know how to hard-code a (literally) infinitely more accurate addition machine. Nobody's going to be replacing calculators with transformers sure but many are and will be using transformers to solve problems arithmetic is a necessary component of. >So what is the point of this? Transformers are supposed to be the "sparks of AGI" and they can almost do arithmetic if we try very hard to shove it down their heads? Who cares? You don't need to shove anything down for transformers to get arithmetic. Just changing how numbers are tokenized works. But that requires an entire retrain so why not explore other techniques? And what does any of this have to do with AGI ? You know how terrible humans are at arithmetic right ? |
|
E.g., take chess. Modelling a game of chess as a game tree and searching the game tree by adversarial search is a human invention. Humans are pretty crap at searching a game tree beyond a handful of ply, but we can program a computer to go dozens of ply deep across thousands of branches, and beat any human.
So the challenge for AI is not to get computers to calculate when we know how the calculation is to be performed. The challenge is to get computers to create their own models. And that's a grand, open challenge that is not even close to be solved, certainly not by LLMs. Yann LeCun and Yoshua Bengio have said similar things.
The linked work doesn't move the needle any closer to that and it just shows progress in calculating arithmetic using a transformer, which we already know how to do in a myriad different ways and much more accurately. Hence my criticism for it.