by michaelt
757 days ago
> What is the point of this work? [...] We already know how to hard-code a (literally) infinitely more accurate addition machine.

There are many situations where it is useful for the LLM to get basic arithmetic right.

For example, if someone asks your LLM to explain this line of code [1], which takes a 28x28 px input image, is the right explanation that 28×28÷4×64=9216? Or is that the wrong explanation?

And being able to get 100-digit arithmetic right 99% of the time might make us feel reassured that the 4-digit arithmetic we need from the model will be right an even higher percentage of the time.

[1] https://github.com/pytorch/examples/blob/37a1866d0e0118875d5...
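
The arithmetic in the example can be checked mechanically. Here is a minimal sketch that first evaluates the quoted expression as written, then computes a flattened feature size under an assumed architecture. That architecture (two unpadded 3x3 convolutions, then a 2x2 max pool, with 64 output channels) is an assumption for illustration and is not confirmed by the truncated link. `flattened_size` is a hypothetical helper, not something from the linked code.

```python
# The expression exactly as quoted: 28 x 28 / 4 x 64
claimed = 28 * 28 // 4 * 64
print(claimed)  # 12544, not 9216


def flattened_size(side, kernel=3, convs=2, pool=2, channels=64):
    """Hypothetical helper: flattened size after `convs` unpadded
    stride-1 convolutions, one max pool, and `channels` feature maps.
    The layer layout is an assumption, not taken from the linked code."""
    for _ in range(convs):
        side -= kernel - 1  # an unpadded 3x3 conv trims 2 px per side length
    side //= pool           # a 2x2 max pool halves each side
    return side * side * channels


print(flattened_size(28))  # 9216
```

Under those assumptions the 9216 comes from 12×12×64, so the explanation quoted in the question would be the wrong one, even though the final number looks plausible.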