Hacker News
by ekez 814 days ago
The metric the authors use confuses me.

Edit distance seems like a strange way to test whether the model understands arithmetic ([1], Figure 3). By that metric, I think `1+3=3` would be just as correct as `1+1=9`?

Why not consider how far off the model is, e.g. `abs(actual-expected)`? I wonder if there is an inflection point with that metric.
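To make the difference concrete, here is a rough sketch that scores the two example answers both ways. The helper names (`edit_distance`, `compare_metrics`) are made up for illustration, and the edit distance here is plain character-level Levenshtein on the decimal strings, which may not match exactly what the paper computes.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def compare_metrics(expected: int, predicted: int) -> tuple[int, int]:
    """Return (edit distance on decimal strings, absolute numeric error)."""
    return edit_distance(str(expected), str(predicted)), abs(predicted - expected)


# 1+3 answered as 3, and 1+1 answered as 9
print(compare_metrics(4, 3))  # (1, 1)
print(compare_metrics(2, 9))  # (1, 7)
```

Both answers are one edit away from the truth, but the absolute error separates them (1 versus 7).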

[1] https://arxiv.org/abs/2206.07682


It depends on how you do arithmetic. If you're a human doing column addition, answering 12345+35791 with 58136 is just as big a mistake as answering 48146 (the actual result is 48136): exactly one column is wrong in both cases. Binary half-adders work the same way.
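A quick sketch of that view, counting wrong columns next to the absolute error. `wrong_columns` is a hypothetical helper, and it assumes non-negative integers aligned from the least significant digit.

```python
def wrong_columns(expected: int, predicted: int) -> int:
    """Count decimal columns that differ, aligned at the ones digit."""
    e, p = str(expected), str(predicted)
    width = max(len(e), len(p))
    # Left-pad with zeros so columns line up from the right.
    e, p = e.zfill(width), p.zfill(width)
    return sum(1 for de, dp in zip(e, p) if de != dp)


expected = 12345 + 35791  # 48136
for guess in (58136, 48146):
    print(guess, wrong_columns(expected, guess), abs(guess - expected))
# 58136 1 10000
# 48146 1 10
```

By column count the two guesses tie at one mistake each, while the absolute error differs by a factor of 1000.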

We don't really know how LLMs do arithmetic. Maybe token edit distance would be interesting, but either way it doesn't really change the claim of the paper.

Unrelated: the link is incorrect. The one you're referring to is here: https://arxiv.org/pdf/2304.15004.pdf

Yeah, and as an aside, I wonder how hard it would be to train an LLM to do addition, multiplication, etc., human style? Presumably it should be possible, at least in a step-by-step style (as a substitute for short-term memory), the same way that we do it.
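For what it's worth, the kind of step-by-step trace I have in mind could look something like the sketch below. The function name and the wording of each step are invented here, not taken from any paper or dataset; it only handles non-negative integers.

```python
def column_addition_trace(a: int, b: int) -> tuple[list[str], int]:
    """Add two non-negative integers column by column, recording each step."""
    xs, ys = str(a)[::-1], str(b)[::-1]  # least significant digit first
    steps, digits, carry = [], [], 0
    for i in range(max(len(xs), len(ys))):
        da = int(xs[i]) if i < len(xs) else 0
        db = int(ys[i]) if i < len(ys) else 0
        total = da + db + carry
        steps.append(
            f"column {i}: {da} + {db} + carry {carry} = {total}, "
            f"write {total % 10}, carry {total // 10}"
        )
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        # A leftover carry becomes a new leading digit.
        steps.append(f"final carry: write {carry}")
        digits.append(str(carry))
    return steps, int("".join(reversed(digits)))


steps, result = column_addition_trace(12345, 35791)
print("\n".join(steps))
print(result)  # 48136
```

Each line of the trace plays the role of the scratch paper: the carry is written down rather than held in memory.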

Without using an algorithmic approach, it seems an LLM can only learn a bunch of partially correct heuristics, and attempt to generalize over examples.

I've played with this a bit in the past, and came to the conclusion that GPT-3 seems to have learnt to compare the size of numbers (whether accurately or via heuristics), and would get the approximate size of an answer right (depending on the task), even when it got the actual value wrong. I seem to recall it also doing this for tasks like asking for a prime number greater than a particular value.

I mean, is it efficient to teach them addition human style, instead of heuristics for when to call the right function?

Imagine you could say 'calc' in your brain, and some separate subcomponent of your brain that is far more power efficient could return an answer almost instantly. You would not focus on understanding addition/subtraction; you would focus more on when to use addition, subtraction, multiplication, or whatever.
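As a toy version of that idea: suppose the model only has to emit a marker like `calc(12345+35791)` and a separate component fills in the value. The `calc(...)` marker syntax and both function names below are assumptions made up for this sketch, not how any particular model or framework does tool calls.

```python
import ast
import operator
import re

# Only integer +, -, and * are supported in this sketch.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def calc(expression: str) -> int:
    """Evaluate an integer expression by walking its syntax tree."""
    def ev(node):
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)


def resolve_calls(model_output: str) -> str:
    """Replace each hypothetical calc(...) marker with its computed value."""
    return re.sub(r"calc\(([^()]*)\)",
                  lambda m: str(calc(m.group(1))),
                  model_output)


print(resolve_calls("The sum is calc(12345+35791)."))  # The sum is 48136.
```

The model's job shrinks to deciding when to write the marker and what expression to put inside it.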

> is it efficient to teach them addition human style

Not if math is your only goal, but there'd be value in making the models more powerful so that they could learn to do simple things like this (and not so simple things) by themselves. You can't have a tool for everything, and hopefully future AGI can itself do things that are more than just a mashup of existing tool capabilities.