by wrsh07 757 days ago
This is cool, but special casing digits is unsatisfying.

It makes me think that the authors have correctly identified an issue (positional embeddings) but don't propose a general solution.

I'm not sure if such a thing is possible, but if it is, it would feel more complete. (Fwiw, positional embeddings have had issues for a long time! So a general solution to this would benefit more than just arithmetic. Helpfully, we now have a really good specific example to serve as a baseline for any generalization we seek)
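For concreteness, here is a minimal sketch of what "special casing digits" could look like: each digit gets a position index counted from the start of its own number, and every other character gets index 0. The function name `digit_position_ids` and the character-level tokenization are assumptions made for illustration, not the authors' implementation.

```python
def digit_position_ids(text: str) -> list[int]:
    """Return one position id per character.

    Digits are numbered 1, 2, 3, ... from the start of the run of digits
    they belong to; every non-digit character gets 0.
    """
    ids = []
    run = 0  # number of digits seen so far in the current run
    for ch in text:
        if ch in "0123456789":
            run += 1
            ids.append(run)
        else:
            run = 0  # any non-digit ends the current number
            ids.append(0)
    return ids


print(digit_position_ids("12+345="))  # [1, 2, 0, 1, 2, 3, 0]
```

The hard-coded digit check is exactly the part that feels unsatisfying: nothing here would transfer to other kinds of structured spans.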

But it makes sense to have a different encoding. Mathematics is a completely different language. Maybe we should have more than one class of encodings.
There were some recent posts (either here or on Reddit) supporting the claim that different brain regions activate when reading programs vs. when reading text. If we take that to be true, and squint just enough, one could claim that arithmetic and mathematics should be treated differently from language.
Numeracy is definitely associated with different brain regions than just reading. See, e.g. https://www.sciencedirect.com/science/article/pii/S105381191...

(Dehaene also has a book, “The Number Sense”)

I would only find that satisfying (from a snobbish and impractical perspective) if we were able to have the model decide: 1) what encoding should this section use? 2) how should I train this encoding?

A mixture of experts but for encodings is interesting, though!
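As a rough sketch of that idea, assuming a soft gate (a hard top-1 router would be the other obvious choice): each token has several candidate position encodings, and a softmax over gate logits decides how to blend them. The names `mix_encodings`, `text_enc` and `digit_enc` are hypothetical, and in a real model the logits would be learned rather than passed in by hand.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_encodings(encodings: list[list[float]], gate_logits: list[float]) -> list[float]:
    """Blend candidate position encodings for one token using softmax gate weights."""
    weights = softmax(gate_logits)
    dim = len(encodings[0])
    return [sum(w * enc[i] for w, enc in zip(weights, encodings)) for i in range(dim)]


# Two hypothetical encodings of the same token: one "text" style, one "digit" style.
text_enc = [1.0, 0.0]
digit_enc = [0.0, 1.0]

# Equal logits give an even blend of the two encodings.
print(mix_encodings([text_enc, digit_enc], [0.0, 0.0]))  # [0.5, 0.5]
```

This only covers question 1 (which encoding to use); it says nothing about how each encoding would be trained.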

Maybe there's a clean way to implement it.

For arbitrary documents and queries, how do we reliably segment the text between those two different languages? And if we can do that, why can't the model do it implicitly?