by Imnimo 1202 days ago
I'd be curious to see more examination of the questions and answers at the token level, rather than by counting digits or calculating percentage error. For example, according to https://platform.openai.com/tokenizer, 727941 + 761830 is tokenized as 7|279|41 + 76|18|30. The answer given was 1589771 (as opposed to the correct 1489771). To me that looks like it correctly added 41 and 30, but had trouble with the mismatched tokenizations of 7|279 and 76|18. I wonder if that sort of pattern would hold in general?
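A rough sketch of what that token-level check could look like in Python. The token splits are hard-coded from the example above rather than fetched from any tokenizer, and `digit_diffs` and `token_at_place` are hypothetical helper names made up for illustration, so treat this as one possible way to line things up, not a definitive method.

```python
def digit_diffs(expected, got):
    """List (place, expected_digit, got_digit) for each differing digit.

    place counts from the right: 0 = ones, 1 = tens, and so on.
    """
    e, g = str(expected), str(got)
    width = max(len(e), len(g))
    e, g = e.zfill(width), g.zfill(width)
    return [(width - 1 - i, e[i], g[i]) for i in range(width) if e[i] != g[i]]


def token_at_place(chunks, place):
    """Return the chunk of a split number that covers the given place value."""
    low = 0
    for chunk in reversed(chunks):  # walk from the ones digit leftwards
        high = low + len(chunk)
        if low <= place < high:
            return chunk
        low = high
    return None


# Splits copied by hand from the comment above (assumed, not recomputed).
a_tokens = ["7", "279", "41"]
b_tokens = ["76", "18", "30"]

expected = 727941 + 761830
model_answer = 1589771

for place, want, got in digit_diffs(expected, model_answer):
    print(
        f"place 10^{place}: expected {want}, got {got}; "
        f"operand tokens there: {token_at_place(a_tokens, place)!r} "
        f"and {token_at_place(b_tokens, place)!r}"
    )
```

Running this over a whole batch of wrong answers would show whether the wrong digits tend to land where the two operands' token boundaries disagree.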

The "edit distance" errors would seem to be the tell that yours is the better explanation, perhaps along with trouble telling apart numbers (tokens) that sit a small edit distance from each other in latent space.

In "A operator B equals C", any of the three numbers can get compressed into the same space as a 'close' number or numbers from some other problem out on the web. So while the author googled the exact expressions he asked, I wouldn't expect those to be found verbatim when the answers are wrong; rather, to your point, the author should web search for the individual tokens, or for problems within, say, +/- 3 of each digit of each token, in all permutations.
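To make the "+/- 3 for each digit" idea concrete, here is a minimal sketch that enumerates the nearby variants of a single token. The function name `nearby_tokens` is made up, and clamping digits to 0-9 (no wraparound) is my own reading of the suggestion, not something stated above.

```python
from itertools import product


def nearby_tokens(token, radius=3):
    """All digit strings whose digits are each within `radius` of the original.

    Assumption: digits are clamped to 0..9 rather than wrapping around.
    The original token is included in the result.
    """
    choices = []
    for ch in token:
        d = int(ch)
        low, high = max(0, d - radius), min(9, d + radius)
        choices.append([str(v) for v in range(low, high + 1)])
    return ["".join(combo) for combo in product(*choices)]


variants = nearby_tokens("41")
print(len(variants), variants[:5])
```

Even a two-digit token yields dozens of variants, and combining variants across every token of both operands multiplies those counts together, so an actual search would probably need to sample rather than try every permutation.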