> Also, I'm not sure if it even treated individual digits as separate tokens, but it might. Someone with API access can check.

Anyone can check; they have a tool for that[1]. It's mentioned in their FAQ article[2].

According to their tool, GPT-3 counts each of the following as one token:

- any combination of 3 or fewer digits
- 1111, 3333, 6666, 9999 (it tends to split other digits into groups of 2)
- 66666666 (8 sixes; 5, 6 or 7 won't work)
- 00000000 (8 zeros; any shorter run of zeros also counts as one token, probably to handle millions and billions)
- 0000000000000000 (16 zeros)

This isn't an exhaustive list; there are probably a lot of other weird edge cases I haven't tried. Its failure to understand basic arithmetic makes much more sense given how inconsistently digits are tokenized.

[1]: https://platform.openai.com/tokenizer
[2]: https://help.openai.com/en/articles/4936856-what-are-tokens-...
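
If you'd rather script this kind of check than paste strings into the web tool, here's a minimal sketch. The names `single_token_strings` and `toy_encode` are made up for illustration, and `toy_encode` is only a stand-in that chops text into 3-character chunks; it is not GPT-3's tokenizer and won't reproduce the edge cases listed above.

```python
from typing import Callable, Iterable, List, Sequence

# An encoder is anything that turns a string into a sequence of tokens.
Encoder = Callable[[str], Sequence]


def single_token_strings(encode: Encoder, candidates: Iterable[str]) -> List[str]:
    """Return the candidates that the given encoder maps to exactly one token."""
    return [s for s in candidates if len(encode(s)) == 1]


def toy_encode(text: str) -> List[str]:
    """Hypothetical stand-in encoder: fixed chunks of up to 3 characters.

    This is NOT how GPT-3 tokenizes; it only makes the sketch runnable
    without any third-party package.
    """
    return [text[i:i + 3] for i in range(0, len(text), 3)]


if __name__ == "__main__":
    digit_strings = ["7", "42", "123", "1111", "1234", "66666666", "00000000"]
    for s in digit_strings:
        # Print each string next to the number of tokens it splits into.
        print(s, len(toy_encode(s)))
    print(single_token_strings(toy_encode, digit_strings))
```

To test a real tokenizer, pass its encode function in place of `toy_encode`. For example, with the `tiktoken` package installed, `tiktoken.get_encoding("r50k_base").encode` should work, assuming `r50k_base` is the encoding the web tool uses for GPT-3; I haven't verified that the two agree on every string.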