Hacker News
by 2OEH8eoCRo0 1271 days ago
I asked it to count the letters in a bunch of gibberish words and it got them all correct.

My mental model is that if you give it real words, it uses approximately one token per word, and it may or may not know how many letters are in the word. It will only have learned the letter count if that information was in its training data, just like any other fact it learns about words. It is not counting the letters.

If you give it a gibberish word, it will represent it with roughly one letter per token, so it can more or less count the tokens to figure out how many letters there are.

So this ends up looking like it can count letters in most words, real and fake. Perhaps it would do poorly with real but uncommon words.
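
To make that mental model concrete, here is a toy sketch. It is not a real tokenizer: the vocabulary and the `tokenize` function are made up for illustration, and real BPE tokenizers often split gibberish into multi-letter pieces rather than single letters. It only shows the idea that known words collapse into a few tokens while unknown strings fall back to roughly one token per letter.

```python
# Toy vocabulary: hypothetical, not taken from any real tokenizer.
VOCAB = {"straw", "berry", "count", "letter", "token", "word"}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split; unknown text falls back to single characters."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that starts at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No entry matched, so emit one character as its own token.
            tokens.append(text[i])
            i += 1
    return tokens


for w in ["strawberry", "token", "qzxvkj"]:
    toks = tokenize(w)
    print(w, toks, "tokens:", len(toks), "letters:", len(w))
```

In this toy setup the token count matches the letter count only for the gibberish string, which is the case where "counting tokens" would give the right answer.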

>more or less count tokens

Which is what I meant by "approximate": it can "count" the number of tokens.