by strbean 309 days ago
How is counting letters a measure of understanding, rather than a rote process?

LLMs struggle with this because they literally aren't thinking in English. Their input is tokenized before it reaches them, so they see token IDs rather than letters. It's like asking a Chinese speaker "How many Rs are there in the word 草莓".
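To make the tokenization point concrete, here is a rough sketch with a toy tokenizer. The vocabulary, the IDs, and the greedy longest-match rule are all made up for illustration; real tokenizers (BPE and friends) split text differently, but the effect is similar: the model receives a short list of integers, and the individual letters are not visible in it.

```python
# Toy vocabulary: the pieces and IDs are hypothetical, chosen for illustration,
# and do not come from any real model's tokenizer.
TOY_VOCAB = {
    "straw": 101, "berry": 102,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}

def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text into token IDs using greedy longest-match (a simplification)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate piece starting at position i first.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

ids = toy_tokenize("strawberry")
print(ids)  # [101, 102]
```

Under these assumptions the word arrives as two opaque IDs, and nothing in [101, 102] says how many Rs were in the original string.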

2 comments

It shows understanding that words are made up of letters and that they can be counted.

Since tokens are atomic, which I didn't realize earlier, maybe it's still intelligent if it can realize that it can extract the result by writing len([b for b in word if b == my_letter]) and decide on its own to return that value.
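For reference, a runnable version of that one-liner might look like the sketch below. The function name count_letter and the case-insensitive comparison are my own assumptions, not something from the comment above.

```python
def count_letter(word, letter):
    """Count occurrences of a single letter in word, ignoring case."""
    target = letter.lower()
    # Same idea as len([b for b in word if b == my_letter]),
    # but without building an intermediate list.
    return sum(1 for ch in word.lower() if ch == target)

print(count_letter("strawberry", "r"))  # 3
```

For the simple case-sensitive version, the built-in word.count(my_letter) gives the same count.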

But why doesn’t the LLM reply “I can’t solve this task because I see text as tokens”, rather than give a wrong answer?