HN Mirror



	by HarHarVeryFunny 515 days ago
	No - you can give the LLM a list of letters and it STILL won't be able to count them reliably, so you are guessing wrong about where the difficulty lies. Try asking Claude: how many 'r's are in this list (just give me a number as your response, nothing else) : s t r a w b e r r y


danielmarkbruce 515 days ago

How many examples like that do you think it's seen? You can't give an example of something that is in effect a trick to get character-level tokenization and then expect it to do well when it's seen practically zero such data in its training set.

Nobody who suggests methods like character- or byte-level 'tokenization' suggests that a model trained on current tokenization schemes should be able to do what you are suggesting. They are suggesting actually training it on characters or bytes.

You say all this as though I'm suggesting something novel. I'm not. Appealing to authority is kinda lame, but maybe see Andrej's take: https://x.com/karpathy/status/1657949234535211009


HarHarVeryFunny 514 days ago

So, one final appeal to logic from me here:

1) You must have tested and realized that these models can spell just fine - break a word into a letter sequence, regardless of how you believe they are doing it

2) As shown above, even when presented with a word already broken into a sequence of letters, the model STILL doesn't always count the occurrences of a given letter correctly. You can argue about WHY it fails (different discussion), but regardless it does (if only allowed to output a number).

Now, "how many r's in strawberry", unless memorized, is accomplished by breaking it into a sequence of letters (which it can do fine), then counting the letters in the sequence (which it fails at).

So, you're still sticking to your belief that creating the letter sequence (which it can do fine) is the problem ?!!

Rhetorical question.
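For reference, here is the two-step decomposition from the argument above as a minimal Python sketch. The function names `spell` and `count_letter` are made up for illustration, and this says nothing about how a model does either step internally; it only pins down what the correct answer to each step is.

```python
def spell(word: str) -> list[str]:
    """Step 1: break a word into its sequence of letters."""
    return list(word)


def count_letter(letters: list[str], target: str) -> int:
    """Step 2: count how many entries in the sequence equal the target letter."""
    return sum(1 for ch in letters if ch == target)


letters = spell("strawberry")
print(" ".join(letters))           # s t r a w b e r r y
print(count_letter(letters, "r"))  # 3
```

The disagreement in this thread is about which of these two steps the model gets wrong, not about what the right output is.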


HarHarVeryFunny 514 days ago

Tasks like reversing a list (Karpathy) or counting categories within it are far harder than simple prediction - the one thing LLMs are built to do.

Try it for yourself. Try it on a local model if you are paranoid that the cloud model is using a tool behind your back.
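If you want to run the experiment on more than one word, a rough Python sketch for building the pre-spelled prompt and grading a number-only reply could look like this. `build_prompt` and `is_correct` are hypothetical helper names, and how you actually send the prompt to a cloud or local model is left out, since that depends entirely on your setup.

```python
import re


def build_prompt(word: str, letter: str) -> str:
    """Build the spaced-letter prompt used earlier in this thread."""
    spaced = " ".join(word)
    return (
        f"how many '{letter}'s are in this list "
        f"(just give me a number as your response, nothing else) : {spaced}"
    )


def is_correct(reply: str, word: str, letter: str) -> bool:
    """Accept the reply only if it is a bare number equal to the true count."""
    match = re.fullmatch(r"\s*(\d+)\s*", reply)
    return match is not None and int(match.group(1)) == word.count(letter)


print(build_prompt("strawberry", "r"))
print(is_correct("3", "strawberry", "r"))  # True
print(is_correct("2", "strawberry", "r"))  # False
```

Rejecting anything that isn't a bare number keeps the test honest: the model doesn't get to write out intermediate steps before answering.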
