| HN Mirror


	by glitchc 1014 days ago
	What about the ketchup test? Ask it to tell you how many times the letter e appears in the word ketchup. Llama always tells me it's two.

4 comments

aqme28 1014 days ago

Spelling challenges are always going to be inherently difficult for a token-based LM. It doesn't actually "see" letters. It's not a good test for performance (unless this is actually the kind of question you're going to ask it regularly).
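To illustrate the point about tokens, here is a minimal sketch of what a model might receive instead of letters. The vocabulary, the token IDs, and the `toy_tokenize` helper are all made up for illustration; real tokenizers (BPE and similar) learn their vocabularies from data and split text differently.

```python
# Hypothetical toy vocabulary: these pieces and IDs are invented for this example.
TOY_VOCAB = {
    "ket": 101, "chup": 102,
    "k": 1, "e": 2, "t": 3, "c": 4, "h": 5, "u": 6, "p": 7,
}

def toy_tokenize(text: str) -> list[int]:
    """Split text into toy token IDs using greedy longest-match."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return ids

print(toy_tokenize("ketchup"))  # prints [101, 102]
```

Under this assumed vocabulary the model gets two IDs, and nothing in `[101, 102]` directly says that the word contains exactly one "e".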

gsuuon 1014 days ago

I've found it's more reliable to ask it to write some javascript that returns how many letters are in a word. Works even with Llama 7b with some nudging.
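For reference, a rough sketch of the kind of helper one might ask the model to write. The comment above mentions JavaScript; this version uses Python instead, and the `count_letter` name and its case-insensitive behavior are assumptions, not anything a particular model produced.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return word.lower().count(letter.lower())

print(count_letter("ketchup", "e"))  # prints 1
```

Running code sidesteps the tokenization issue, because the counting happens over actual characters rather than over tokens.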

ttul 1014 days ago

Falcon fails. GPT-3.5 also fails this test. GPT-4 gets it right. I suspect that GPT-4 is just large enough to have developed a concept of counting, whereas the others are not. Alternatively, it's possible that GPT-4 has memorized the answer from its more extensive training set.

mk67 1013 days ago

It's not possible for an LLM to count letters; it only "sees" tokens.

neel8986 1014 days ago

Bard can also give the correct result.
