by spherelot 632 days ago
> But it has fundamentally no clue about the characters that make up this word (unless someone trained it to do so or by using spurious additional relations that might exist in the training data).

That was my theory as well when I first saw the strawberry test. However, it is easy to test whether they know how to spell.

The most obvious test is:

> Can you spell "It is wonderful weather outside. I should go out and play.". Use capital letters, and separate each letter with a space.

The free-tier ChatGPT model is smart enough to follow the instructions below as well, which shows that it's not just simple words it can spell:

> I was wondering if you can spell. When I ask you a question, answer me with capital letters, and separate each word with a space. When there is real space between the letters, insert character '--' there, so the output is easier to read. Tell me how the attention mechanism works in the modern transformer language models.
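If you want to check the model's reply mechanically instead of by eye, here is a minimal sketch of what the expected output would look like, assuming the intent is one space between letters and '--' wherever the original text had a space. The function name `spell_out` and its `space_marker` parameter are made up for illustration.

```python
def spell_out(text: str, space_marker: str = "--") -> str:
    """Upper-case every character, put a space between characters,
    and replace each real space with space_marker (assumed to be '--')."""
    pieces = []
    for ch in text:
        # A real space in the input becomes the marker; everything else is upper-cased.
        pieces.append(space_marker if ch == " " else ch.upper())
    return " ".join(pieces)


print(spell_out("go out"))  # G O -- O U T
```

Comparing the model's answer to this string (after trimming whitespace) gives a rough pass/fail, though a model may reasonably format punctuation differently.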

Also, somebody pointed out in another HN thread that modern LLMs are perfect for dyslexic people, because you can misspell every single word and the model still understands you perfectly. Not sure how true this is, but at least a simple example seems to work:

> Hlelo, how aer you diong. Cna you undrestnad me?
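To try this on more than one hand-typed sentence, something like the following could generate scrambled inputs. It is only a sketch under my own assumptions: it swaps one pair of adjacent inner letters per word, keeps the first and last letters in place, and skips short words and words with punctuation attached. `swap_typos` is a hypothetical helper, not something from any library.

```python
import random


def swap_typos(text: str, seed: int = 0) -> str:
    """Swap one random pair of adjacent inner letters in each purely
    alphabetic word longer than 3 characters; leave other words alone."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    words = []
    for word in text.split(" "):
        letters = list(word)
        if len(letters) > 3 and word.isalpha():
            # Pick i so that neither the first nor the last letter moves.
            i = rng.randrange(1, len(letters) - 2)
            letters[i], letters[i + 1] = letters[i + 1], letters[i]
        words.append("".join(letters))
    return " ".join(words)


print(swap_typos("word"))  # wrod
```

Note that a swap of two identical letters (the "ll" in "Hello") leaves the word unchanged, so not every word is guaranteed to end up misspelled.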

It would be interesting to know if the datasets actually include spelling examples, or if the models learn how to spell from the massive number of spelling mistakes in the datasets.

1 comment

They can do this kind of thing, but in my experience, that makes the model feel "dumber" as far as quality of output goes (unless you make it produce normal output first before having it convert it to something else).