by lrei 926 days ago
GPT-4 was clearly trained to fix typos and handle poorly written requests. That much is visible just from using it in the ChatGPT UI, and it fits common user scenarios (e.g. fix my bad draft). We know it was trained on social media data from Reddit, much of which is not great writing either. Now I'm wondering if it was trained on (imperfectly) OCRed data too...
5 comments

I wonder if it's more of an emergent property you get for free with LLMs rather than something that needs specific training. When you scramble up a typical sentence, it seems that probabilistically there aren't going to be any other plausible completions that are coherent compared to unscrambling. It's basically unscrambling vs. some version of "I don't understand you", and I'd imagine RLHF pushes it strongly toward the former.

I haven't read the paper so I'm not sure if they did this, but it would be interesting to see at what point it breaks down. Just scrambling up letters within words makes it pretty easy for the LLM; what if you also start moving letters between words, or take out the spaces between words?
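
To make the difficulty levels concrete, here is a rough Python sketch of the three perturbations mentioned above: shuffling letters within each word, shuffling letters across words, and removing spaces. The function names are made up for illustration, and this is not necessarily how the paper built its examples.

```python
import random

def scramble_within_words(text, rng):
    """Shuffle the letters inside each whitespace-separated word."""
    out = []
    for word in text.split():
        letters = list(word)
        rng.shuffle(letters)
        out.append("".join(letters))
    return " ".join(out)

def scramble_across_words(text, rng):
    """Shuffle all letters across the sentence, keeping the original word lengths."""
    words = text.split()
    letters = [c for w in words for c in w]
    rng.shuffle(letters)
    out, i = [], 0
    for w in words:
        out.append("".join(letters[i:i + len(w)]))
        i += len(w)
    return " ".join(out)

def drop_spaces(text):
    """Remove all whitespace so word boundaries are lost."""
    return "".join(text.split())

rng = random.Random(0)  # fixed seed so runs are repeatable
sentence = "the quick brown fox jumps over the lazy dog"
print(scramble_within_words(sentence, rng))
print(scramble_across_words(sentence, rng))
print(drop_spaces(scramble_within_words(sentence, rng)))
```

Feeding each variant to a model and checking when recovery stops working would be one way to find the breaking point.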

> Now I'm wondering if it was trained on (imperfectly) OCRed data too...

Or perhaps they inserted typos automatically in the training set as data augmentation. Tactics like that are known to increase the robustness of some models, so why not?

Yup, totally plausible. Things like word (token) dropout, inserting random uniform noise into embeddings, or just edit-distance perturbations to the tokens are all well known, but Figure 1 still looks extremely impressive.
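
For anyone who hasn't seen this kind of augmentation, a minimal sketch of word dropout plus single-edit typo injection might look like the following. The function names and the default rates are assumptions chosen for illustration; nothing here is known to match how GPT-4 or any particular model was actually trained.

```python
import random
import string

def perturb_word(word, rng):
    """Apply one random character-level edit: swap, delete, insert, or substitute."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["swap", "delete", "insert", "substitute"])
    if op == "swap":
        # transpose the characters at positions i and i + 1
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "delete":
        return word[:i] + word[i + 1:]
    c = rng.choice(string.ascii_lowercase)
    if op == "insert":
        return word[:i] + c + word[i:]
    return word[:i] + c + word[i + 1:]  # substitute

def augment(text, rng, typo_rate=0.2, dropout_rate=0.1):
    """Drop some words entirely and inject a typo into some of the rest.

    The rates are hypothetical defaults, not values from any paper.
    """
    out = []
    for word in text.split():
        r = rng.random()
        if r < dropout_rate:
            continue  # word dropout
        if r < dropout_rate + typo_rate:
            word = perturb_word(word, rng)
        out.append(word)
    return " ".join(out)

rng = random.Random(0)  # fixed seed so runs are repeatable
print(augment("please fix the typos in this short draft", rng))
```

The training target would stay the clean text, so the model learns to map noisy input to the intended words.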

> trained to fix typos

It is trained on data which may include typos, but that is very different from fixing typos. It knows what words likely come after typos in the same way it knows what words likely come after regular words.

No, that's not what I meant. I meant that in its reinforcement learning phase, GPT saw examples of "fix this text" style requests and was rewarded for doing a good job. That's different from seeing examples of typos and still predicting the right word, which happens during the language model's self-supervised training. Both likely help it be good at it.
Non-RLHF models can do this just fine.

Even non-finetuned 7B models, 3 orders of magnitude smaller than GPT-4, can unscramble text and fix typos reliably.

Half, or better, of the things people discover "GPT-4 can do" can be done with non-RLHF GPT-3 from 2020 or with a model 1000x smaller.

Have a look at the examples in the PDF. It's not typos/spelling errors/OCR errors, it's anagrams.