by lrei
926 days ago
GPT-4 was clearly trained to fix typos and handle poorly written requests. That much is visible just from ordinary use in the ChatGPT UI, and it fits common user scenarios (e.g. "fix my bad draft"). We know it was trained on social media data from Reddit, much of which is not great writing either. Now I'm wondering if it was trained on (imperfectly) OCRed data too...
I haven't read the paper so I'm not sure if they did this, but it would be interesting to see at what point it breaks down. Just scrambling up letters within words makes it pretty easy for the LLM; what if you also start moving letters between words, or take out the spaces between words?
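
For anyone who wants to try this, here is a rough sketch of those three perturbations in Python. The function names are made up for illustration, and this is not the paper's procedure (which I haven't checked), just one way to generate inputs of increasing difficulty to paste into a prompt.

```python
import random


def scramble_within_words(text, rng):
    """Shuffle the letters inside each word; word boundaries stay intact."""
    scrambled = []
    for word in text.split():
        letters = list(word)
        rng.shuffle(letters)
        scrambled.append("".join(letters))
    return " ".join(scrambled)


def move_letters_between_words(text, rng, moves=3):
    """Move up to `moves` random letters from one word into a different word."""
    words = [list(w) for w in text.split()]
    if len(words) < 2:
        return text
    for _ in range(moves):
        # Only take letters from words with 2+ letters so no word disappears.
        donors = [i for i, w in enumerate(words) if len(w) > 1]
        if not donors:
            break
        src = rng.choice(donors)
        dst = rng.choice([i for i in range(len(words)) if i != src])
        letter = words[src].pop(rng.randrange(len(words[src])))
        words[dst].insert(rng.randint(0, len(words[dst])), letter)
    return " ".join("".join(w) for w in words)


def remove_spaces(text):
    """Drop all whitespace so the model has to find the word boundaries."""
    return "".join(text.split())


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "the quick brown fox jumps over the lazy dog"
    print(scramble_within_words(sentence, rng))
    print(move_letters_between_words(sentence, rng))
    print(remove_spaces(scramble_within_words(sentence, rng)))
```

Stacking them (scramble, then move letters, then strip spaces) and raising `moves` would give a crude difficulty dial for finding where the model stops recovering the original text.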