Hacker News
by clementneo 1215 days ago
I think there's probably some truth to this. With InstructGPT, where they trained the model to follow instructions better (roughly the jump from GPT-3 to ChatGPT), they found that the model also learnt to follow non-English instructions, even though the extra training was done almost exclusively in English[1].

So there seem to be emergent mechanisms in the model that have arisen from the end-to-end training, and that we don't exactly understand yet.

[1] https://twitter.com/janleike/status/1625207251630960640