by magusdei
2006 days ago
Wouldn't the empirical success of GPT-3 in simple programming tasks itself be evidence against this interpretation? Furthermore, GPT-3 is only a language model because it is trained on textual data. Transformer architectures simply map sequences to other sequences. It doesn't particularly matter what those sequences represent. GPT-2 has been used to complete images, for example: https://openai.com/blog/image-gpt/
|
Transformer architectures do map sequences to sequences. What is not known is whether the task of programming is a sequence problem. This experiment seems to suggest that maybe it's not.