Hacker News
by magusdei 2006 days ago
Wouldn't the empirical success of GPT-3 in simple programming tasks itself be evidence against this interpretation?

Furthermore, GPT-3 is only a language model because it is trained on textual data. Transformer architectures simply map sequences to other sequences. It doesn't particularly matter what those sequences represent. GPT-2 has been used to complete images, for example: https://openai.com/blog/image-gpt/
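To make the "sequences to sequences" point concrete, here is a minimal sketch of how text and an image can both be reduced to the same kind of input: a flat list of integers with next-token training pairs. The function names (`text_to_tokens`, `image_to_tokens`, `next_token_pairs`) are hypothetical, and the byte-level and raster-scan encodings are simplifying assumptions for illustration, not a description of how GPT-3 or Image GPT actually tokenize their data.

```python
def text_to_tokens(text: str) -> list[int]:
    # Assumed encoding: each UTF-8 byte becomes one integer token in [0, 255].
    return list(text.encode("utf-8"))


def image_to_tokens(pixels: list[list[int]]) -> list[int]:
    # Assumed encoding: raster-scan flatten of a grayscale image,
    # rows concatenated top to bottom, each pixel value used as a token.
    return [p for row in pixels for p in row]


def next_token_pairs(tokens: list[int]) -> list[tuple[list[int], int]]:
    # Build (context, target) pairs for next-token prediction.
    # The same function works regardless of what the tokens stand for.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


text_tokens = text_to_tokens("hi")
image_tokens = image_to_tokens([[0, 255], [128, 64]])

# Both modalities end up as plain integer sequences with the same training shape.
text_pairs = next_token_pairs(text_tokens)
image_pairs = next_token_pairs(image_tokens)
```

Once both inputs are integer sequences, nothing in the pair-building step depends on whether the tokens came from characters or pixels, which is the sense in which the architecture is indifferent to what the sequence represents.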

1 comment

Empirical success shows that the GPT-3 model has seen the sequence before (maybe many times).

Transformer architectures do map sequences to sequences. What is not known is whether the task of programming is a sequence problem. This experiment seems to suggest that maybe it's not a sequence problem.