| HN Mirror



	by magusdei 2053 days ago
	Wouldn't the empirical success of GPT-3 in simple programming tasks itself be evidence against this interpretation? Furthermore, GPT-3 is only a language model because it is trained on textual data. Transformer architectures simply map sequences to other sequences. It doesn't particularly matter what those sequences represent. GPT-2 has been used to complete images, for example: https://openai.com/blog/image-gpt/

1 comment

nutanc 2053 days ago

Empirical success shows that the GPT-3 model has seen the sequence before (maybe many times).

Transformer architectures do map sequences to sequences. What is not known is whether the task of programming is a sequence problem. This experiment seems to suggest that maybe it's not a sequence problem.
