| > It's not clear to me that larger models will solve this in the limit; take the way GPT3 fails on addition past a certain number, and the fundamental inability for transformers to learn certain algorithms. GPT-3 was OpenAI's exercise in how far pure scaling can get you. They used a method that was some two years old. Even at the point when they started training GPT-3, there were readily available remedies for many of its issues. Given how much they have energized the wider community, I'm sure even more focus will be given to improving language models in the following years. Some rough ideas right now: - People think that cherry-picking the best GPT-3 examples is cheating. Why? Train a model that selects the best examples for you. My proposal is to train a model that guesses whether a text was generated by GPT-3 or written by a human, then select the samples that look the most human-like. - Use a good search method to look for the best samples. Monte Carlo Tree Search? AlphaZero? MuZero? If MuZero can play Chess, Shogi, Go and all of Atari, then why should it not be able to play a game of "what word will come next"? - Hook up the language model to a search engine. Instead of writing a whole program yourself, why not copy-paste some stuff from StackOverflow with slight modifications? Etc. It doesn't address the issues with agency, grounding and multi-modality, but it's a good road map for the next 2-3 years. |
What you said is essentially: "Train a better GPT model". Humans have trouble distinguishing between (some of) GPT-3's output and human writing. The only way to build a classifier that can do this is to build a model that is better than GPT-3 at understanding text. It would need features currently absent in GPT-3, such as common sense and an understanding of the world (causality, physics, psychology, history, etc.). If what you say could be done, GPT-3 would have been designed as a GAN.
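
For concreteness, the "select the most human-like samples" idea amounts to best-of-n reranking: generate several candidates, score each one, keep the top few. Below is a minimal sketch of only the selection step, assuming some scoring function is available. The names `toy_human_likeness` and `select_best` are hypothetical, and the scorer is a crude distinct-word ratio standing in for a trained classifier. Building a real scorer is exactly the hard part under dispute here.

```python
from typing import Callable, List


def toy_human_likeness(text: str) -> float:
    """Hypothetical stand-in for a trained human-vs-GPT-3 classifier.

    Scores text by the fraction of distinct words, so repetitive text
    scores lower. A real discriminator would be a learned model.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    return len(set(words)) / len(words)


def select_best(
    samples: List[str], score: Callable[[str], float], k: int = 1
) -> List[str]:
    """Return the k highest-scoring samples (best-of-n selection)."""
    return sorted(samples, key=score, reverse=True)[:k]


# Made-up candidates standing in for sampled model outputs.
candidates = [
    "the cat the cat the cat the cat",
    "the cat sat quietly on the warm windowsill",
    "cat cat cat",
]
best = select_best(candidates, toy_human_likeness, k=1)
print(best[0])
```

The selection logic is trivial; all of the quality comes from `score`, so swapping in a stronger scorer is the only change that matters.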