The next AIs will be trained on vast swathes of low-quality AI-generated output if they are trained on public data again. Presumably people will have to come up with ways to work around that, or the AI will be trained to produce output like a low-quality AI.
By low quality I just mean the state of the outputs today, which are incredible for what they are, but are definitely not the pinnacle of what is in theory possible.
Anybody who uses these AI assistants knows that the human is still by far the main architect and driver of the code base.
Increasingly advanced AI just means more back-and-forth between coder and AI, each increasing the other's velocity. AI won't just be trained on other AI-generated code, but on something more like "cyborg" code: code that was made by AI and human together, and that the human probably wouldn't have been able to produce, at least not as quickly or in as much volume, without the AI.
Rather than a singularity we might see a "multilarity", where human and AI become increasingly useful to each other. A situation that takes full advantage of diversity in ways of thinking about and processing information and knowledge.
How will they be able to keep purely AI-generated outputs from being fed back in as inputs? That seems hard to separate out once it’s published and not attributed. The ability of AI to generate lots of output means it might swamp human or cyborg outputs when looking at the corpus of publicly searchable code (or blog posts, or whatever the training data is for the case in question).
Maybe something like the discriminator from a GAN could detect and filter out AI-generated content? Not sure if that’s possible or not.
Well I'm not saying that we should put effort into forcing AI to not train on purely AI-generated work.
All I'm saying is that I believe humans are gonna produce more and better code with the help of AI, and that AI models trained on a mix of human and AI-generated code will likely result in smarter AI that is also more receptive to cultural changes.
I think it's gonna happen through social evolution. It's not something we actively need to work towards.
Execute the code to see if it passes the tests. Then you can use it with confidence, whoever wrote it. Lots of human code is crap too, and it needs to be removed as well. You can use GPT-3 to administer tests and read the results.
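To make the "run it and see if it passes" idea concrete, here is a rough sketch of a test-based filter. Everything in it is an assumption for illustration: the names `passes_tests` and `filter_corpus` are made up, each sample is assumed to be a (code, tests) pair of Python source strings, and running the file in a subprocess is not a real sandbox, so untrusted code would need proper isolation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(candidate_src: str, test_src: str, timeout_s: float = 10.0) -> bool:
    """Return True if the tests run to completion against the candidate code.

    Hypothetical helper: the candidate and its tests are concatenated into one
    script and run in a fresh Python process. A non-zero exit code (failed
    assert, syntax error, crash) or a timeout counts as a failure.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "check.py"
        script.write_text(candidate_src + "\n\n" + test_src + "\n", encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            # Code that never finishes is treated the same as code that fails.
            return False
        return result.returncode == 0


def filter_corpus(samples, timeout_s: float = 10.0):
    """Keep only the (code, tests) pairs whose code passes its own tests."""
    return [s for s in samples if passes_tests(s[0], s[1], timeout_s)]
```

The filter doesn't care whether a human, an AI, or both wrote the sample. It only checks behaviour, and it is only as good as the tests that come with each sample.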