It looks like it will work, although I haven't tested the exact code. Has anyone tested it? If it works, this really shouldn't be downvoted.
If SO users start downvoting bot-generated answers that are correct and working, I think that's a sign that SO is much less relevant. They should definitely downvote them if the code doesn't work, though.
There are perfectly valid reasons to downvote AI answers, no matter the content:
- Whoever submitted the answer very likely doesn't understand the question well enough to answer it themselves, so any feedback is not going to get a reasonable response.
- The amount of time needed to check whether an answer is correct is non-zero. If you could somehow know that the answer was written by a human, that would ensure that the answerer's effort was non-trivial, and this "proof of work" infuses the answer with a minimum amount of trust, which is absent in the case of a generated answer. Compare it to spam email: you wouldn't read every email in your spam box thoroughly to determine whether any of them contained a nugget of truth. You'd assume ill intent and treat the contents accordingly.
Both answers have been deleted, and https://stackoverflow.com/help/deleted-answers says that only mods or folks with over 10,000 reputation can view them. Offhand, I don't actually know whether deleted answers show up in the Stack Exchange content dumps, so that they could be viewed there.
If it's working code and indistinguishable from a human answer to anyone reading it, are there really any repercussions? I guess problems would surface if the model is at some point allowed to search the internet and starts inbreeding on its own answers.
The bot would need to learn as well as a "reasonable human" from being corrected on SO, and also be able to react in a socially appropriate way to correction (both in the thread in question and in future postings); otherwise it is a downgrade, even if the initial answer is identical.
My experience with OpenAI is that it is very good at exactly this, because it is so good at understanding context and follow-up questions. I was able to make it produce code that appeared correct but was basically pseudocode with correct syntax, so it compiles/runs but does essentially nothing. However, when prompted to actually make working code and explain how and why it works, it does so. And it's also socially appropriate: not rude, and whatever else you could/would expect when being called out or corrected on its bullshit. I can only imagine future versions of the current AI model will be even better at this.
I think your two final sentences capture the essence of what I was going to respond. "It doesn't matter if the answer is convincing-looking and wrong. It needs to work / be syntactically correct at a minimum, which OpenAI seems good at. However, the OP and others need to test and evaluate whether the proposed answer solves the original problem. And if it doesn't, it will quickly be revealed as such, and "downvoted" or whatever Stack Overflow functionality exists to indicate bad answers. This applies to both human and AI-generated answers."
Yeah, absolutely. My position on AGIs for a long long time has been that they're great tools for a) generating insights into a large amount of data very quickly, and b) generating new instances of <thing> from examples of <thing> to help with exploring the possibility space of <thing>; but that any output or conclusion they generate _must_ be checked by a Human In The Loop, or at the very least their actions must be reversible without damage in the case of error.