by fcantournet
435 days ago
"Participants had 5 minute conversations simultaneously with another human participant and one of these systems before judging which conversational partner they thought was human. When prompted to adopt a humanlike persona, GPT-4.5 was judged to be the human 73% of the time: significantly more often than interrogators selected the real human participant."

That's the opposite of a Turing test pass: it shows a very clear selection bias, which means the LLM is distinguishable from humans (at least in this test setting). If the test setting were one human talking to a chatbot and deciding yes/no on "human" after 5 minutes, then yeah, that would be a very impressive result. But in the test setting of this paper, where the interrogator has to pick one of two partners, surely a success would be a rate as close as possible to 50%, i.e. statistically impossible to separate humans from LLMs.
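To make the "as close as possible to 50%" point concrete, here is a rough sketch of an exact two-sided binomial test against chance. The sample size `n = 100` is a made-up number for illustration (I'm not quoting the paper's actual count), and `two_sided_binomial_p` is just a helper name I picked:

```python
from math import comb


def two_sided_binomial_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely
    than the observed count k under the null hypothesis rate p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so ties are not lost to float rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical number of verdicts; NOT taken from the paper.
n = 100

# 73 of 100 verdicts pick the LLM as "the human".
p_biased = two_sided_binomial_p(73, n)

# 50 of 100 verdicts pick the LLM: exactly chance.
p_chance = two_sided_binomial_p(50, n)

print(f"73/100 vs. 50% null: p = {p_biased:.2e}")
print(f"50/100 vs. 50% null: p = {p_chance:.2f}")
```

Under that assumed sample size, a 73% pick rate is rejected as chance just as firmly as a 27% rate would be. Only a rate near 50% fails to separate the LLM from the human, which is the outcome I'd call a pass in this two-alternative setup.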