by camrobjones
924 days ago
I think the key thing to remember with the human benchmark is that it will change depending on interrogators' assumptions about human abilities. When models are bad, humans are easy to spot. But if the model's pretty good, it's harder to be sure you're talking to a human. On top of that, I think a lot of human users didn't want to get got by the model, so they had an a priori bias toward saying AI.