by jmalicki
75 days ago
When people judge blindly, they are more likely to think the human is the AI and the AI is the human. 73% judged GPT 4.5 (edit: had incorrectly said 4o before) to be the human. https://arxiv.org/abs/2503.23674 Not only are people bad at judging this, but they are directionally wrong.
> Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such “expert” annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated even in the presence of evasion tactics like paraphrasing and humanization.
https://arxiv.org/html/2501.15654v2