| I foresee a dystopian education outcome: 1. Classifiers like this are used to flag possible AI-generated text 2. Non-technical users (teachers) treat this like a 100% certainty 3. Students pay the price. Especially with a true positive rate of only 26% and a false positive rate of 9%, this seems next to useless. |
This is the part that needs to be addressed the most. Teachers can't offload their critical reasoning to the computer. They should ask their students to write things in class and get a feeling for what those individual students are capable of. Then those who turn in essays written at 10x their normal writing level will be obvious, without the use of any automated cheat detectors.
I was once accused of cheating by a computer; my friend and I both turned in assignments that used do-while loops, which the computer thought was so statistically unlikely that we surely must have worked together on the assignment. But the explanation was straightforward; I had been evangelizing the aesthetic virtue of do-while loops to anybody who would listen to me, and my friend had been persuaded. Thankfully the professor understood this once he compared the two submissions himself and realized we didn't even use the do-while loop in the same part of the program. There was almost no similarity between the two submissions besides the statistically unlikely but completely innocuous use of do-while loops. It's a good thing my professor used common sense instead of blindly trusting the computer.
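For what it's worth, here is a back-of-the-envelope sketch of why the quoted 26% true positive rate and 9% false positive rate give a teacher so little to go on. The base rates (the share of essays that are actually AI-written) are purely my own assumptions, not published figures, and `flag_precision` is just a name I made up for the illustration.

```python
def flag_precision(tpr: float, fpr: float, base_rate: float) -> float:
    """Chance that a flagged essay really is AI-written, via Bayes' rule."""
    true_flags = tpr * base_rate            # AI-written and flagged
    false_flags = fpr * (1.0 - base_rate)   # human-written but flagged anyway
    return true_flags / (true_flags + false_flags)


TPR = 0.26  # true positive rate from the quoted comment
FPR = 0.09  # false positive rate from the quoted comment

# Assumed shares of essays that are AI-written (hypothetical values).
for base_rate in (0.05, 0.10, 0.50):
    precision = flag_precision(TPR, FPR, base_rate)
    print(f"base rate {base_rate:.0%}: flagged essay is AI-written "
          f"with probability {precision:.1%}")
```

If, say, 10% of essays were AI-written, only about 24% of flagged essays would really be AI-written under these numbers, so roughly three out of four accusations would land on students who did nothing wrong. That is exactly the situation where a human has to look at the work before believing the flag.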