by bun-neh 4611 days ago
From what I read, I don't believe it is. In my opinion it should also never be used for this purpose.

While you could probably catch a lot of cheaters this way, there is a real possibility of a high false-positive rate. If that is the case, I would especially advise against deploying this type of software at a traditional university, since academic dishonesty policies can often cause significant and undue harm to an innocent student.


Good comment, but I wouldn't say never.

As a programming instructor at the university level, I like to think that I have enough sense to know that, particularly for "trivial" assignments, some similarity is expected. However, as I have seen firsthand, a great deal of similarity over multiple assignments (and exams) between two students of the same nationality who sit together in class provides additional evidence of plagiarism.

So, yes, I agree a single data point of similarity is insufficient, but a history of similarity, particularly in complex projects, becomes more damning.

I got flagged as a freshman for "55% similarity" (whatever that meant) to another student's submission in a "learn how to write shit in C++" type assignment. As far as I could tell, the only thing that triggered the software was the fact that both the other kid and I used do-while loops, while nobody else in the course did. The rest of the programs were semi-similar, just a few lines of cout/cin/<</>>/... to ask your name and echo it back.

So basically what I'm saying here is that the idea that some similarity is expected for "trivial" assignments isn't always widely understood, to the detriment of students.

I think these sorts of systems become most valuable when used to check submissions against work from previous years, to bust frat-house collections of answers, but varying the questions from year to year probably helps even more in that regard. Similarity between complex projects in the class sizes that were typical at my university (in classes advanced enough to have complex answers) was pretty easy to spot manually. Maybe edit-distance software is useful there to put some weight behind accusations?
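
For what it's worth, here is a minimal sketch of what an edit-distance check between two submissions could look like. This is not how any particular plagiarism tool works; the function names, the whitespace tokenization, and the normalization into a 0-to-1 score are all my own assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x from a
                curr[j - 1] + 1,     # insert y into a
                prev[j - 1] + cost,  # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]


def similarity(src_a, src_b):
    """Hypothetical score in [0, 1]: 1 minus the token edit distance over the longer token count."""
    tokens_a, tokens_b = src_a.split(), src_b.split()  # assumption: naive whitespace tokens
    longest = max(len(tokens_a), len(tokens_b))
    if longest == 0:
        return 1.0  # two empty submissions are treated as identical
    return 1.0 - edit_distance(tokens_a, tokens_b) / longest
```

On a short "ask your name and echo it back" program, most tokens are forced by the assignment itself, so a score like this would come out high for honest students too. That is why a number from a tool like this seems more meaningful on complex projects than on trivial ones.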