by jacques_chester 4768 days ago
I've seen cheating-detection programs that use tree diff scores to compare code from different students.

The theory goes that students who share solutions will probably change the variable names, do some reformatting, etc., which would fool a text diff. But the actual structure of the AST will be the same or similar.

So if you find close matches, you inspect them more closely.
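
To make that concrete, here's a rough sketch in Python. It is not what any of those programs actually do: rather than a real tree diff, it flattens each AST into a sequence of node-type names (dropping identifiers and literal values) and compares the sequences with difflib, which is a much cruder stand-in. The helper names `structure` and `similarity` and the sample snippets are made up for illustration.

```python
import ast
from difflib import SequenceMatcher


def structure(source: str) -> list:
    """Return the AST node-type names in traversal order.

    Identifiers, literal values, comments and formatting are all
    discarded, so only the shape of the program is left.
    """
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def similarity(a: str, b: str) -> float:
    """Score two programs between 0.0 and 1.0 by structural overlap.

    Assumption: a sequence ratio is used as a cheap approximation of
    a tree diff score; a real tool would use a tree edit distance.
    """
    return SequenceMatcher(None, structure(a), structure(b), autojunk=False).ratio()


original = """
def total(values):
    result = 0
    for v in values:
        result += v
    return result
"""

# Same solution with renamed variables, a comment and different spacing.
renamed = """
def add_up(xs):
    acc = 0

    for item in xs:
        acc += item  # sum everything
    return acc
"""

# A genuinely different solution to the same problem.
different = """
def total(values):
    return sum(values)
"""

print(similarity(original, renamed))
print(similarity(original, different))
```

The renamed copy scores as a perfect structural match even though a text diff would flag nearly every line, while the independent solution scores lower. Pairs above some threshold are the ones you'd then read by hand.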

Some quick Googling reveals that plagiarism detection using tree comparisons is a common idea.

1 comment

Yup. When I thought "this looks neat!" I didn't think it was very novel; the idea is very clear once you are actually looking at cheated code. (IIRC my brother-in-law told me a few years ago, after I had written my code, which predates my meeting him, that he had used a similar method years earlier at another university.) But I didn't bother much with looking for existing solutions: this was just so fun! I love reinventing wheels for fun/learning :)