Hacker News
by SomewhatLikely 1945 days ago
I managed to get 20/25 correct, and most of the wrong ones occurred in the first 10. Once I realized I had to look for more than simple logic errors and slowed way down to read carefully, my accuracy increased significantly. The textual parts hold the most clues: comments that didn't match the functionality, variable names that seemed like a soup of related terms, even one typo (separated became separata), and debug statements that didn't match the variables they should have been outputting. Beyond that, the overall cohesiveness was better in the real code. When I was a TA in school, I found that reading through correct code is usually much easier than reading through incorrect code, because with incorrect code you have to start accommodating surreal hypotheticals in your head that might make it all come together in the end. Disclaimer: I work on NLP projects and I'm quite familiar with the limitations and typical failure modes of transformer models.
1 comment

Yes, it's not hard to spot the fake code if you take some time; I scored 10/10 on the first try. The GPT-2 code definitely 'looks real', but it doesn't make any sense most of the time: it uses uninitialized variables, computes things but never uses them, has almost random comments that have no relation to the code, etc. The only snippets where I had to guess were those that just repeated many variations of the same thing, as if they were machine-generated from some other input data (which sometimes makes sense to do in a 'real' project).

What I found more worrying is how terrible some of the real code was :D