Hacker News
by hanselot 954 days ago
Though this is exactly what happened. The initial test was run on a model that "cheated" (i.e., had memorized the answers). The second test was run on a model that didn't "cheat" as much, yet it still scored only 2% lower. So the question isn't really resolved. How much did the first model cheat, and how much did the second? If the second model "cheats" less, then it wins.

Also, I don't understand your obsession with the word cheating. If you have solved a problem before on a different website and solve it again, did you cheat? Or did you just use your brain to store the solution for later?

2 comments

> Also, I don't understand your obsession with the word cheating.

It's all about the rule set, yeah. Since the rule set is not defined, technically nothing is cheating. I just interpret the rule set as "can it code?", and under that rule set it looks like cheating to me.

> How much did the first model cheat, and how much did the second? If the second model "cheats" less, then it wins.

They both cheated 100%. Because neither of them ever saw the problem. AT ALL. They just saw the title and the name of the website.