by ehsankia
2432 days ago
As long as they're not training on the test data, and they're not making hundreds of submissions while tweaking parameters to improve their score, I don't see what the problem is. If the algorithm does a great job classifying hundreds of new test cases it has never seen, and it isn't overfitted, then it is good at that specific task. Of course, the task itself may or may not be useful, and you can have a meta-discussion about what "understanding language" is, but the computer is definitely doing a superhuman job at that given task.
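
To make the "hundreds of submissions" caveat concrete, here is a rough sketch of why it matters. Everything in it is an assumption for illustration: a binary task, a fixed test set of 200 cases, and "submissions" that are pure coin flips with no skill at all. The function name `best_of_n_accuracy` is made up for this example.

```python
import random


def best_of_n_accuracy(n_submissions, n_test, rng):
    """Best test accuracy among n_submissions coin-flip classifiers.

    Hypothetical setup: binary labels, one fixed test set, and every
    submission guesses at random, so its true skill is exactly 50%.
    """
    labels = [rng.randint(0, 1) for _ in range(n_test)]
    best = 0.0
    for _ in range(n_submissions):
        guesses = [rng.randint(0, 1) for _ in range(n_test)]
        correct = sum(g == y for g, y in zip(guesses, labels))
        best = max(best, correct / n_test)
    return best


rng = random.Random(0)  # fixed seed so the run is repeatable
single = best_of_n_accuracy(1, 200, rng)
many = best_of_n_accuracy(500, 200, rng)
print(f"one submission:       {single:.3f}")
print(f"best of 500 attempts: {many:.3f}")
```

The best of many attempts should land noticeably above 50% even though no submission has any real skill, which is roughly what repeated tweaking against a fixed test set buys you. A single submission scored on cases it has never seen doesn't get that free boost.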
These rankings, if real, should be in constant flux.