by argonaut
3468 days ago
Code is only easy to replicate when the authors give you or publish the code. This is not true of many ML papers. In the words of the second author, a 3% accuracy difference on this particular dataset is a "huge difference." In fact, dismissing a 3% difference is actually reflective again of how delicate understanding ML results is. A jump from 90% to 93% accuracy is massively different from a jump from 50% to 53%, or even from 80% to 83%. Almost nobody writes tests for experiment code. You're proving my point :)
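One way to see why those three jumps are not equivalent is to look at the share of remaining errors each one removes. This is only a rough sketch: treating relative error reduction as the right lens is an assumption on my part, and `relative_error_reduction` is a hypothetical helper name, not something from the paper.

```python
def relative_error_reduction(acc_before, acc_after):
    """Fraction of the remaining errors removed by an accuracy gain.

    Assumption: accuracies are fractions in [0, 1) and error = 1 - accuracy.
    """
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before


# The three jumps mentioned above, each a 3-point absolute gain.
for before, after in [(0.90, 0.93), (0.80, 0.83), (0.50, 0.53)]:
    reduction = relative_error_reduction(before, after)
    print(f"{before:.0%} -> {after:.0%}: {reduction:.0%} of errors removed")
```

The same 3 points removes about 30% of the errors at 90%, about 15% at 80%, and about 6% at 50%.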
|
No. Graduate ML students can implement the papers they read w/o a reference implementation - just search GitHub. As I said, I implemented PV w/o the reference code. Many others did the same even before I did.
> dismissing a 3% difference is actually reflective again of how delicate understanding ML results
Not really. I understand results in ML very well (otherwise I would be a pretty incompetent graduate student). But does a 3% increase on, say, IMDb translate to an increase on another text classification task? Possibly, but usually not. If it does translate well across text classification datasets, you will almost certainly see the other datasets and their results in the paper.
> Almost nobody writes tests for experiment code. You're proving my point :)
It's a good point, but in my experience the kinds of mistakes I've usually found in my own or others' experimental code would not be possible to catch with a software test. They only become obvious when you analyze the results.