Hacker News
by izacus 530 days ago
I spent a chunk of my career productionizing code from ML/AI papers, and a huge part of them are outright not reproducible.

Mostly they lack critical information (chosen constants missing from equations, no information at all on input preparation, or missing chunks of "common knowledge algorithms"). Those that don't lack such information report measurements that the reimplemented algorithms outright failed to match, or quality that held up only on the author's handpicked, massaged dataset.

It's all worse than you can imagine.


That's different for papers that offer a truly new approach to modelling an existing problem, or that come up with a new problem. No set of a bit different results or missing exact hyperparameter settings really invalidates the value of the aforementioned research. If the math works and offers a nice new point of view, it's good. It may not even help anyone with practical applications right now, but it may inspire ideas further down the line that do make the work practicable, too.

In contrast, if the main value of a paper is a claim that it increases performance/accuracy in some task by x%, then its value can be completely dependent on whether it actually is reproducible.

Sounds like you are complaining about the latter type of work?

> No set of a bit different results or missing exact hyperparameter settings really invalidates the value of the aforementioned research.

If this is the case, the paper should not include a performance evaluation at all. If the paper needs a performance evaluation to prove its worth, we have every right to question the way that evaluation was conducted.

I don't think there's much value in theoretical approaches that lack important derivation data either, so there's no need to try to split the papers like this. Academic CS publishing is flooded with low-quality papers in any case.