> These programs tend to be both so sloppily written and so central to the results that it’s contributed to a replication crisis, or put another way, a failure of the paper to perform its most basic task: to report what you’ve actually discovered, clearly enough that someone else can discover it for themselves.

This is the crux of the problem IMHO, at least for the fields I study (AI/ML). Replicating the results in the papers I read is way harder than it needs to be. For these fields it should just be a matter of firing up a Jupyter notebook and downloading the actual dataset they used (which is much harder to get your hands on than it seems). Very few papers actually link to all of this in a final, polished form that is #1 understandable and #2 repeatable.

Honestly, if I had to choose, I'd much rather have the actual code and data you used to get your results than read through the research paper (assuming the paper is not pure theory). Instead there is a disproportionate focus on paper quality over "project quality".

I don't really know what the solution is, since apparently most academics have been perfectly fine with the status quo. I feel like we could build a much better system if we redefined our goals, since I don't think the current system is optimal for disseminating knowledge, for finding and fixing mistakes in research, or even for working in a fast, iterative process.
It's like best practices for computer security -- always strive to minimize the attack surface. :) Without source code there is much less stuff to criticize!