Hacker News
by matusp 937 days ago
Another issue is that making things reproducible costs you time and that is exactly what most researchers do not have. For example, many ML papers have code that is just a barely working Jupyter notebook. To make it reproducible you would have to create a reproducible environment, package the data, and prepare scripts that would rerun all the experiments you have done. That can take several weeks, but it will not increase the chance of acceptance for your paper at all.

More precisely, making things reproducible after the fact costs you significant time - there are tools for reproducible setups that take maybe an hour (at most) to set up upfront, after which it takes very little effort to do your work within that framework and keep things reproducible (e.g. Julia has DrWatson, DataDeps, etc.; I'd be surprised if Python doesn't have equivalents).

The problem is knowing upfront which of your work would need to be reproducible, or having the discipline to do all your hacking starting from such reproducible setups.
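To make the "set it up upfront" idea concrete, here is a rough Python sketch of the kind of bookkeeping such tools automate. It is not DrWatson's API or any particular library's; `package_versions`, `snapshot_run`, and `save_snapshot` are hypothetical names, and it assumes the only randomness comes from the standard library `random` module.

```python
import hashlib
import json
import platform
import random
import sys
from importlib import metadata


def package_versions(names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def snapshot_run(params, seed, packages=()):
    """Seed the stdlib RNG and return a record describing this run."""
    random.seed(seed)
    # Stable ID: the same params and seed always hash to the same value.
    payload = json.dumps({"params": params, "seed": seed}, sort_keys=True)
    run_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return {
        "run_id": run_id,
        "seed": seed,
        "params": params,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": package_versions(packages),
    }


def save_snapshot(record, path):
    """Write the record as JSON next to the experiment's outputs."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(record, fh, indent=2, sort_keys=True)
```

Calling `snapshot_run` at the top of every notebook or script and saving the result alongside the outputs costs a few lines per experiment, which is the sense in which the upfront cost is small. It does not capture system libraries or data, so it is only one piece of the puzzle.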

But Julia and Python tools aren't enough. The whole environment has to be reproducible. So many Python libraries themselves take shortcuts which work on the current Ubuntu or current state of the web, but will fail to build later by the time someone tries to reproduce the result. Shipping a container just hides the implicit dependencies and assumptions. People need to be packaging for Guix en masse for reproducibility to be feasible. Until then, "reproducibility" is just another lie people are telling themselves and others to try to get ahead in their rat races.
So you say "Julia and Python tools aren't enough" but then proceed to only talk about Python and say a bunch of stuff that is completely inapplicable to Julia.

Do you know much about how reproducibility is approached in Julia? Maybe hold off on calling it a lie if you're not experienced in what you're talking about.

I have asked about Julia's reproducibility story on the Guix mailing list in the past, and at the time Simon Tournier didn't think it was promising. I seem to recall Julia itself didn't have a reproducible build. All I know now is that the GitHub issue is still not closed.

https://github.com/JuliaLang/julia/issues/34753

"Reproducible build" in this sense has nothing to do with scientific reproducibility. That issue is about hash-verifiability for the sake of security, and how some autogenerated random paths included in the binary affect that.

Scientific reproducibility requires only that versioned binaries be functionally equivalent if they have the same version, which is quite independent of this and certainly exists in Julia.

Would love a link to the Guix mailing list discussion, if you can dig it up.
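As an illustration of the "same version, same behaviour" notion of reproducibility described above, a minimal Python sketch might compare pinned versions against what is actually installed. `version_mismatches` is a hypothetical helper, not part of any package manager, and it checks version strings only, not bit-for-bit identical binaries.

```python
from importlib import metadata


def version_mismatches(pinned):
    """Return {name: (expected, found)} for packages that differ from the pins."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

An empty result means the environment matches the pins at the version level, which is the weaker guarantee being argued for here, as opposed to the hash-level guarantee the linked issue is about.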

I agree with your first sentence, but saying people are fooling themselves and being overoptimistic (by telling themselves lies) is very different from "calling it a lie" (i.e. intentionally deceiving others). That seems like an unnecessarily negative interpretation of what they said. Even if you disagree with it, that does not deserve such a harsh response.
Maybe the cause is that funding sources fund researchers who publish too often, and do not fund other researchers to double-check their work.
No. Several weeks is the time it takes to learn and master Docker.

About two hours is the cumulative time one must spend maintaining the Dockerfile over a three-week project.

But it requires institutions to insist on reproducibility and to foster best practices that make compliance even easier for researchers.

I get it that reproducibility can be quite hard for biology. But ML cannot be taken as an example of a hard problem.

I agree that Docker is great. But Docker solves only one of the problems mentioned above (the environment), and even that solution does not work for some teams that run their experiments on GPU clusters where Docker is not supported.
Perhaps the core issue is that academia is a textbook case of Goodhart's law. If reproducibility became a target, the academic system would likely make as bad a mess of it as it has of its current targets.