Hacker News
by blackkettle 4185 days ago
i have started using Docker for this kind of stuff. you can build an isolated environment for your software and experiments, where you can absolutely guarantee that anyone who wants to can easily replicate your experiments, since they don't need to create the environment themselves - just pull the docker image for conference-paper# and run the scripts.

if the experimental data is proprietary, or you want to keep it separate, you can set a mount point for it in the lxc.
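
as a rough illustration of the pull-and-run workflow with a separate data mount, here is a minimal Python sketch that only builds the `docker run` command line. the image name, script name, and host path are hypothetical placeholders, not taken from any real paper; `--rm` and `-v host:container:ro` are standard `docker run` options.

```python
import shlex


def docker_run_command(image, script, host_data_dir=None, container_data_dir="/data"):
    """Build the argv for running one experiment script inside a container."""
    cmd = ["docker", "run", "--rm"]
    if host_data_dir is not None:
        # Bind-mount the data read-only so it stays outside the image.
        # Assumption: host_data_dir is an absolute path on the host.
        cmd += ["-v", f"{host_data_dir}:{container_data_dir}:ro"]
    cmd += [image, script]
    return cmd


if __name__ == "__main__":
    cmd = docker_run_command(
        "example/conference-paper:latest",  # hypothetical image name
        "./run_experiments.sh",             # hypothetical script inside the image
        host_data_dir="/srv/private-data",  # hypothetical host path
    )
    # Print a shell-quoted command instead of executing it.
    print(shlex.join(cmd))
```

keeping the data out of the image means the image can be shared publicly while the proprietary data stays on the host.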


"anyone who wants to can easily replicate your experiments"

Replicate the experiments, or just repeat the results?

How about, verify the published code (!) even produces the published results?
the stuff i work on is in the area of machine learning, so most published work involves one or more well-known data sets.

i would argue that the two are the same in this case.

the lxcs provide all the source code i write [plus of course the compiled version], all third-party libraries, and all scripts used to run and evaluate the experiments, and the data as well, where that is permitted.

it's still not perfect, but for my area, i honestly think it is the best, and most accountable way to do things that i have seen.

And hopefully one or more not-so-well-known, local data sets to check that the results are actually as claimed?
well, the idea is that you should be able to run any data set you have, and get good results relative to other solutions. but that is an open question with any research.

the point of the docker/lxc aspect is to provide a simple working environment to facilitate replication and validation.

so in comparison to the status quo, which is basically 'write a paper, include some high level equations, and results', i think this is a step forward in a better direction.

+1 for this. There is so much more to repeatability beyond "When I click run, does it give me the same number again?"
if it's an entirely computational experiment, which is not uncommon, then 'replicate the experiments' is correct.
I tend to worry that an error in the code will be baked into the theory for generations.

I don't deal with much scientific code myself, but at one point I dealt with a proof-of-concept cryptographic library from a reasonably well-respected researcher. The code behaved correctly from the outside, but when I dug into it, deviated wildly from the published specification.

Recent European economic policy was based on a paper that relied on an Excel formula error: http://theconversation.com/economists-an-excel-error-and-the...

It only lasted a few years, but I find the idea of existing long-lasting research founded on bad code to be a very real possibility.

A distressing number of runs on our HPC system simply aren't reproducible twice in a row anyway. They get repeated until, or in the hope that, they don't deadlock or segv, not that users typically believe in deadlock. They aren't debugged -- it's blamed on supposed system problems, not the code -- and it doesn't seem to worry the people publishing results from them. I doubt our users are unique.
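
A crude check for the "twice in a row" problem could look like the Python sketch below: run the same command several times and compare a hash of its output. This is only an illustration under simple assumptions (the result is whatever the job prints to stdout, and byte-identical output is the criterion); the function names are made up for the example. A crash surfaces as `subprocess.CalledProcessError` and a hang as `subprocess.TimeoutExpired`.

```python
import hashlib
import subprocess
import sys


def output_digest(argv, timeout=60):
    """Run argv once and return the SHA-256 hex digest of its stdout."""
    # check=True raises on a non-zero exit; timeout raises if the run hangs.
    result = subprocess.run(argv, capture_output=True, check=True, timeout=timeout)
    return hashlib.sha256(result.stdout).hexdigest()


def is_repeatable(argv, runs=2, timeout=60):
    """Return True if every run of argv produces byte-identical stdout."""
    digests = {output_digest(argv, timeout) for _ in range(runs)}
    return len(digests) == 1


if __name__ == "__main__":
    # Stand-in for a real experiment: a deterministic one-liner.
    demo = [sys.executable, "-c", "print(sum(range(10)))"]
    print(is_repeatable(demo))
```

Byte-identical output is a strict test; floating-point results from parallel jobs may legitimately differ in the last digits, so a real check would probably compare within a tolerance.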

Even for decent code, docker is being over-sold for this sort of thing. Serious large-scale calculations, in particular, simply aren't hardware-independent in practice. Consider a 1024-core PSM MPI job with Haswell-specific code or requiring some GPGPU, or a 128-core, 2TB SMP one; you can't run them just anywhere. Even if you can package and run in docker at another site, if you don't get the "right" results, what do you do about it if you don't have the source?

source code should also be included as a matter of course...

i don't think it is an oversell, in the sense that it is still unusual to include source code and experimental setups [at least in my field]. a replicable environment with included source code is a large step forward.

sad as that might be.

This reminds me of Philip Guo's work; maybe this one?

http://pgbovine.net/publications/CDE-create-portable-Linux-p...

also it hadn't occurred to me that this might be something interesting to even publish a paper about. so thanks for that too [assuming someone else hasn't already done this too..]

edit: well no surprise there i guess! http://www.nextflow.io/blog/2014/nextflow-meets-docker.html

cool, i had not heard of this. i just started using docker for work and came to the conclusion that it was epically well-suited to this purpose as well. i think docker might be even nicer, since there are no special tools required [but i'll definitely take a closer look at this work]
Guo's work is a bit old, docker is a very new thing.