by palvaro 4185 days ago
> the problem is that academia rewards producing papers, not stable software libraries.

isn't this the good thing about academia? aren't you kids getting paid the big bucks to write and maintain the software libraries, while we work on novel problems for pennies?


I am an academic myself! Aside from that, it's actually a bad thing: poor software quality is incredibly harmful when trying to create reproducible research.

A few people are fighting this (Titus Brown, etc.) but it's mostly swimming against the tide of bad incentives.

i have started using Docker for this kind of stuff. you can build an isolated environment for your software and experiments, where you can absolutely guarantee that anyone who wants to can easily replicate your experiments, since they don't need to create the environment themselves - just pull the docker image for conference-paper# and run the scripts.

if the experimental data is proprietary, or you want to keep it separate, you can set a mount point for it in the lxc.
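
as a rough sketch of the "pull the image and run the scripts" step with a separate data mount: the image name, script name, and paths below are made up for illustration, and the only docker options used are `--rm` and a read-only `-v` bind mount. the helper just builds the command, so you can inspect it before running it.

```python
import shlex


def docker_run_command(image, script, data_dir=None, mount_point="/data"):
    """Build a `docker run` command list for re-running an experiment.

    All argument values are supplied by the caller; nothing here is
    specific to any real paper or image.
    """
    cmd = ["docker", "run", "--rm"]
    if data_dir is not None:
        # Bind-mount the host data directory read-only inside the container,
        # so proprietary data stays outside the image itself.
        cmd += ["-v", f"{data_dir}:{mount_point}:ro"]
    cmd += [image, script]
    return cmd


# Hypothetical image, script, and host path, for illustration only.
cmd = docker_run_command(
    "example/conference-paper:1.0",
    "./run_experiments.sh",
    data_dir="/home/me/datasets",
)
print(shlex.join(cmd))
```

passing the resulting list to `subprocess.run(cmd, check=True)` would actually launch the container, assuming docker is installed and the image exists.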

"anyone who wants to can easily replicate your experiments"

Replicate the experiments, or just repeat the results?

How about, verify the published code (!) even produces the published results?
the stuff i work on is in the area of machine learning, so most published work involves one or more well-known data sets.

i would argue that the two are the same in this case.

the lxcs provide all the source code i write [plus of course the compiled version], all third-party libraries, and all scripts used to run and evaluate the experiments, and the data as well, where that is permitted.

it's still not perfect, but for my area, i honestly think it is the best, and most accountable way to do things that i have seen.

And hopefully one or more not-so-well-known, local data sets to check that the results are actually as claimed?
well, the idea is that you should be able to run any data set you have, and get good results relative to other solutions. but that is an open question with any research.

the point of the docker/lxc aspect is to provide a simple working environment to facilitate replication and validation.

so in comparison to the status quo, which is basically 'write a paper, include some high level equations, and results', i think this is a step forward in a better direction.

+1 for this. There is so much more to repeatability beyond "When I click run, does it give me the same number again?"
if it's an entirely computational experiment, which is not uncommon, then 'replicate the experiments' is correct.
I tend to worry that an error in the code will be baked into the theory for generations.

I don't deal with much scientific code myself, but at one point I dealt with a proof-of-concept cryptographic library from a reasonably well-respected researcher. The code behaved correctly from the outside, but when I dug into it, deviated wildly from the published specification.

Recent European economic policy was based on a paper that relied on an Excel formula error: http://theconversation.com/economists-an-excel-error-and-the...

It only lasted a few years, but I find the idea of existing long-lasting research being founded on bad code to be a very real possibility.

A distressing number of runs on our HPC system simply aren't reproducible twice in a row anyway. They get repeated until, or in the hope that, they don't deadlock or segv, not that users typically believe in deadlock. They aren't debugged -- it's blamed on supposed system problems, not the code -- and it doesn't seem to worry the people publishing results from them. I doubt our users are unique.

Even for decent code, docker is being over-sold for this sort of thing. Serious large-scale calculations, in particular, simply aren't hardware-independent in practice. Consider a 1024-core PSM MPI job with Haswell-specific code or requiring some GPGPU, or a 128-core, 2TB SMP one; you can't run them just anywhere. Even if you can package and run in docker at another site, if you don't get the "right" results, what do you do about it if you don't have source?

source code should also be included as a matter of course...

i don't think it is an oversell, in the sense that it is still unusual to include source code and experimental setups [at least in my field]. a replicable environment with included source code is a large step forward.

sad as that might be.

This reminds me of Philip Guo's work; maybe this one?

http://pgbovine.net/publications/CDE-create-portable-Linux-p...

also it hadn't occurred to me that this might be something interesting to even publish a paper about. so thanks for that too [assuming someone else hasn't already done this too..]

edit: well no surprise there i guess! http://www.nextflow.io/blog/2014/nextflow-meets-docker.html

cool, i had not heard of this. i just started using docker for work and came to the conclusion that it was epically well-suited to this purpose as well. i think docker might be even nicer, since there are no special tools required [but i'll definitely take a closer look at this work]
Guo's work is a bit old, docker is a very new thing.
I agree that this is a serious issue in our community, but I am not sure I agree that stable libraries <=> reproducible experiments.
If those experiments involve automated data collection or computer models, then stable data collection or modeling libraries would be kind of important for reproducing them.
Try to reproduce the analysis published in a paper when all you have is a matlab script with one letter variable names and zero comments :)
If you're running someone else's code, imo that's not reproduction in the first place, just like re-running an experiment using the original experimenter's preparations and lab apparatus is not what's usually meant by "reproducing" an experiment. Too much undocumented stuff can creep in if you don't independently reproduce, with independent apparatus, preparations of samples, etc. (I don't think having someone's code is useless, and it can be especially useful for elaborating on the original experiment, but I would purposely avoid looking at it if I were aiming for an independent reproduction.)
You are always running someone else's code. It starts the moment you boot up your machine.
Not if you bootstrap from the silicon up.
If you even have access to the source code, detailed algorithm, or even a matlab script. It's either a citation or a plain old equation.

Often times, and especially from what I've seen in the computer vision papers, the authors merely state what algorithm they used, and how they combined it with their novel method. And that algorithm is in another paper, by the way, probably by the same author. Definitely not the implementation you're working with, too, if you have it.

It's almost as if they need a combined repository. And each paper that presents a novel algorithm, or implementation of an existing one, is a "changeset" or "branch". And the citations to algorithms used in a paper would be changeset hashes, or branch names. Hey, it's the first thing that popped into mind for me to solve this horrendous problem.
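
To make the "cite by hash" idea concrete, here is a minimal sketch that fingerprints a directory of source code, so a paper could cite the exact implementation it used. This is not how any version control system computes its changeset hashes; a plain SHA-256 over relative paths and file contents is swapped in for illustration, and the function name `tree_digest` is made up.

```python
import hashlib
import os


def tree_digest(root):
    """Return a SHA-256 hex digest over every file under `root`.

    The digest covers each file's relative path, length, and contents,
    visited in sorted order so the result does not depend on the order
    in which the filesystem lists entries.
    """
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place so os.walk descends deterministically
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            with open(path, "rb") as f:
                data = f.read()
            # Separate the fields so different path/content splits cannot collide.
            h.update(rel.encode("utf-8") + b"\0")
            h.update(str(len(data)).encode("ascii") + b"\0")
            h.update(data)
    return h.hexdigest()
```

Two people holding the same tree get the same digest, and any edit to any file changes it, which is the property a citation would need. In practice, an existing version control system's commit identifiers would serve the same purpose.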

I certainly agree with this. The computer vision field is awash with papers proposing a 'new' algorithm which is then poorly compared to some select group of existing techniques under criteria chosen by the author. A paper is a very poor substitute for the code itself, and really it should be mandatory for code to be submitted with the paper, especially in a field such as computer vision where the entire experimental apparatus could be packed into a zip file. That way any other group could take the code and independently evaluate the technique without reimplementation. Indeed, my own experience is that often the maths described in the paper is not necessarily responsible for all the results! As you say, this could even become the start of collaborative improvement.

Unfortunately my experience is that too many academic groups believe that their source code is the route to untold riches.

Better than nothing. (Been there, done that).
I agree with you here, but I think 'stable libraries' is perhaps a good target for a few reasons. Right now the culture isn't just bad code, it's "There is no advantage or benefit to showing your code". I would say a difference between computer scientists and programmers is that frequently the work isn't just the code, but still, nurturing something like an open-source scientific community would accelerate a lot of learning.
The bad thing is that academia is badly paid, and so are the support staff, which is why I left a world-class R&D organisation to work in commercial software.

have you read Cryptonomicon? look at how Randy's first job at a university is described.

Prestige should accumulate to anyone who does good work, of whatever kind. Stable software libraries can help science as much as producing papers (if nothing else, because of their effect on future production of papers!).