| I have been in discussions about this with one of my friends working in academic materials research. It's amazing how much work today is done by scientists at universities writing code without very basic software development tools. I'm talking opening their code in Notepad, 'versioning' files by sending around zip files with numbers manually added to the end of the file name, etc. This doesn't even begin to scratch the surface of the 'reproducible results' problem. Oftentimes, the software I've seen is 'rough', to be kind. Most times it's not even possible to get the software running (a missing, really specific library, or changes to a dependency which haven't been distributed), or it's built for a super specific environment and makes huge assumptions about the system it runs on. This same software produces results which end up being published in journals. If any of these places had money to spend, I think there could be a valuable business in teaching science types how to better manage their software. It's really unfortunate that outside of a few core libraries (numpy, etc.) the default method is for each researcher to rebuild the components they need. I'm surprised that only 11% of results are reproducible. It seems lower than I'd expect. I agree we don't want to optimize for reproducibility, but obviously there is some problem here that needs to be addressed. |
I agree 100%. I recently quit my PhD, so I still know a lot of people on the front lines of science. One of these friends recently asked me to help them with a coding issue, so they gave me an ssh login to their group's server. I log in and start reading the source.
It was all Fortran, with comments throughout like "C A major bug was present in all versions of this program dated prior to 1993." What bug, and of what significance for past results? Unknowable. As far as I can tell from the comments, the software has been hacked on intermittently by various people of varying skill since at least 1985, without anyone ever using source control or even starting a basic CHANGELOG describing the program's evolution. The README is a copy/paste of some old emails about the project. There are no tests.
So even though computer modeling projects should, in theory, be highly reproducible... it often seems like researchers are not taking the necessary steps to know what state their codebase was in at the time certain results were obtained.
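To make that last point concrete, here is a minimal sketch of the least a group could do without adopting source control at all: fingerprint the source tree whenever results are written, and keep that fingerprint next to the output. Everything here is hypothetical and my own invention (the function names, the file patterns, the manifest layout); nothing like it existed on the server I looked at. Real version control would be the better fix, and this is only a stopgap.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def fingerprint_sources(source_dir, patterns=("*.f", "*.f90", "*.py")):
    """Return one SHA-256 digest covering the names and contents of all matching source files.

    The patterns are an assumption; adjust them to whatever the codebase actually contains.
    """
    root = Path(source_dir)
    # Collect matches into a set to avoid duplicates, then sort so the digest is stable.
    files = sorted({p for pattern in patterns for p in root.rglob(pattern) if p.is_file()})
    digest = hashlib.sha256()
    for path in files:
        # Hash the relative path as well, so that renaming a file changes the fingerprint.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def write_manifest(source_dir, results_path, manifest_path):
    """Write a small JSON record tying a results file to the code state that produced it."""
    manifest = {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "source_sha256": fingerprint_sources(source_dir),
        "results_file": str(results_path),
        "results_sha256": hashlib.sha256(Path(results_path).read_bytes()).hexdigest(),
    }
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
    return manifest
```

Calling `write_manifest("src", "run42/output.dat", "run42/manifest.json")` at the end of a run (paths made up for illustration) would at least let someone check later whether the code on disk still matches what produced a given result. It says nothing about compilers, libraries, or input data, which would need recording too.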