by roel_v 3908 days ago

"<snip> then it should be possible to program that model across platforms and software and still yield consistent results with different sources of random input data."

This hits too close to home for me not to comment on. I do basically exactly this - redevelop models into production-quality code for broader deployment. I do this for 'closed' models as well (code that researchers do not make available for download from a website, for whatever reason - mostly because they don't care, which is fine). Models being 'closed' this way does not make them 'black boxes' or 'not reproducible' - whatever the code does needs to be described in the paper(s) anyway (the concepts, not the implementation details).

The way to do a baseline verification of a model implementation is to build minimal synthetic data sets and run unit tests on them. Usually people develop their model against their full data set of 20000 observations or whatever, with 15-digit numbers etc. - on data like that, the only way to spot a mistake in the implementation is if the results are several orders of magnitude off.
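To make that concrete, here is a minimal sketch of what I mean. The `fit_line` function is a made-up stand-in for "the model" (a plain least-squares line fit, not anything from an actual paper), and the data set is three points chosen so the right answer can be worked out by hand:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Hypothetical stand-in for a model implementation under test.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def test_fit_line_on_exact_line():
    # Synthetic data: three points lying exactly on y = 1 + 2x,
    # so the expected coefficients are known without running anything.
    xs = [0.0, 1.0, 2.0]
    ys = [1.0, 3.0, 5.0]
    intercept, slope = fit_line(xs, ys)
    assert abs(intercept - 1.0) < 1e-12
    assert abs(slope - 2.0) < 1e-12


test_fit_line_on_exact_line()
```

With inputs this small, a wrong sign or a swapped argument shows up as an obviously wrong coefficient, instead of disappearing into the noise of a full data set.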

I once found a calculation in some Fortran code that mistook kilometers for meters (or the other way around, can't remember; either way, the result was that one component of the model was off by a factor of 1000). This hadn't been discovered in 10+ years, by many users, some of whom (much to my horror) actually used this model to advise on subsidies for certain sectors. Now, it's not that the results were completely unreasonable, because then someone would have noticed; it's the small mistakes that are the worst, especially when they are non-linear. Despite that and many examples like it that I have encountered, it proves to be nigh impossible to change the software development hygiene of most researchers.
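For illustration only, this is roughly the kind of test that would have caught a kilometers/meters mix-up. It is not the Fortran code in question; `transport_cost`, its parameters and the numbers are all invented for the example:

```python
METERS_PER_KM = 1000.0


def transport_cost(distance_m, rate_per_km):
    """Hypothetical model component: cost of covering distance_m meters."""
    distance_km = distance_m / METERS_PER_KM  # explicit unit conversion
    return rate_per_km * distance_km


def transport_cost_buggy(distance_m, rate_per_km):
    """Same component with the unit mix-up: meters treated as kilometers."""
    return rate_per_km * distance_m


def check(cost_fn):
    """True if cost_fn passes the hand-checkable case: 2 km at 3.0 per km = 6.0."""
    return abs(cost_fn(2000.0, 3.0) - 6.0) < 1e-9


assert check(transport_cost)
assert not check(transport_cost_buggy)
```

The point is that the test input is a round number whose answer you can compute in your head, so the factor of 1000 cannot hide the way it can inside a larger calculation.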