by jnxx 2131 days ago
Konrad Hinsen is an expert in molecular bioinformatics who has, for example, contributed significantly to Numerical Python and has published extensively on reproducible science and algorithms (see his blog).

The fact that he might favor different solutions from you does not mean that he is pushing some kind of hidden agenda.

If you think that Common Workflow Language is a better solution, you are free to explain why in a blog post.

Are you saying that the reproducibility challenge poses a difficulty to Common Workflow Language? If this is so, would that not rather support Hinsen's point, without implying that what he suggests is already a perfect solution?

2 comments

I never said that Konrad Hinsen's agenda was hidden; in fact, it's not at all hidden (which is why I linked the abstract). It's just that this context isn't at all clear in the Nature write-up, and it's relevant to take into account.

I haven't taken the time to seriously contemplate the merits of CWL vs Leibniz, although my gut instinct is that we don't really need another domain-specific language for science, given the profusion of such languages that already exist (Mathematica, Maple, R, MATLAB, etc.). That's the extent of my bias, but again, it's a gut instinct and not a comprehensive, well-reasoned argument against Leibniz.

I never answered your last question, so here goes:

> Are you saying that the reproducibility challenge poses a difficulty to Common Workflow Language?

I don't actually understand how the reproducibility challenge undermines the validity of using CWL / flow-based programming as an approach to promoting reproducible analyses. There certainly wasn't anything in the article that made me think that CWL was challenged, but Hinsen explicitly called out CWL in the abstract, which implies that for some reason he thinks, a priori, that it's a non-solution. He never justifies this implied assumption further, and as near as I can tell, none of the attempted replications used a flow-based language.

If Hinsen really aimed to argue against the viability of CWL/flow-based programming as an approach to reproducibility, he would have done a systematic comparison of historical analyses that used a flow-based system (like National Instruments' LabVIEW or Prograph) vs analyses that are more similar to the approach he seems to favor (i.e., analyses using Mathematica or Maple).

While I find the challenge interesting to follow, and the retrocomputing geek in me finds it fun, I don't actually understand what it accomplished beyond being a fun diversion. Assuming that an analysis was written in a Turing-complete language and didn't use non-deterministic algorithms, you should theoretically be able to reproduce the results exactly on modern hardware. With non-deterministic algorithms, I would imagine that a result would be "close enough", within some kind of confidence interval. You may need to go to great lengths (emulating instruction sets, ripping tapes, etc.), but I think a visit to any retrocomputing festival or computer history museum would have made that pretty obvious from the outset.
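
To make the deterministic vs. non-deterministic distinction concrete, here's a minimal sketch in Python. It isn't taken from the challenge or from any of the replicated papers. The Monte Carlo estimate of pi, the function names, and the seed value are all made up for illustration. With a fixed seed the run is repeatable on the same Python implementation. Without one, the result only lands within a statistical tolerance of the true value.

```python
import math
import random


def estimate_pi(n_samples, seed=None):
    """Monte Carlo estimate of pi (hypothetical example, not from the thread).

    With an integer seed the pseudo-random stream is fixed, so repeated
    calls return the same value. With seed=None the generator is seeded
    from OS entropy and each call gives a slightly different estimate.
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


def standard_error(n_samples):
    """Standard error of the estimator, from the binomial variance."""
    p = math.pi / 4.0  # probability that a point falls inside the quarter circle
    return 4.0 * math.sqrt(p * (1.0 - p) / n_samples)


if __name__ == "__main__":
    n = 100_000
    # Seeded: the two runs should match exactly.
    print(estimate_pi(n, seed=42) == estimate_pi(n, seed=42))
    # Unseeded: compare the error against a few standard errors instead.
    error = abs(estimate_pi(n) - math.pi)
    print(error, "vs 3 standard errors:", 3 * standard_error(n))
```

The seeded case is the "reproduce the results exactly" scenario. The unseeded case is the "close enough" one, where the sensible check is whether the difference falls inside an interval derived from the estimator's variance. The sketch deliberately ignores floating-point and compiler differences across platforms.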