|
I will not answer to everything because what's the point. Yes, sharing the code can be one way to find bugs, I've said that already. Yes, sharing the code can help bootstrap another team, I've said that already. What people don't realize is that reproducing from scratch the algorithm is also very very efficient. First, it's arguably a very good way to find bugs: if the other team does not have the exact same number as you, you can pinpoint exactly where you have diverged. When you find the reason, in the large majority of the case, it totally passed through several code reviewer. Reading a code thinking "does it make sense" is not an easy way to find bug, because bugs are usually in place where the code of the original author looked good when read. And secondly, there is a contradiction in saying "people will study the code intensively" and "people will go faster because they don't have to write the code". > Remember the title is "one of my papers got declined today" Have you even read what Tao says? He explains that he himself have rejected papers and has probably generated similar apparently paradoxical situations. His point is NOT that there is a problem with paper publication, it is that paper rejection is not such a big deal. For the rest, you keep mixing up "peer review", "code sharing", "replication crisis", ... and because of that, your logic just make 0 sense.
I say "bad paper that turns out to have errors (involuntary or not) are anecdotal" and you answer "11% of the biomedical publication have replication problem". Then when I ask you to give example where the replication crisis was avoided by sharing the data, you talk about bad papers that turns out to have errors (involuntary or not). And, yes, I used CERN as an example because 1) I know it well, 2) if what you say is correct, how on hell CERN is not bursting with fire right now? You are pretending that sharing code or sharing data is a good idea and part of good practice. If it is true, how do you explain that CERN forbid it and still is able to generate really good papers. According to you, CERN would even be an exception where replication crisis, bad paper and peer-review problem is almost existent (and therefore I got the wrong idea). But if it is the case, how do you explain that: despite not doing what you pretend will help avoiding those, CERN does BETTER?! But by the way, at uni, I became very good friend with a lot of people. Some of them scientists in other discipline. We regularly have this kind of discussion because it is interesting to compare our different world. The funny part is that I did not really think of how sharing the code or the data is not such a big deal after (it still can be good, but it's not "the good practice"), I realise it because another person, a chemist, mentioned it. |
This is where we differ. Especially if the author shares neither the data or the code, because you can never truly be sure it's a software bug or a data anomaly or a bad method or outright fraud. So you can end up burning tremendous amounts of time investigating all those avenues. That statement (as well as others about how trivial replication is) makes me think you don't actually try to replicate anything yourself.
>there is a contradiction in saying "people will study the code intensively" and "people will go faster because they don't have to write the code".
I never said "people will go faster" because they don't have to write the code. Maybe you're confusing me with another poster. You were the one who said sharing code is worthless because people can "click on the button and you get the same result". My point, and maybe this is where we differ, is that for the ultimate goal is not to create the exact same results. The goal I'm after is to apply the methodology to something else useful. That's why we share the work. When it doesn't seem to work, I want to go back to the original work to figure out why. The way you talk about the publication process tells me you don't do very much of this. Maybe that's because of your work at CERN is limited in that regard, but when I read interesting research I want to apply it to different data that are relevant to the problems I'm trying to solve. This is the norm outside of those who aren't studying the replication crisis directly.
>I say "bad paper that turns out to have errors (involuntary or not) are anecdotal"
My answer was not conflating peer-review and code sharing and replication (although I do think they are related). My answer was to give you researchers who work in this area because their work shows it is far from anecdotal. My guess is you didn't bother to look it up because you've already made up your mind and can't be bothered.
>I ask you to give example where the replication crisis was avoided by sharing the data, you talk about bad papers that turns out to have errors
Because it's a bad question. A study that is replicated using the same data is "avoiding the replication crisis". Did you really want me to list studies that have been replicated? Go on Kaggle or Figshare or Genbank if you want example of datasets that have been used (and replicated), like CORD-19 or NIH-dbGaP or World Values Survey or any host of other datasets. You can find plenty of published studies that use that data and try to replicate them yourself.
>how on hell CERN is not bursting with fire
The referenced authors talk about how physics is generally the most replicable. This is largely because they have the most controlled experimental setups. Other domains that do much worse in terms of replicability are hampered by messier systems, ethical considerations, etc. that limit the scientific process. In the larger scheme of things, physics is more of an anomaly and not a good basis to extrapolate to the state of affairs for science as a whole. I tend to think you being in a bubble there has caused you to over-extrapolate and have too strong of a conclusion. (You should also review the HN guidelines that urge commenters to avoid using caps for emphasis)
>"sharing the code...but it's not "the good practice""
I'm not sure if you think sharing a single unsourced quip is convincing but, your anecdotal discussion aside, lots of people disagree with you and your chemist friend. Enough so that it's become a more and more common practice (and even requirement in some journals) to share data and code. Maybe that's changed since your time at uni, and probably for the better.