by izacus 533 days ago
What you're deliberately ignoring is that omitting important information is material to a lot of papers because the methodology was massaged into desired results to create publishable content.

It's really strange seeing how many (academic) people will talk themselves into bizarre explanations for a simple phenomenon of widespread results hacking to generate required impact numbers. Occam's razor and all that.


If it is massaged into desired results, then it will be invalidated by facts quite easily. Conversely, obfuscating things is also easy if you just provide the whole package and say "see, you click on the button and you get the same result, you have proven that it is correct". Not providing code means that people will redo their own implementation and come back to you when they see they don't get the same results.

So, no, no need to invent that academics are all part of this strange crazy evil group. Academics are debating and are being skeptical of their colleagues' results all the time, which already contradicts your idea that the majority is motivated by fraud.

Occam's razor is simply that there are some good reasons why code is not shared, ranging from laziness to lack of expertise in code design to the fact that code sharing is just not that important (or sometimes plainly bad) for reproducibility. No need to invent that the main reason is fraud.

Ok, that's a bit naive now. The whole "replication crisis" is exactly the term for bad papers not being invalidated "easily". [1]

Because - if you'd been in academia - you'd find out that replicating papers isn't something that will allow you to keep your funding, your job and your path to the next title.

And I'm not sure why you jumped to "crazy evil group" - no one is evil, everyone is following their incentives and trying to keep their jobs and secure funding. The incentives are perverse. This willful blindness to perverse incentives (which appears both in US academia and the corporate world) is a repeated source of confusion for me - is the idea that people aren't always perfectly honest when protecting their jobs, career success and reputation really so foreign to you?

[1]:https://en.wikipedia.org/wiki/Replication_crisis

That's my point: people here link the replication crisis to "not sharing the code", which is ridiculous. If you just click on a button to run the code written by the other team, you haven't replicated anything. If you review the code, you have replicated "a little bit", but it is still not as good as if you had recreated the algorithm from scratch independently.

It's very strange to pretend that sharing the code will help the replication crisis, when the replication crisis is about INDEPENDENT REPLICATION, where the experiment is redone in an independent way. Sometimes even with a totally orthogonal setup. The closer the setup, the weaker the replication.

It feels like watching the finger that points at the moon: not understanding that replication does not mean "re-running the experiment and reaching the same numbers".

> noone is evil, everyone is following their incentives and trying to keep their jobs and secure funding

Sharing the code has nothing to do with the incentives. I will not lose my funding if I share the code. What you are adding on top of that is that the scientist is dishonest and does not share because they have cheated in order to get the funding. But this is the part that does not make sense: unless they are already established enough to have the aura to be believed without proof, they will lose their funding, because the funding comes from a peer committee that will notice that the facts don't match the conclusions.

I'm sure there are people who downplay fraud in the scientific domain. But pretending that fraud is a good strategy for someone's career, and that this is why people commit fraud so massively that sharing the code is rare, is just ignorance of the reality.

I'm sure some people commit fraud and don't want to share their code. But how do you explain why so many scientists don't share their code? Is that because the whole community is so riddled with cheaters? Including cheaters who happen to present conclusions that keep being proven correct when reproduced? Because yes, there are experiments that have been reproduced and confirmed and yet the code, at the time, was not shared. How do you explain that if the main reason to not share the code is to hide cheating?

I've spent plenty of time in my career doing exactly the type of replication you're talking about, and easily the majority of CS papers weren't replicable with the methodology written down in the paper and on a dataset that wasn't optimized and preselected by the paper's author.

I didn't care about shared code (it's not common), but about independent implementation of ML and AI algorithms for the purpose of independent comparison. So I'm not sure why you're getting so hung up on the code part: the majority of papers were describing trash science even in their text, in an effort to get published and show results.

I'm sorry that the area you are working in is rotten and does not meet the minimum scientific standard. But please, do not reach conclusions that are blatantly incorrect about areas you don't know.

The problem is not really "academia", it is that, in your area, the academic community is particularly poor. The problem is not really the "replication crisis", it is that, in your area, even before we reach the concept of replication crisis, the work is not even reaching the basic scientific standard.

Oh, I guess it is Occam's razor after all: "It's really strange seeing how many (academic) people will talk themselves into bizarre explanations for a simple phenomenon of widespread results hacking to generate required impact numbers". Occam's razor explanation: so many (academic) people will not talk about the malpractice because so many (academic) people work in an area where such malpractice is exceptional.

But what’s the point of the peer review process if it’s not sifting out poor academic work?

It reads as if your point is talking in circles. “Don’t blame academia when academia doesn’t police itself” is not a strong stance when they are portrayed as doing exactly that. Or, maybe more generously, you have a different definition of academia and its role.

I think sharing code can help because it’s part of the method. It wouldn’t be reasonable to omit aspects of a paper’s methodology under the guise that replicators should devise their own independent method. Explicitly sharing methods is the whole point of publication, and sharing them is necessary for evaluating a paper’s soundness, generalizability, and limitations. izacus is right: a big part of the replication crisis is that there aren’t nearly as many incentives to replicate work, and omitting parts of the method makes this worse, not better.