by franga2000 530 days ago
I don't think anyone is saying it's not reproducible without code; it's just much more difficult for absolutely no reason. If I can run the code of an ML paper, I can quickly check if the examples were cherry-picked, swap in my own test or training set... The new technique or idea was still the main contribution, but I can test it immediately, apply it to new problems, optimise the performance to enable new use-cases...

It's like a chemistry paper for a new material (think the recent semiconductor thing) not including the amounts used and the way the glassware was set up. You can probably get it to work in a few attempts, but then the result doesn't have the same properties as described, so now you're not sure if your process was wrong or if their results were.
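As a rough illustration of the cherry-picking check mentioned above, here is a minimal sketch. It assumes the paper's released code can be wrapped in a single function that trains and scores the model for a given random seed. The `evaluate` function below is a hypothetical stand-in that returns a simulated score, and the numbers are made up for the example.

```python
import random
import statistics


def evaluate(seed: int) -> float:
    """Hypothetical stand-in for 'train and score the paper's model with this seed'.

    A real check would call the released code here; this just simulates
    an accuracy around 0.80 with some seed-to-seed noise.
    """
    rng = random.Random(seed)
    return 0.80 + rng.gauss(0, 0.02)


def cherry_pick_check(reported: float, seeds):
    """Compare a reported score against the spread over several seeds.

    Returns (mean, stdev, z), where z is how many standard deviations
    the reported score sits above the mean of our own runs.
    """
    scores = [evaluate(s) for s in seeds]
    mean = statistics.mean(scores)
    stdev = statistics.stdev(scores)
    z = (reported - mean) / stdev
    return mean, stdev, z


if __name__ == "__main__":
    mean, stdev, z = cherry_pick_check(reported=0.90, seeds=range(20))
    print(f"mean={mean:.3f} stdev={stdev:.3f} reported is {z:.1f} sigma above mean")
```

A reported number far above the spread of your own runs doesn't prove anything by itself, but it tells you where to start asking questions.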


More code should be released, but code depends on the people and the environment that run it. When I release buggy code, I will almost always have to spend time supporting others in running it. That is not what you want to be doing for a proof of concept that is only meant to prove an idea.

I am not published, but I have implemented a number of papers in code, and it works fine (hashing, protocols and search mostly). I have also used code dumps to test something directly. I think I spend less time on code dumps, and if I fail I give up more easily. That is the danger: you start blaming the tools instead of questioning how well you have understood the ideas.

I agree with you that more code should be released. It is not a solution for good science though.

Sharing the code may also share the incorrect implementation biases.

It's a bit like saying that to help reproduce the experiment, the experimental tools used to reach the conclusion should be shared too. But reproducing the experiment does not mean "having a different finger clicking on exactly the same button", it means "redoing the experiment from scratch, ideally with a _different experimental setup_ so that it mitigates the unknown systematic biases of the original setup".

I'm not saying that sharing code is always bad; you give examples of how it can be useful. But sharing code has pros and cons, and I'm surprised how often people don't understand that.

If they don't publish the experimental setup, another person could use the exact same setup anyway without knowing. Better to publish the details so people can actually think of independent ways to verify the result.

But they will not make the same mistakes. If you ask two people to build a piece of software, they can use the same logic and build the same algorithm, but what are the chances they will introduce exactly the same bugs?
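To make the "two people, different bugs" point concrete, here is a small sketch of comparing two independent implementations of the same algorithm on random inputs. Everything in it is illustrative: the median is just a convenient example, `median_b` contains a deliberately planted bug, and `find_disagreements` is a hypothetical helper name.

```python
import random


def median_a(xs):
    """First implementation: sort, then average the middle pair if needed."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2


def median_b(xs):
    """Second implementation, with a deliberate bug for the example:
    it forgets to average the two middle values on even-length input."""
    s = sorted(xs)
    return s[len(s) // 2]


def find_disagreements(f, g, trials=200, seed=0):
    """Run both implementations on random lists and collect inputs
    where their results differ."""
    rng = random.Random(seed)
    bad = []
    for _ in range(trials):
        xs = [rng.randint(0, 100) for _ in range(rng.randint(1, 8))]
        if f(xs) != g(xs):
            bad.append(xs)
    return bad


if __name__ == "__main__":
    bad = find_disagreements(median_a, median_b)
    print(f"{len(bad)} disagreements, e.g. {bad[0]}")
```

The disagreement only shows up because the two versions were written separately. Had the second author started from the first one's code, both would share the same behaviour on even-length input, right or wrong.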

Also, your argument seems to be "_maybe_ they will use the exact same setup". So it already looks better than the solution where you provide the code and they _will for sure_ use the exact same setup.

And "publish the details" means explaining the logic, not sharing the exact implementation.

Also, I'm not saying that sharing the code is bad. I'm saying that sharing the code is not the perfect solution, and people who think not sharing the code is very bad usually don't understand the dangers of sharing it.

Nobody said sharing the code "is the perfect solution". Just that sharing the code is way better and should be commonplace, if not required. Your argument that not doing so will force other teams to rewrite the code seems unrealistic to me. If anyone wants to check the implementation they can always disregard the shared code, but having it allows other, less time-intensive checks to still happen: like checking for cherry-picked data, as GP suggested, looking through the code for possible pitfalls etc. Besides, your argument could be extended to any specific data the paper presents: why publish numbers so people can get lazy and just trust them? Just publish the conclusion and let other teams figure out ways to prove/disprove it! - which is (more than) a bit ridiculous, wouldn't you say?

> Just that sharing the code is way better

And I disagree with that. I think you are overestimating the gain from sharing the code and underestimating the problems that sharing the code can bring.

At CERN, there are 2 general-purpose experiments, CMS and ATLAS. The policy is that people from one experiment are not allowed to discuss ongoing work with people from the other. Note that this is officially forbidden, not "if some want to discuss, go ahead, others may choose not to". Why? Because sharing these details ruins the independence of the 2 experiments. If you hear from your CMS friend that they have observed a peak at 125 GeV, you are biased. Even if you are a nice guy and try to forget about it, it is too late, you are unconsciously biased: you will be drawn to check the 125 GeV region and possibly mistake a fluctuation for a peak that you would not have noticed otherwise.

So, no, saying "I give the code but if you want you may not look at it" is not enough; you will still de-blind the community. As soon as some people look at the code, they will be biased: if they try to reproduce from scratch, they will come up with an implementation that is different from the one they would have come up with without having looked at the code.

Nothing too catastrophic either. Don't get me wrong, I think that sharing the code is great, in some cases. But this picture where sharing the code is very important is just a misunderstanding of how science is done.

As for the other "specific data", yes, some data is also better not shared if it is not needed to reproduce the experiment and can be a source of bias. The same could be said about everything else in the scientific process: why is sharing the code so important, and not sharing all the notes of each and every meeting? I think that often the people who don't understand that are software developers, and they don't understand that the code the scientist creates is not the science, it's not the publication, it's just the tool, the same way a pen and a piece of paper were. Software developers are paid to produce code, so code is for them the end goal. Scientists are paid to do research, and code is not the end goal.

But, as I've said, sharing the code can be useful. It can help other teams working on the same subject to reach the same level faster or to notice errors in the code. But in both cases, the consequence is that these other teams are not producing independent work, and this is the price to pay. (And of course, there are degrees of dependence: some publications tend to share too much, others not enough, but it does not mean some are very bad and others very good. Not being independent is not the end of the world. The problem is when someone considers that sharing the code is "the good thing to do" without understanding that.)

What you're deliberately ignoring is that omitting important information is material to a lot of papers, because the methodology was massaged into the desired results to create publishable content.

It's really strange seeing how many (academic) people will talk themselves into bizarre explanations for a simple phenomenon of widespread results hacking to generate required impact numbers. Occam's razor and all that.

If it is massaged into desired results, then it will be invalidated by facts quite easily. Conversely, obfuscating things is also easy if you just provide the whole package and say "see, you click on the button and you get the same result, you have proven that it is correct". Not providing code means that people will write their own implementation and come back to you when they see they don't get the same results.

So, no, no need to invent that academics are all part of this strange crazy evil group. Academics debate and are skeptical of their colleagues' results all the time, which already contradicts your idea that the majority is motivated by fraud.

Occam's razor is simply that there are some good reasons why code is not shared, ranging from laziness to lack of expertise in code design to the fact that code sharing is just not that important (or sometimes plainly bad) for reproducibility. No need to invent that the main reason is fraud.

Ok, that's a bit naive now. The whole "replication crisis" is exactly the term for bad papers not being invalidated "easily". [1]

Because - if you'd been in academia - you'd find out that replicating papers isn't something that will allow you to keep your funding, your job and your path to the next title.

And I'm not sure why you jumped to "crazy evil group" - no one is evil, everyone is following their incentives and trying to keep their jobs and secure funding. The incentives are perverse. This willful blindness to perverse incentives (which appears both in US academia and the corporate world) is a repeated source of confusion for me - is the idea that people aren't always perfectly honest when protecting their jobs, career success and reputation really so foreign to you?

[1]:https://en.wikipedia.org/wiki/Replication_crisis

That's my point: people here link the replication crisis to "not sharing the code", which is ridiculous. If you just click on a button to run the code written by the other team, you haven't replicated anything. If you review the code, you have replicated "a little bit", but it is still not as good as if you had recreated the algorithm from scratch independently.

It's very strange to pretend that sharing the code will help with the replication crisis, when the replication crisis is about INDEPENDENT REPLICATION, where the experiment is redone in an independent way. Sometimes even with an entirely different setup. The closer the setup, the weaker the replication.

It feels like watching the finger that points at the moon: not understanding that replication does not mean "re-running the experiment and reaching the same numbers".

> no one is evil, everyone is following their incentives and trying to keep their jobs and secure funding

Sharing the code has nothing to do with the incentives. I will not lose my funding if I share the code. What you are adding on top of that is that the scientist is dishonest and does not share because they have cheated in order to get the funding. But this is the part that does not make sense: unless they are already established enough to be believed without proof, they will lose their funding, because the funding comes from a peer committee that will notice that the facts don't match the conclusions.

I'm sure there are people who downplay fraud in science. But pretending that fraud is a good strategy for someone's career, and that people commit fraud so massively that this is why sharing the code is rare, is just ignorance of the reality.

I'm sure some people commit fraud and don't want to share their code. But how do you explain why so many scientists don't share their code? Is that because the whole community is so riddled with cheaters? Including cheaters who happen to present conclusions that keep being proven correct when reproduced? Because yes, there are experiments that have been reproduced and confirmed and yet the code, at the time, was not shared. How do you explain that if the main reason to not share the code is to hide cheating?

I've spent plenty of my career doing exactly the type of replication you're talking about, and easily the majority of CS papers weren't replicable using the methodology written down in the paper on a dataset that wasn't optimized and preselected by the paper's author.

I didn't care about shared code (it's not common); I did independent implementations of ML and AI algorithms for the purpose of independent comparison. So I'm not sure why you're getting so hung up on the code part: the majority of papers were describing trash science even in their text, in an effort to get published and show results.