Hacker News
by cgiles 2358 days ago
Biology postdoc here, seconded.

But what really gets me is the disconnect between "most scientists agree there is a reproducibility crisis" and "most scientists believe most of the papers they read are basically true and reproducible". This was mentioned in the survey and it conforms to my informal experience of attitudes.

I do not know how you square that circle. Maybe, because of the pressures you mention, we are all supposed to engage in an informal convention of pretending to believe most previously published work is true, was done correctly, and is reproducible even if we know damn well how unlikely that is. I find it hard to do.

One day the public is going to cotton on to all of this. I cringe every time I hear extremely authoritative lectures on "what scientists say" about highly politicized public policy matters. These are not my fields, but if they are as prone to error, bias, irreproducibility, etc., as my own, I'd exercise some humility. It is one thing for us scientists to lie -- errr, project unwarranted confidence -- to each other for the sake of getting grants, but it is quite another to do it directly to the public.

But when the public does figure it out, what do you think will happen to funding? It will get tighter, and make the problem even worse. We need reform from within before the hammer falls, and quickly.


I don't know that we can reform - if you spend too much time trying, the system will automatically cull you. Personally, I'm lucky enough to be at least slightly insulated from the true scale of the problem by working in theoretical/computational soft matter physics, where the costs/grants/impact factors are comparatively on the small end of things. Medicine and biology seem to be affected the most, for good reason - this is where the messiest problems intersect with the most interest from society at large, so you get the most acutely misaligned incentives.

That being said, communicating limits of applicability and degrees of certainty to a popular audience is hellishly difficult even if you're trying to be perfectly honest. Even in a hypothetical world where we've somehow fixed academia, this will always be a hard problem most scientists are ill equipped to tackle.

I bet a lot would change if the largest funding institutions publicly prioritized funding scientists who have a history of publishing reproducible work.
Sure - for some definition of "reproducible". Any metric you come up with that's easily measurable will be gamed into oblivion. Anything meaningful - like looking at dedicated reproduction studies, etc. - requires a complete change in how academia functions, because right now publishing a paper that just aims to reproduce (or fail to) a previous work is effectively impossible in most fields. Not to mention that given the incentive structure you'd be pretty crazy to even try.
It may well be some kind of law that every positive incentive creates at least one perverse incentive, but that doesn't mean positive incentives don't work or aren't worth trying.
I tend to argue that 10%-20% of a grant should go to reproducibility. This does two things:

1. It would help fund / ensure funding to several labs in the space.

2. It would help ensure reproducibility of results

Yes, it can be gamed, but generally it should be easier to reproduce results so 10%-20% of the funding to follow directions should be okay. Of course, this could lead to one group just constantly doctoring results to show something is not reproducible. In which case, a third lab would need to get some funding to check it out.

First they'd have to fund the reproducibility studies.
Well they shouldn’t have any trouble funding the first one...
They have plenty of trouble funding much of anything.
Completely agreed on all counts.

By "reform from within", I mainly meant the NIH, NSF, big funders, etc, need to pay more than lip service to this before Congress gets involved. Although there are people who have built careers on irreproducibility itself. Ioannidis for example. But that requires a lot of dedication and tends to piss a lot of people off.

Another counterintuitive possibility is that scientific publication could go more of the PLoS route and actually lower standards for initial publication -- and then rely more heavily on post-publication peer review. Then irreproducible papers would get publicly "downvoted". And conversely, papers that didn't seem useful enough at the time to make it into Nature, but turned out to be pathbreaking, would get the recognition they deserve.

Further incentives for publishing as much raw data as possible and, where applicable, code, would help too. The NIH has done a good job here. They require a lot of high-throughput datasets to be made available raw if they were collected with NIH funding. They provide hosting for this. It means people can go back and re-analyze it, and you don't have to trust the authors' analyses.

Full Disclaimer: I am CEO and founder of a software company that works on reproducibility.

The disconnect between "most scientists agree there is a reproducibility crisis" and "most scientists believe most of the papers they read are basically true and reproducible" has extended to every domain in my experience.

With the way the funding model currently works, my hypothesis is that producing more completely reproducible studies will eventually drown out the non-reproducible ones (and the funding model behind them), since funders ultimately vote with their checkbooks.

That seems like a second logical disconnect and completely impossible if the cost model remained the same. However, the question is do studies operate in a way that is maximally efficient for the scientists or for the economy of science publication and knowledge dissemination. We, at MyIRE, believe that technology can tip the scales in the favor of scientists.

Read more about our platform and business plan if you're interested:

https://docs.myire.com

NB: your website looks like "My Ire" is that intentional?

https://www.merriam-webster.com/dictionary/ire

I'll also plug https://www.protocols.io/ and https://codeocean.com/ here --- might be interesting platforms for folks concerned about reproducibility to have on their radar.
> "most scientists agree there is a reproducibility crisis" and "most scientists believe most of the papers they read are basically true and reproducible". [...] I do not know how you square that circle.

I don't understand why these seem contradictory? To me the obvious (maybe naive?) reading is that scientists believe reproducibility is essential to the advancement of their fields, but also believe that being unable to reproduce a result can often be explained by their lack of sufficient information (or resources) to reproduce said results, rather than always assuming the other person must be lying (or just pretty darn lucky). I'm guessing this must've happened because they have actually come across such situations and observed many results that are indeed correct, but just difficult to reproduce. Why can't this explanation square that circle?

It suggests that the scientists are evaluating results using social proof.

In most fields that is fine; social proof is about as good as anything else if someone is studying literature. But the hard sciences are specifically about studying things that are best proven using technical over social proofs.

If social proof is being used excessively then there are grave questions about what the standard of technical proof is. We don't really know from the outside looking in, but the problem being highlighted is that there is technical evidence of a problem and scientists are proceeding based on social proof that there is not. Not a good sign, but also difficult to evaluate from outside the field.

However, the incentive structures built around academics do look risky. Rewarding the most papers and citations isn't a good incentive for high quality work. It incentivises low quality work and collusion.

That is correct. Even when reviewing papers, it is true. As a reviewer, you do not question, for instance, whether the authors did the experiments exactly as stated, or whether they tried to analyze the results 20 different ways until they found the way that looked best.

You take it on trust that they did these things correctly, and focus on whether their conclusions are justified from their data.
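To put a rough number on the "20 different ways" worry, here is a minimal sketch. It assumes the analyses are independent and each is judged against a 0.05 threshold when there is no real effect. Both are simplifying assumptions of mine (different analyses of the same data are correlated in practice), and the function name is made up for illustration.

```python
def prob_any_false_positive(n_analyses: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent null tests falls below alpha."""
    # Each null test stays above alpha with probability (1 - alpha);
    # all n stay above it with probability (1 - alpha) ** n.
    return 1.0 - (1.0 - alpha) ** n_analyses


for n in (1, 5, 20):
    print(n, round(prob_any_false_positive(n), 3))
```

Under those assumptions, trying 20 analyses gives roughly a 64% chance that at least one looks "significant" by luck alone, and a reviewer who only sees the winning analysis has no way to tell.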

If reviewers take so much on trust, how much more so readers, then? There are very, very few actual standards of technical proof that are in play here.

Particularly when a paper says something that isn't particularly novel. If the results match "prior probabilities" from the literature, the paper will be believed without much question. If it doesn't, it will get more scrutiny. People quickly learn that it is easier to publish when your result fits the status quo.

Take a look at slide 10 of [1]. There are numerous examples like this where physical constants were first measured as a value, and gradually trended up over a long period of time towards today's "true" value. If the experiments were really independent, they would, generally, scatter randomly before converging on the true value. The fact that they did not, suggests investigators were using methods to "smooth" the difference between their data and prior findings.

And thus, we get self-perpetuating cycles of groupthink. And that, in turn, is why supposedly independent experiments cannot so easily be taken as independent points of evidence for an overall hypothesis.
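A toy simulation of the difference between independent and "smoothed" measurements, in case it helps. Everything here is an assumption for illustration: the true value of 100, the first report of 90, the noise level, and especially the `weight_on_prior` rule, which is just one simple way to model a lab nudging its number toward the last published one. It is not taken from the slides.

```python
import random


def independent_series(true_value, noise_sd, n, rng):
    """Each lab reports its own noisy measurement, ignoring earlier papers."""
    return [true_value + rng.gauss(0, noise_sd) for _ in range(n)]


def anchored_series(true_value, first_report, noise_sd, n, weight_on_prior, rng):
    """Each lab pulls its own measurement toward the last published value."""
    reports = [first_report]
    for _ in range(n - 1):
        own = true_value + rng.gauss(0, noise_sd)
        # Hypothetical "smoothing": a weighted blend of prior report and new data.
        reports.append(weight_on_prior * reports[-1] + (1 - weight_on_prior) * own)
    return reports


rng = random.Random(0)  # fixed seed so the run is repeatable
independent = independent_series(100.0, 2.0, 10, rng)
anchored = anchored_series(100.0, 90.0, 2.0, 10, 0.8, rng)
print("independent:", [round(v, 1) for v in independent])
print("anchored:   ", [round(v, 1) for v in anchored])
```

With numbers like these, the independent reports land on both sides of the true value, while the anchored ones tend to creep toward it from one side, which is the pattern described above.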

[1] https://www.pas.rochester.edu/~sybenzvi/courses/phy403/2015s...

I'm curious about the ethical implications of your last point.

The maximally pessimistic view, which you and Feynman seem to be espousing, is that people explicitly "put their thumb on the scale" so that they get the right number. That's clearly bad.

The PDF you linked presents it as a more emergent phenomenon, driven by how people usually work. It's possibly an argument for working more slowly and carefully, or the use of pre-registration, but it seems ethically neutral.

Finally, you could think about this as a form of Bayesian updating, in which each experiment nudges our previous best estimate of the value. Obviously, it would be better to do this formally, but it does seem more rational than completely discarding the past.
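For what "doing this formally" might look like, here is a minimal sketch of the textbook normal-normal update. It assumes the prior estimate and each measurement are Gaussian with known variances; the function name and the numbers are hypothetical.

```python
def update_normal(prior_mean, prior_var, measurement, measurement_var):
    """Combine a Gaussian prior with one Gaussian measurement of known variance."""
    prior_precision = 1.0 / prior_var
    meas_precision = 1.0 / measurement_var
    post_var = 1.0 / (prior_precision + meas_precision)
    # Posterior mean is the precision-weighted average of prior and measurement.
    post_mean = post_var * (prior_precision * prior_mean + meas_precision * measurement)
    return post_mean, post_var


# Hypothetical sequence: an early estimate, then three new measurements.
mean, var = 90.0, 4.0
for measurement, measurement_var in [(99.0, 4.0), (101.0, 1.0), (100.0, 1.0)]:
    mean, var = update_normal(mean, var, measurement, measurement_var)
    print(round(mean, 2), round(var, 3))
```

The difference from informal "nudging" is that the weight given to the past is set by its stated uncertainty, so a precise new measurement can move the estimate a long way in one step.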

The PDF takes the point of view that it is caused by various forms of cognitive bias. I generally agree with that.

The reason that I don't think it's totally ethically neutral is that it is a basic responsibility of scientists to be on guard against cognitive biases to the best of their ability. It's possibly even the main feature that separates science from non-science.

Cognitive biases can become ethically bad particularly when they intersect with a person's personal interests. For example, if an investigator thinks "I won't be able to publish this result as easily if it diverges too much from the historical values, so I'll just run this experiment again", this is a problem. Even if it occurs totally subconsciously, it is a breach of duty because the scientist should take great care to avoid this kind of thing.

It could be viewed as Bayesian updating, yes. But my main point is that it greatly complicates the process of literature review and knowing how much certainty to assign to a scientific finding. If there are 10 papers saying X, but each one is highly dependent on the last, there is much less evidence for X than there appears to be, particularly to an outsider looking in.
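A back-of-the-envelope way to see how much the "10 papers" shrink: a sketch using the standard formula for the variance of an average of equally correlated estimates. The equal pairwise correlation `rho` is a simplifying assumption of mine (a chain where each paper leans on the last would be modeled differently), and the values of `rho` are made up.

```python
def effective_independent_papers(n_papers: int, rho: float) -> float:
    """Number of fully independent papers carrying the same evidence as
    n_papers whose results share a common pairwise correlation rho."""
    # Var(mean) = (sigma^2 / n) * (1 + (n - 1) * rho), so n_eff = n / (1 + (n - 1) * rho).
    return n_papers / (1.0 + (n_papers - 1) * rho)


for rho in (0.0, 0.5, 0.8, 1.0):
    print(rho, round(effective_independent_papers(10, rho), 2))
```

Under that assumption, ten papers correlated at 0.8 are worth a little over one independent paper, and an outsider counting citations would never see the difference.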

> You take it on trust...

Is it trust? Or is it closer to "I'll scratch your back today, if you scratch my back tomorrow."? That is, if you blow the whistle (so to speak) on someone, that's bad community karma, and that will come back to haunt you.

That's not trust. That's a cartel.

It’s trust.

Specifically, you need to trust that when the authors claim to have reared mice in a high-oxygen environment, trained a monkey to move a joystick, or whatever the paper says, they did something like that. You can, and should, ask to see data demonstrating that they did it well, like oxygen levels in the mouse cage or trajectories produced by the monkey. However, unless those values are bizarre, it’s virtually impossible to know if they’re real or completely made up. Realistically, no one is going to fly you out so you can “audit” an experiment or record and review thousands of hours of surveillance footage.

When the act is mutually beneficial to both it's not trust. There is a clear incentive here for the reviewer to be less than thorough.

Put another way, the starting hypothesis should be: this study is flawed. The reviewer should then approach it as such.

Not only isn't it trust. It's a violation of the scientific method. Yeah, sadly ironic (read: hypocritical).

> but also believe that being unable to reproduce a result can often be explained by their lack of sufficient information (or resources) to reproduce said results, rather than always assuming the other person must be lying (or just pretty darn lucky).

Most certainly, when you go to replicate something, and fail, your first assumption is that you did something wrong. Even the second or third time. But after enough tries, you start to look for another explanation. "Irreproducibility", as Nature and the common scientist would use the term, doesn't mean "I tried once and failed", it means "I tried everything I could possibly think of and it still doesn't work".

Also, for many types of experiment it is questionable whether they are reproducible even in principle. One of the common analyses I do is called RNA-seq. It quantifies, for each of ~25K genes, whether and how much it went up or down with a given perturbation.

So the result of such an experiment is a 25K long list with numbers attached to each gene. What would it mean to replicate such a result? Surely the list of significant genes and pathways will not be identical even if a robot were to perform exactly the same experiment, due to biological variability.

Some people find this property of nonfalsifiability to be convenient, however...
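To illustrate why two faithful runs of the same experiment need not produce the same gene list, here is a toy simulation. It is not a model of real RNA-seq statistics (which work on read counts); the effect sizes, noise level, threshold, and the 10% fraction of truly changed genes are all assumptions chosen for illustration.

```python
import random


def simulate_significant_genes(true_effects, noise_sd, threshold, rng):
    """Return indices of genes whose noisy observed effect exceeds the threshold."""
    return {
        gene
        for gene, effect in enumerate(true_effects)
        if abs(effect + rng.gauss(0, noise_sd)) > threshold
    }


def jaccard(a, b):
    """Overlap of two sets: size of intersection divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


rng = random.Random(1)  # fixed seed so the run is repeatable
# ~25K genes; assume about 10% truly respond to the perturbation.
true_effects = [rng.gauss(0, 1.5) if rng.random() < 0.1 else 0.0 for _ in range(25_000)]

# Same biology, same protocol, two runs differing only in random variability.
run_a = simulate_significant_genes(true_effects, 1.0, 2.0, rng)
run_b = simulate_significant_genes(true_effects, 1.0, 2.0, rng)
print(len(run_a), len(run_b), round(jaccard(run_a, run_b), 2))
```

Even with identical underlying effects, the two "significant" lists only partly overlap, so any definition of replication for this kind of result has to be statistical (how much overlap is enough?) rather than a demand for an identical list.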

> "Irreproducibility", as Nature and the common scientist would use the term, doesn't mean "I tried once and failed", it means "I tried everything I could possibly think of and it still doesn't work".

I’m not a scientist, but it is surprising to me that the term would not be regarded as at least ambiguous. The article quotes a microbiologist who says “there is no consensus on what reproducibility is or should be.”

> But what really gets me is the disconnect between "most scientists agree there is a reproducibility crisis" and "most scientists believe most of the papers they read are basically true and reproducible". [...] I do not know how you square that circle.

If I was an academic, I imagine I would think:

1. Well, most of the papers I read are widely cited, reputable papers in widely read, reputable journals. A paper with 50 citations couldn't possibly be unreproducible!

2. Everyone knows there are scam journals and conferences that will accept everything. I'm sure it'd be easy to get a bad work accepted there, but I don't read anything like that.

3. And anyway, aren't the problems mostly in other fields? Everyone knows there are problems in comparative international underwater sports broadcasting studies, not serious subjects like mine.

I don’t think the public is going to cotton on. The incentives that created the reproducibility crisis (the application of demand-driven managerialism to education and research generally) have also compromised the public’s ability to observe and care about the problem. Changes to the funding model for basic research will continue to be driven by bureaucrats, not politicians responding to public pressure.
I agree with the last two paragraphs of your post viz. humility and reform.

> the disconnect between "most scientists agree there is a reproducibility crisis" and "most scientists believe most of the papers they read are basically true and reproducible"

Could it be that many scientists think: "there is a crisis, but not in my field!"? To be honest, I personally think this quite often (and then realize I am being naive).

> But what really gets me is the disconnect between "most scientists agree there is a reproducibility crisis" and "most scientists believe most of the papers they read are basically true and reproducible".

It sounds very much like an academic equivalent of Gell-Mann Amnesia - the phenomenon in which you notice that every news article on a topic in your field is complete garbage, people you know in other professions report the same situation with articles on topics in their fields, and yet when you turn the page and see an article on something outside your field, you forget about the whole thing and treat it as a gospel.

I find the science crisis even more worrying now that society and the Internet spread the notion of "sourced everything or it's false".

If even 'scientists' themselves fail, then society is heading toward an absurdly frightening faux-intellectual inquisitive period.

Agreed! You can see it with “fact checking” websites, which aren’t about nuance, just a political position that argues “the part of the truth we want you to spread”.
I wouldn't conclude that much. I think they just believe that a paper is an absolute truth, but most of them don't have enough knowledge about the ways and history of the scientific field. To me it's mostly an anxiety-based reaction due to this era's lack of 'promises' and emerging problems. People are running for certainty.
I would.

Fact checking, unfortunately, isn’t what we think it is. Despite the superficial appearance, fact checking isn’t a helpful tool for determining the truth and for forming an accurate opinion. Instead, it’s actually an in/out group filter which segregates people by belief and value, while allowing each group to believe they hold the Factual High-Ground, and to claim any subsequent moral position which proceeds from being “factually correct.”

> Despite the superficial appearance, fact checking isn’t a helpful tool for determining the truth and for forming an accurate opinion

Sure it is. Not the bare conclusions viewed uncritically, but the support with references, if it exists (which in most notable fact checkers it does) absolutely is.

Isn't it the exact opposite? Presumably the papers scientists are reading are primarily the ones in their own field.
That depends on how critical you are. Our current incentive structure is out of control across the board and everything is skewing towards maximizing ROI in about every aspect of life money is involved in.

I apply this perspective to everything I see ("how are they getting money from this?") then work backwards and it seems to lead to accurate predictions, at least anecdotally. Perhaps I'm just jaded and cynical but it works well, unfortunately.

> Perhaps I'm just jaded and cynical but it works well, unfortunately.

It works well for me too.

I've learned to extend the "how are they getting money from this?" question into a generalized, first-principles thinking - look at the incentive structures and think about what they imply, about what's the expected behavior of a system running under those incentives. Because while not everything is about the money, systems will evolve over time along the lines of the incentives contained in them.

I'm arguing that the phenomena seem identical in their underlying structures. In both cases, we're dealing with a situation in which a person faces mountains of evidence that a source (a newspaper, or scientific papers in a given field) keeps pumping out inaccurate or wrong publications, but despite all that evidence, they assume that whatever isn't explicitly pointed out as wrong must be 100% true, accurate and honest.
Ah, yes. Wishful thinking.
Ideally, but once you are deep in a field what a field even is becomes murky. You have a biology paper you submit for review, but methodologically it's not really a biology paper, but maybe a statistics paper, or a computer science paper, or a theoretical math paper. Who reviews that paper, the person who knows the biology in question or the methods through and through? Sometimes the only person in the world who knows the theory and the technology in question the best is the author.
> people you know in other professions report the same situation with articles on topics in their fields

The speculation referred to as the "Gell-Mann Amnesia effect" starts with a subject identifying a low-quality article riddled with errors from a section of the newspaper within their area of expertise. They then turn the page to another section of the newspaper outside their area of expertise and "treat it as gospel" without thinking critically.

Maybe it's true, maybe it's not. But you are adding another part-- colleagues who use relevant expertise to inform the subject that the other sections of the newspaper are also low-quality and riddled with errors.

With that addendum I'm confident that I can now beat the house. I will take the bet against this modified Gell-Mann Amnesia effect for any amount the casino is willing to let me wager.

Edit: clarification

> One day the public is going to cotton on to all of this.

I mean, the flat-earth, anti-vax, climate-denier folks are well and cottoned. In the US, one of the major political parties, which controls a fair number of federal and state legislative seats, is pretty sure 'scientists' are just liars and should be treated as such.

These chickens aren't going to come home to roost, they are roosting and their chicks are hatching already.

In a thread where the topic is “lots of science might be wrong” your takeaway is “doesn’t matter, everything is Republicans’ fault” and not “we should slow down on using complex non-reproducible science to drive public policy because we know lots can’t be confirmed and there is no great way to tell what is definite” ... well I suppose you demonstrated why it’s a “crisis”.
“One day the public is going to cotton on to all of this. I cringe every time I hear extremely authoritative lectures on "what scientists say" about highly politicized public policy matters. [...] I'd exercise some humility.”

You shouldn’t be cringing. You should be educating people about how science actually works, and how it simply doesn’t matter very much whether any particular paper is reproducible. It’s a straw-man argument, because most papers aren’t worth reproducing. I don’t know many good scientists who take what they read (in any journal) at face value. If you do, you’ve been misled somewhere during your training (fwiw, I also have a PhD). At best, papers are sources of ideas. Interesting ideas get tested again. Most get ignored. Even if a few hokum theories become popular for a while, eventually they’re revealed for what they are.

The tiny percentage of subjects that rise to the level of public policy discussion end up being so extensively investigated that reproduction of results is essentially guaranteed. And yeah, you hear lots of silly noise from university PR departments, but that stuff is a flash in the pan.

For example, nobody legitimate is in doubt of the broader facts of global climate change or evolution or vaccination, even if 95% of (say) social science results turn out to be complete bunk. Yet climate deniers, anti-vaxxers and “intelligent design” trolls absolutely love it when this distinction is ignored, because it allows them to confuse the public on the legitimacy of science as a process.

It's true that science doesn't value every published paper equally, but it's also true that publish or perish is creating ever-growing mountains of worthless papers. This is a real problem, and drags the quality of everything down.

Besides, the fact that there isn't one reputable journal in most fields that remains untarnished by the replication crisis is both a practical problem and a problem of public trust. A lot of this BS science is paid for directly by the public's tax dollars, or else by their student loans. I wouldn't expect the public to be so forgiving if 95% of it is bunk.

The public is oblivious at the moment because science still has credibility. It's going to take a major catastrophic misstep of science (which will inevitably happen the way things are going) for the public to lose trust.
“it's also true that publish or perish is creating ever-growing mountains of worthless papers.”

Is this true? Prove your claim.

“This is a real problem, and drags the quality of everything down.”

Let’s assume your first assertion is true. Is it automatically true that your second claim follows? Why?

I see no evidence that the individual productivity of scientists has changed much in the last 30 years, nor do I notice much of a change in the aggregate quality of science. Crappy science existed hundreds of years ago, and it continues to exist today. The main difference, as far as I can tell, is that we have a lot more scientists now.

In any case, these are just assertions, not arguments.

You would think crappy science would go down with progress? I think not.

You should look for papers in psychology/sociology, AI (recommendation engines, accuracy), economics, nutrition and medicine. Marketing papers are also interesting, I guess.

As an anecdote, I dug into sexuality, gender papers recently and was baffled at the amount of shit I came across. I couldn't believe someone published it.

> Crappy science existed hundreds of years ago, and it continues to exist today. The main difference, as far as I can tell, is that we have a lot more scientists now

And we have the ability to influence a lot more people faster than ever before, with an increasing level of distrust. Not to mention that papers from the US and EU perhaps affect other nations even more. Many people blindly piggyback on them because there isn't enough funding to replicate or perform our own analysis. Scarce funding promotes hype and makes people flock towards whatever the media popularizes.

It results in a very weird disconnection on topics that are dependent on population, history, culture, and other location sensitive data. The base is contaminated, anything built on it is not going to suddenly turn into truth.

There have always been weaker scientists, but there hasn't always been the economic incentive to publish in order to maintain a teaching position. This is a relatively recent (few decades) thing and is due to structural factors in academia and society. If you require proof, I'm not really sure what to say to you, as evidence is not hard to find, but if you don't already see it, I'm unlikely to change your mind.
If you want to claim that “publish or perish” (which, btw, has been a part of academic life essentially forever) is somehow recently affecting the volume of papers being produced, you should be able to provide evidence of that in a straightforward manner. One obvious test: is the per-capita rate of publication increasing? (my experience says “no”, but I’m open to contrary evidence.)

You have a hypothesis of what’s going on, but you’ve provided no evidence for that hypothesis, and when challenged to provide some, you tell other people it’s their job to do it for you.

It’s not my job to prove your extraordinary claims.

The number of publications per year per academic seems to me to have increased over the last 50 years. I don't have a citation.

Regardless, my original claim was that the absolute number of papers is growing, and most of them are trash. I think the sheer volume of trash has consequences that were not so serious 100 years ago, even if the percentage of trash was the same. I strongly suspect the percentage of trash has been going up, as well.

My argument is that "publish or perish" makes less and less sense the more active scientists and researchers there are, even if the average quality and the rate of publication per academic were constant, because the appetite and rate at which research can be assimilated by society is limited, and does not scale with population, while the number of scientists does.

I don't think these claims are extraordinary, and if you do, I'm not going to go looking for extraordinary evidence to try to convince you. I don't think I'm the only one who sees these effects, however.

> One obvious test: is the per-capita rate of publication increasing?

It definitely is, in biology at least. My graduate mentor was really interested in publication metrics (as in, he published studies on them). The main driver is not necessarily crappy journals though, it is the increasing number of authors per paper.
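A quick arithmetic sketch of the author-count effect. The numbers are entirely made up, and the two counting schemes (full credit per author vs. credit split across authors) are just the usual ways of tallying, not anything from my mentor's studies.

```python
def papers_per_scientist_full_credit(n_papers, authors_per_paper, n_scientists):
    """Per-capita count when every listed author gets credit for a whole paper."""
    return n_papers * authors_per_paper / n_scientists


def papers_per_scientist_fractional(n_papers, n_scientists):
    """Per-capita count when each paper's credit is split among its authors."""
    return n_papers / n_scientists


# Hypothetical field: same 1000 papers and 1500 scientists, but author lists double.
for authors_per_paper in (3, 6):
    print(
        authors_per_paper,
        papers_per_scientist_full_credit(1000, authors_per_paper, 1500),
        round(papers_per_scientist_fractional(1000, 1500), 2),
    )
```

So CV-style publication counts can double without a single extra paper being written, which is why the per-capita trend depends heavily on how you count.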

I have no idea how you would evaluate something like "the average quality of papers is decreasing". I actually agree with GP that it is, but that's just, like, my opinion, man.

The number of citations/references has ballooned as well. So, if papers stand on more shoulders, most of which are crappy shoulders, they'll make crappy science.
Reminds me of the few anti-vaxxers and trolls who grab those research papers to recruit people. If you see their arguments without contrary points, it's not impossible to fall for them.
> it simply doesn’t matter very much whether any particular paper is reproducible

If we are talking about highly abstract types of science, I agree entirely. The problem is that there are strong incentives for groupthink, even where public politics aren't involved.

For example, I'm involved in the aging field. One of the popular aging hypotheses was oxidative stress. Because of the number of scientists with careers invested in that hypothesis, research kept on for well over a decade after it was debunked. In fact, I work with a person who cowrote the paper that authoritatively debunked it over a decade ago, and that person still studies it and writes as if it were still true!

How much more so if the subject is politicized beyond a narrow community of scientists, then. I do not want to get into a political debate here, but evolution and vaccinations have over a century of scrutiny, whereas other fields do not.

Another example relatively close to my area is nutrition. Scientists have been totally convinced that fat is bad, sugar is bad, both are bad, all things are good in moderation... Even the public considers nutrition to be a joke for this reason. It is not enough for a community of scientists to agree on something, a fact needs time to "settle".

I agree totally with the "process vs individual paper" distinction, I would just propose the heuristic that "the more politicized the subject, the longer the process takes".

> I don’t know many good scientists who take what they read (in any journal) at face value

Sure, in journal club people tear apart papers. And maybe in private conversation. But then, on the other hand, look at the statistics in this survey. Or look at the way these same papers that they might privately pooh-pooh will be uncritically cited in a grant application or paper if they support their hypothesis.

"The tiny percentage of subjects that rise to the level of public policy discussion end up being so extensively investigated that reproduction of results is essentially guaranteed."

I really don't think this is true.

Ideas are war in politics, and so the truth generally is the first casualty.

Golly I wish bureaucrats would pay more attention to the nuances of science.

Science is not politics, nor should it change in response to political forces.

The root problem is not that science has changed, but that you’re seeing political attacks on science.

> Golly I wish bureaucrats would pay more attention to the nuances of science.

Instead we get a PR campaign to put a self-described unstable 16yo who can “literally see invisible CO2” on the world stage frowning and yelling about stolen childhood.

I have no hopes for political reforms to actually look at nuance while this nonsense seems to work.

This rings true. But I think basic science is poor on interesting ideas. That’s due to the pursuit of ‘minimal publishable ideas’ and the consequent lack of conviction/long term perseverance. It’s easier to follow the next trendy thing than to exhaust a search space.