by tgv 1833 days ago
And that's without knowing how the study was really executed. Where I worked, there was a relatively successful postdoc who presented the results of his p<.05 significant pilot study. When asked, he said it wasn't the first pilot. It was the 20th.
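To put a rough number on the "20th pilot" anecdote: assuming the pilots were independent and there was no real effect (both assumptions are mine, not something the anecdote states), the chance that at least one of 20 pilots comes out at p<.05 by luck alone is about 64%. A minimal sketch of that arithmetic, with a made-up function name:

```python
def prob_at_least_one_false_positive(alpha: float, n_studies: int) -> float:
    """Chance that at least one of n independent null studies yields p < alpha.

    Assumes independent studies and a true null effect in each of them.
    """
    return 1.0 - (1.0 - alpha) ** n_studies


# Compare a single pilot with repeated attempts at the same threshold.
for n in (1, 5, 20):
    print(n, round(prob_at_least_one_false_positive(0.05, n), 3))
```

So a "significant" 20th pilot is closer to the expected outcome than to evidence of anything.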

In the department I worked for as a faculty member, there was an issue where the other areas would have graduation rates around 40-50%, as opposed to our area which was above 95% or so. This came up in the context of an external review.

When people asked around, informally what was said was that the grad students in the other areas (especially one area, in the experimental molecular biosciences) would leave after having to "redo" their dissertation over and over again. Essentially what would happen is they would propose a dissertation study, it would be approved by the area committee, the student would do the study, and it would produce null results. So they would be told to redo it a different way, or to pick a different topic, it would get approved, and the same process would happen again. After this happened a few times, with the student being told they had to produce significant results, the student would grow despondent and leave the program.

What's sad about this is that it's formally reinforcing p-hacking basically, as part of the degree program. But it's even more absurd than what's often alluded to in meta-science writings, because in these cases you would have a formal graduate committee, composed of faculty, deciding that the dissertation thesis is a good one -- that the hypothesis and design are solid, and formally approving the dissertation proposal -- and then because the results are null, it's unacceptable. If this was being done so casually in that forum, I can't imagine what goes on behind the scenes.

What makes that even more absurd in my book (apart from fostering an unhealthy academic culture) is that the purpose of a doctoral dissertation is essentially to show that you can do proper original research.

Getting a null result doesn't invalidate that in any way.

Exactly. In our area, the dissertation itself and defense were evaluated on the presentation of the research, not on the results, provided that they followed the proposed plan.

If you have a committee of experts who carefully evaluate a proposal and decide it's good, the results are as they are.

Broadening the discussion a bit, it seems one feature of science, as opposed to, say philosophy, is that the conclusions regarding a hypothesis are not knowable a priori. I think in contemporary academics there's some implicit idea that the quality of a researcher lies in their ability to identify hypotheses that are "correct", as opposed to simply following through with good but ultimately "incorrect" hypotheses. There's a bit of a roll of the dice involved with science; if there isn't, it's not science.

> it seems one feature of science, as opposed to, say philosophy, is that the conclusions regarding a hypothesis are not knowable a priori.

That should be a defining characteristic of any academic inquiry, regardless of whether it's science or not.

I have no training in non-quantitative fields, and my academic experience is from CS, where the "science" part is often so-so. As such, this should be taken partially as a layman's view. However, my impression is that while research in non-scientific academic fields doesn't necessarily take the form of explicit hypothesis testing, more or less similar criteria for intellectual inquiry should apply.

The research might be more about observation and critical (often non-quantitative and non-absolute) evaluation of arguments, and as such the validity of the methods (such as whether the hypothesis is assumed or genuinely questioned) might not always be as easy to judge [1].

The process might not be as easily formalized or judged as in science, but the mentality of critical inquiry should be similar. If the hypothesis is assumed and not questioned, that's no longer any kind of academic inquiry. It becomes politics, in the pejorative sense.

> I think in contemporary academics there's some implicit idea that the quality of a researcher lies in their ability to identify hypotheses that are "correct", as opposed to simply following through with good but ultimately "incorrect" hypotheses.

I think that's partially just psychology and human nature. We like results that make us directly know (or think we know) more, and we like results that basically tell us we still don't know less. Few people like uncertainty.

Society outside academia certainly values the former more than the latter, and funding and other external incentives probably exacerbate the underappreciation of negative results.

[1] Or perhaps it is, to an expert, but having that judgment would require the kind of experience in those fields that I don't have.

I have a similar horror story.

A medical student worked hard to analyze, say, 40 x-rays out of hundreds available. He found no significant evidence for some hypothesis. When he told his supervisor, the reply was: "Well then you should just analyze some more x-rays. I'm sure you'll have a statistically significant result at some point."
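The supervisor's advice is what's usually called optional stopping, and it's easy to simulate why it inflates false positives. Here is a rough sketch, not the actual study: I'm assuming there is no real effect, standard-normal measurements with known variance (so a plain z-test works), a first look at 40 observations, and a re-test after every 10 more up to 300. All function names and numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist


def two_sided_p(total: float, n: int) -> float:
    """Two-sided z-test p-value for H0: mean 0, with known standard deviation 1."""
    z = total / math.sqrt(n)  # equals sample mean * sqrt(n)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def significant_with_peeking(rng, start=40, step=10, max_n=300, alpha=0.05):
    """Keep adding null data and re-testing; stop at the first p < alpha."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(start))
    n = start
    while True:
        if two_sided_p(total, n) < alpha:
            return True
        if n >= max_n:
            return False
        total += sum(rng.gauss(0.0, 1.0) for _ in range(step))
        n += step


def significant_fixed_n(rng, n=40, alpha=0.05):
    """Test once, at a sample size fixed in advance."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n))
    return two_sided_p(total, n) < alpha


def false_positive_rate(test, trials=2000, seed=0):
    """Fraction of simulated null studies that the given procedure calls significant."""
    rng = random.Random(seed)
    return sum(test(rng) for _ in range(trials)) / trials


print("fixed n:", false_positive_rate(significant_fixed_n))
print("peeking:", false_positive_rate(significant_with_peeking))
```

Under these assumptions the fixed-n rate should sit near the nominal 5%, while the "just analyze some more" procedure should come out well above it, even though nothing is there to find.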

Which is why the required p-value needs to be proportional to some nontrivial inverse factor of the number of experiments done in the field. Are there 10000 researchers doing 20000 experiments per year? Take 20000, multiply by the number of years we expect an academic to do hands-on work (30?); invert: you need a p-value better than 1:600000.
This! Bonferroni correction across all conducted studies/analyses. I will suggest this next time I am reviewing a paper with crazy claims and shitty statistics.
I know this is mostly a joke, but what you really want is Benjamini–Hochberg correction, unless you want to prevent even a single false discovery in all of science. FDR vs FWER
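For anyone who hasn't seen the two corrections side by side, here is a minimal sketch in plain Python. The p-values are made up for illustration and the function names are mine. Note that a Bonferroni threshold at a family-wise 0.05 over the 600000 experiments from the comment above would be 0.05/600000, roughly 8.3e-8, which is stricter than the 1:600000 quoted there.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Controls FWER: reject only when p <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Controls FDR (step-up): reject the k smallest p-values, where k is the
    largest rank with p_(k) <= k / m * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            reject[idx] = True
    return reject


# Arithmetic from the thread: 20000 experiments per year over 30 years.
total_experiments = 20_000 * 30
bonferroni_threshold = 0.05 / total_experiments

# Made-up p-values, purely illustrative.
p_values = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9]
print("Bonferroni rejections:", sum(bonferroni_reject(p_values)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg_reject(p_values)))
print("Bonferroni threshold for the thread's numbers:", bonferroni_threshold)
```

On these toy numbers Bonferroni keeps only the smallest p-value while Benjamini-Hochberg keeps the three smallest, which is the FWER vs FDR trade-off in miniature.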
Somehow one of the 3 experimental groups has 16 people, whereas the other groups have 14 people. Wonder why...
For what it's worth, studies do end up with different numbers of people per group due to dropouts, people not matching criteria that are only checked later, the sample not being divisible by the number of groups, etc.
Yes, there are legit reasons for this to happen, as well as cherry-picking.