| HN Mirror

by cgiles 2398 days ago

That is correct. Even when reviewing papers, it is true. As a reviewer, you do not question, for instance, whether the authors did the experiments exactly as stated, or whether they tried to analyze the results 20 different ways until they found the way that looked best.

You take it on trust that they did these things correctly, and focus on whether their conclusions are justified from their data.

If reviewers take so much on trust, how much more so readers, then? There are very, very few actual standards of technical proof that are in play here.

This is particularly true when a paper says something that isn't especially novel. If the results match "prior probabilities" from the literature, the paper will be believed without much question. If they don't, it will get more scrutiny. People quickly learn that it is easier to publish when your result fits the status quo.

Take a look at slide 10 of [1]. There are numerous examples like this where a physical constant was first measured at some value, and later measurements gradually trended up over a long period of time towards today's "true" value. If the experiments were really independent, they would, generally, scatter randomly around the true value as they converged on it. The fact that they did not suggests investigators were using methods to "smooth" the difference between their data and prior findings.
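To make the "scatter vs. trend" argument concrete, here is a toy simulation. It is only a sketch under made-up assumptions: the function names, the `anchor_weight` parameter, and all the numbers are hypothetical and are not taken from the linked slides. Each experiment gets an honest noisy measurement, but publishes a blend of that measurement and the previously published value.

```python
import random


def simulate(n_experiments, true_value, first_estimate, noise_sd,
             anchor_weight, seed=0):
    """Return the sequence of published values.

    anchor_weight = 0.0 means each experiment reports its own measurement
    (fully independent). Values near 1.0 mean each experiment mostly
    repeats the previously published number (hypothetical model of
    "smoothing" towards prior findings).
    """
    rng = random.Random(seed)
    published = [first_estimate]
    for _ in range(n_experiments - 1):
        measurement = rng.gauss(true_value, noise_sd)
        blended = (anchor_weight * published[-1]
                   + (1.0 - anchor_weight) * measurement)
        published.append(blended)
    return published


def fraction_below_truth(values, true_value):
    """Share of published values that fall below the true value."""
    return sum(v < true_value for v in values) / len(values)


if __name__ == "__main__":
    truth = 100.0
    independent = simulate(200, truth, 90.0, 1.0, anchor_weight=0.0)
    anchored = simulate(200, truth, 90.0, 1.0, anchor_weight=0.9)
    print("independent, fraction below truth:",
          fraction_below_truth(independent, truth))
    print("anchored, first 15 published values:",
          [round(v, 2) for v in anchored[:15]])
```

In the independent case the published values land on both sides of the true value right away. In the anchored case they stay on the same side as the first (wrong) estimate for many rounds and creep towards the truth, which is the pattern described above.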

And thus, we get self-perpetuating cycles of groupthink. And that, in turn, is why supposedly independent experiments cannot so easily be taken as independent points of evidence for an overall hypothesis.

[1] https://www.pas.rochester.edu/~sybenzvi/courses/phy403/2015s...

2 comments

mattkrause 2397 days ago

I'm curious about the ethical implications of your last point.

The maximally pessimistic view, which you and Feynman seem to be espousing, is that people explicitly "put their thumb on the scale" so that they get the right number. That's clearly bad.

The PDF you linked presents it as a more emergent phenomenon, driven by how people usually work. It's possibly an argument for working more slowly and carefully, or the use of pre-registration, but it seems ethically neutral.

Finally, you could think about this as a form of Bayesian updating, in which each experiment nudges our previous best estimate of the value. Obviously, it would be better to do this formally, but it does seem more rational than completely discarding the past.
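For what "doing this formally" might look like, here is a minimal sketch that assumes both the prior estimate and the new measurement are Gaussian with known variances (that assumption, and the function name `update`, are mine, not something from the PDF). The posterior mean is a precision-weighted average of the old estimate and the new measurement.

```python
def update(prior_mean, prior_var, measurement, measurement_var):
    """Gaussian conjugate update with known variances.

    Returns (posterior_mean, posterior_var). Precision is 1 / variance,
    so the more precise of the two inputs pulls the result harder.
    """
    if prior_var <= 0 or measurement_var <= 0:
        raise ValueError("variances must be positive")
    prior_precision = 1.0 / prior_var
    measurement_precision = 1.0 / measurement_var
    posterior_var = 1.0 / (prior_precision + measurement_precision)
    posterior_mean = posterior_var * (prior_precision * prior_mean
                                      + measurement_precision * measurement)
    return posterior_mean, posterior_var


if __name__ == "__main__":
    # Equal confidence in the prior and the new measurement:
    # the estimate moves halfway and the variance halves.
    print(update(0.0, 1.0, 2.0, 1.0))
```

The difference from informal nudging is that the weight given to the past is stated up front and depends only on the reported uncertainties, not on how uncomfortable the new number feels.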

cgiles 2394 days ago

The PDF takes the point of view that it is caused by various forms of cognitive bias. I generally agree with that.

The reason that I don't think it's totally ethically neutral is that it is a basic responsibility of scientists to be on guard against cognitive biases to the best of their ability. It's possibly even the main feature that separates science from non-science.

Cognitive biases can become ethically bad particularly when they intersect with a person's personal interests. For example, if an investigator thinks "I won't be able to publish this result as easily if it diverges too much from the historical values, so I'll just run this experiment again", this is a problem. Even if it occurs totally subconsciously, it is a breach of duty because the scientist should take great care to avoid this kind of thing.

It could be viewed as Bayesian updating, yes. But my main point is that it greatly complicates the process of literature review and knowing how much certainty to assign to a scientific finding. If there are 10 papers saying X, but each one is highly dependent on the last, there is much less evidence for X than there appears to be, particularly to an outsider looking in.
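A rough way to put a number on this, as a sketch only: if you assume the 10 results are equally noisy and every pair shares the same correlation `rho` (a simplifying assumption of mine; real dependence between papers is messier), the standard formula for the variance of an average of equicorrelated measurements gives an "effective" number of independent studies.

```python
def effective_sample_size(n, rho):
    """Effective number of independent measurements.

    Assumes n measurements with equal variance and a common pairwise
    correlation rho in [0, 1]. The variance of their mean is
    sigma^2 * (1 + (n - 1) * rho) / n, which matches the variance of a
    mean of n / (1 + (n - 1) * rho) independent measurements.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be between 0 and 1")
    return n / (1.0 + (n - 1) * rho)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9, 1.0):
        print(rho, round(effective_sample_size(10, rho), 2))
```

With `rho = 0` the 10 papers count as 10. With `rho = 0.5` they are worth fewer than 2 independent papers, and with `rho = 1` they are worth exactly 1.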

chiefalchemist 2398 days ago

> You take it on trust...

Is it trust? Or is it closer to "I'll scratch your back today, if you scratch my back tomorrow"? That is, if you blow the whistle (so to speak) on someone, that's bad community karma, and it will come back to haunt you.

That's not trust. That's a cartel.

mattkrause 2398 days ago

It’s trust.

Specifically, you need to trust that when the authors claim to have reared mice in a high-oxygen environment, trained a monkey to move a joystick, or whatever the paper says, they did something like that. You can (and should) ask to see data demonstrating that they did it well, like oxygen levels in the mouse cage or trajectories produced by the monkey. However, unless those values are bizarre, it’s virtually impossible to know if they’re real or completely made up. Realistically, no one is going to fly you out so you can “audit” an experiment or record and review thousands of hours of surveillance footage.

chiefalchemist 2397 days ago

When the act is mutually beneficial, it's not trust. There is a clear incentive here for the reviewer to be less than thorough.

Put another way, the starting hypothesis should be: this study is flawed. The reviewer should then approach it as such.

Not only is it not trust, it's a violation of the scientific method. Yeah, sadly ironic (read: hypocritical).

mattkrause 2397 days ago

I’m not saying you shouldn’t be skeptical. The reviewers’ job is to ask whether the approach used and data collected make sense and if so, whether and how well they support the authors’ claims.

At the same time, peer review is meant to be an advisory process, not an adversarial one. You should certainly flag things that seem problematic, and ideally offer solutions, but you’re not expected to (and cannot, really) tear down the experiment and root out every possible mistake or malfeasance. All you’ve got is a day or so and a 6000-word description of the project, so if the manuscript claims that the mouse weighed 12.3 grams, you’ve basically got to take that number at face value. Trust might be too strong of a word (you could certainly question a weight of 123 grams, which is improbably large), but you should at least start in equipoise with respect to some stuff.

I also don’t see the mutual benefit, beyond nebulous things like enjoying consensus. A short review is also easier to write than a long one, but it’s just as easy to be dismissive as uncritical. Reviews are usually unsigned, and the authors and reviewers may not even know each other[0], so it’s hard for there to be an explicit quid pro quo.

[0] Of the stuff I reviewed in 2019, I knew exactly one of the authors, and here ‘knew’ means ‘Once amiably chatted with them in a coffee line at a conference’.

SlowRobotAhead 2397 days ago

Asking “does this make sense” when it’s a monkey trained to use a joystick is very different from trying to question and confirm “our model which is entirely based on these 40 other models which were all results of estimations of samples taken over the last 100 years then extrapolated out”. You can ask to see the monkey using the joystick.

Verifying the model would take years of work, so everyone just “trusts” it’s right, although we basically know it’s not, because it is far more likely to be wrong than right.

It doesn’t have to be a quid pro quo any more than it has to be “ugh, that sounds really boring to confirm, I’ll just believe it”.

cgiles 2397 days ago

When you review a paper, no one knows who you are except the editor of the journal. If there are any incentives at all for the reviewer, it is not to be arsed to review at all, since it takes time and you get nothing out of it.

There is some truth to the idea that some reviewers are lazy and don't bother to examine the paper as well as they should because that takes time. But when they do that, it irritates the editor, who is trying to make an informed decision on the paper, so they get bad karma for that.

If you want to be cynical and look for community politics, you should be directing your wrath at study sections / grant review panels. That is a totally different ball game and there is a fair amount of corruption there. Peer review may be imperfect and prone to some cognitive biases, but it is not generally corrupt.
