by FabHK 1132 days ago

No. They use 400 known fakes and 400 matched (presumed) non-fakes to estimate the sensitivity and specificity of their indicator, then apply that indicator to the full universe of papers, then use the estimated sensitivity and specificity to correct the raw measurement and estimate the approximate actual rate of fake papers.

If you know the true prevalence of a disease in a population, and the sensitivity and specificity of your test, you can predict how many positive measurements you obtain. Vice versa, from the (flawed raw) measurement, given sensitivity and specificity, you can estimate the true prevalence.
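
Both directions can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the function names and the example numbers are made up, and the inversion is the standard Rogan-Gladen correction.

```python
def expected_positive_rate(prevalence, sensitivity, specificity):
    """Share of positive test results expected at a given true prevalence."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives + false_positives


def estimate_prevalence(positive_rate, sensitivity, specificity):
    """Invert the relation above (Rogan-Gladen correction), clipped to [0, 1]."""
    informativeness = sensitivity + specificity - 1
    if informativeness <= 0:
        raise ValueError("test is no better than chance; prevalence is not identifiable")
    raw = (positive_rate + specificity - 1) / informativeness
    return min(1.0, max(0.0, raw))


# Hypothetical numbers, chosen only for illustration.
rate = expected_positive_rate(prevalence=0.10, sensitivity=0.90, specificity=0.80)
recovered = estimate_prevalence(rate, sensitivity=0.90, specificity=0.80)
print(f"raw positive rate: {rate:.2f}, corrected prevalence: {recovered:.2f}")
```

With these made-up inputs, the raw positive rate (27%) is far above the true prevalence (10%), which is why reporting the raw rate alone is misleading.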

Furthermore, they’re explicitly saying that “red flagging” by their simple indicator doesn’t mean that the paper is fake, but that it merits higher scrutiny.

ETA: I mean, it could still all be bullshit (by virtue of some bias or so), but you’ll need to argue a bit harder to establish that.

ETA2: Actually, not sure that’s what they’ve done. They might have just reported the raw (very bad) measurement (that they call “potential red flagged fake paper”), without doing the obvious next step outlined above, and without applying any confidence intervals. So, it might actually be a pretty crap paper (though possibly technically correct) coupled with some mediocre reporting layered on top. Isn’t basic statistics taught anymore?

5 comments

steppi 1132 days ago

I've worked on research estimating prevalence from imperfect tests, and something that concerns me about this study is that they aren't showing the error bars for their estimates. Typically, you would report a confidence interval for prevalence rather than just a point estimate, and the confidence intervals can often be fairly wide. There are two sources of uncertainty here: the assumed probabilistic nature of the diagnostic test, and uncertainty in our estimates of the sensitivity and specificity.
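
To give a rough feel for the second source of uncertainty, here is a minimal sketch of a Wilson score interval for a sensitivity estimated from 400 known positives. The count of 340 detected is hypothetical, not taken from the study, and the function name is my own.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    ) / denom
    return center - half_width, center + half_width


# Hypothetical: the indicator flags 340 of 400 known fakes.
low, high = wilson_interval(340, 400)
print(f"sensitivity estimate 0.85, interval ({low:.3f}, {high:.3f})")
```

Even with 400 samples the interval spans several percentage points, and that uncertainty carries over into any prevalence estimate built on top of it.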

I think this paper by Peter J Diggle [0] gives a solid methodology. Instead of treating sensitivity and specificity as fixed values using sample estimates, you can model them as each having a beta distribution. In this case these beta distributions can be found using a Bayesian treatment of Bernoulli trials.

[0] https://www.hindawi.com/journals/eri/2011/608719/
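
To illustrate the idea of beta-distributed sensitivity and specificity, here is a simplified Monte Carlo sketch. It is not Diggle's actual method, just a crude way to propagate the uncertainty: it assumes uniform Beta(1, 1) priors, and every count in the example is hypothetical.

```python
import random


def prevalence_interval(tp, fn, tn, fp, flagged, total, draws=20000, seed=0):
    """Approximate 95% interval for prevalence by sampling beta posteriors.

    tp/fn: known positives flagged / missed; tn/fp: known negatives passed / flagged.
    flagged/total: raw positives in the population sample.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        # Posterior of a Bernoulli rate under a uniform prior is Beta(1 + hits, 1 + misses).
        sensitivity = rng.betavariate(1 + tp, 1 + fn)
        specificity = rng.betavariate(1 + tn, 1 + fp)
        positive_rate = rng.betavariate(1 + flagged, 1 + total - flagged)
        informativeness = sensitivity + specificity - 1
        if informativeness <= 0:
            continue  # skip draws where the test is no better than chance
        raw = (positive_rate + specificity - 1) / informativeness
        samples.append(min(1.0, max(0.0, raw)))
    if not samples:
        raise ValueError("no usable draws; the test looks uninformative")
    samples.sort()
    n = len(samples)
    return samples[int(0.025 * n)], samples[n // 2], samples[int(0.975 * n)]


# Hypothetical counts: 340/400 fakes flagged, 40/400 non-fakes flagged,
# and 2000 of 10000 papers flagged overall.
low, median, high = prevalence_interval(340, 60, 360, 40, 2000, 10000)
print(f"prevalence: median {median:.3f}, 95% interval ({low:.3f}, {high:.3f})")
```

The point of the sketch is only that the output is an interval rather than a single number; a proper analysis would follow the paper's likelihood-based treatment.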

steppi 1132 days ago

Amazing. Reading more carefully, as FabHK pointed out above, they aren't even applying the obvious correction. They're just reporting the positive rate of the imperfect test. I've implemented Diggle's method [0]. When I have time, I'll see if they've provided enough data to do a proper analysis, and maybe write a blog post about it or something.

[0] https://github.com/indralab/opaque/blob/761572ed1b0d601271f0...

robocat 1132 days ago

> they aren't showing the error bars

Perhaps any paper without error bars should be tagged as a fake paper.

This one would have sneaked past though: https://retractionwatch.com/2022/12/05/a-paper-used-capital-...

newswasboring 1132 days ago

> Furthermore, they’re explicitly saying that “red flagging” by their simple indicator doesn’t mean that the paper is fake, but that it merits higher scrutiny.

Then they and Science should change their sensationalist headline. It's ironic that a paper about fakeness uses a borderline misleading title.

danhau 1132 days ago

You’re not wrong, but it is everyone’s own responsibility to read the article and not just the headline.

newswasboring 1132 days ago

So it's ok to lie in a portion of your work? Where do you draw the line? I draw it when someone starts communicating. Being wrong is ok, being deceitful isn't.

caddemon 1132 days ago

Is this headline really deceitful though? Certainly the research is flawed, but the statement "[bad thing] is alarmingly common" is basically just a subjective statement that lets you know what position the author is going to argue.

newswasboring 1131 days ago

I will never understand why everyone bends over backwards to justify lazy af journalism. This is a magazine which is supposed to do scientific journalism, yet it didn't even mention the points that readers in HN comments were able to figure out on a cursory look. Peer review isn't just the 3 reviewers who accept or reject something in a journal. It's everyone in the scientific community.

ouid 1132 days ago

Responsibility is not conserved in a robust system. What you say is true, and it is also the journal's responsibility not to mislead.

Retric 1132 days ago

Expecting people to read every single article posted to HN is unrealistic.

Simply reading a title on a topic you don’t find interesting then gives people the wrong impression.

Retric 1132 days ago

You can’t directly calculate both sensitivity and specificity using equally sized positive and negative groups unless the actual population has that ratio.

A completely random test given equal populations results in 50% accuracy and 50% specificity. Things don’t look nearly as good if only 1% of the actual population has the condition.
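
A small sketch of how the picture changes with prevalence, assuming a coin-flip test (50% sensitivity, 50% specificity). The function names are illustrative only. What shifts between the two scenarios is the positive predictive value, i.e. the share of flagged items that really have the condition.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of positive results that are true positives."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


def accuracy(prevalence, sensitivity, specificity):
    """Fraction of all results that are correct."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


# Coin-flip test applied to a balanced sample and to a 1% prevalence population.
for prevalence in (0.50, 0.01):
    ppv = positive_predictive_value(prevalence, 0.5, 0.5)
    acc = accuracy(prevalence, 0.5, 0.5)
    print(f"prevalence {prevalence:.2f}: accuracy {acc:.2f}, PPV {ppv:.2f}")
```

For a coin-flip test the PPV simply equals the prevalence, so at 1% prevalence only about 1 in 100 flagged items is a real positive.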

tgv 1132 days ago

Their baseline had better be representative.

marcosdumay 1132 days ago

So, in other words, the signal they get from it is around 70% of the noise, but it's ok because you can indeed do that with good enough statistics?

They'd better have a flawless methodology, because any tiny problem is enough to ruin their analysis. And well, just flagging almost any paper not from the EU or US as fraud doesn't usually go together with a flawless methodology.
