by Trombone12 1095 days ago
From the second link:

> students approached this “Year in School” question in a number of different ways. For example, a junior might have written “junior”, or “2016” or “class of 2016” or “3” (to signify that they are in their third year). All of these responses are reasonable.

> A less reasonable response is “Harvard” [...] Nevertheless, the data file indicates that 20 students did so. Moreover, and adding to the peculiarity, those students’ responses are all within 35 rows (450 through 484) of each other in the posted dataset

In addition, most of these 20 very suspicious rows strongly supported the authors' hypothesis.

Likewise the first link shows that, in a spreadsheet containing outcomes sorted by treatment group, someone had manually moved rows from the span of rows containing one kind of treatment to a span containing outcomes from a different treatment. These provably manually reordered rows also contained most of the strong evidence for the predicted effect...
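
For anyone wondering how a check like that might look mechanically, here is a rough Python sketch. It is not the method used in the linked post. It assumes a made-up layout in which each row is a (condition, participant id) pair and the file is supposed to be sorted by condition and then by ascending id; the function name `order_breaks` and the sample rows are invented for illustration. It only reports where the order breaks, not which row was moved.

```python
from typing import List, Tuple


def order_breaks(rows: List[Tuple[str, int]]) -> List[int]:
    """Return positions i where row i is in the same condition as row i-1
    but has a smaller id, i.e. where the assumed sort order is broken."""
    breaks = []
    for i in range(1, len(rows)):
        prev_condition, prev_id = rows[i - 1]
        condition, row_id = rows[i]
        if condition == prev_condition and row_id < prev_id:
            breaks.append(i)
    return breaks


# Made-up rows in file order: (condition, participant id).
# Id 49 sits between 4 and 7, so the order breaks at position 3.
rows = [("A", 1), ("A", 4), ("A", 49), ("A", 7), ("A", 9), ("B", 2), ("B", 3)]
print(order_breaks(rows))
```

A break on its own proves nothing (files get re-sorted for innocent reasons), so this would only be a first pass before looking at which rows are involved and what they contribute to the result.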


It all suggests that, in addition to falsifying data, they had become so blasé about falsifying data that they weren't particularly careful about it. Which suggests that they may have been falsifying data for a long time...
Alternatively, they had so little practice that they weren't good at it.
Perhaps, but I think a first-timer would be able to avoid those mistakes, especially if they were thinking, "oh boy, this is a really bad and risky thing I am doing here, I better double and triple check". On the other hand, if it's a normal (for you) thing you're doing, which you have to do on most of your papers, then on some days you're going to rush it.
Normalized deviance, sure
From the first blog post linked above:

> We discovered evidence of fraud in papers spanning over a decade

> We believe that many more Gino-authored papers contain fake data. Perhaps dozens.

YOU SWITCHED THE SAMPLES.

Ladies and gentlemen, this is my friend, Doctor Kimble…

Not just most. All but one strongly supported the hypothesis.