by parpfish
406 days ago
I was heavily encouraged to do what would later be called “p-hacking”, but it looked different from what they describe here. This article describes p-hacks for people who aren’t into math/stats. I always ended up p-hacking because I was into stats methods. Somebody would say “here’s an old dataset that didn’t work out, I bet you can use one of those new stats methods you’re always reading about to find a cool effect!”, and then the fishing expedition takes off. A couple of weeks later you show off some cool effects that your new cutting-edge methods were able to extract from an old, useless dataset. But instead of saying “that’s good pilot data, let’s see if it holds up with a new experiment”, you’re told “you can publish that! Keep this up and maybe you’ll be lucky enough to get a job someday!”
|
Normally when doing that you need multiple-comparison corrections and conservative stats. That won't get you published though, or if you do get published you won't get noticed except by someone running a meta-analysis. Perhaps not even then. Usually you end up with negative results from reanalysis, evidence of tampering, or small effect sizes.
And this does not reliably detect dataset manipulation, p-hacking on the part of experimenters, or accidental violations of the protocol, not necessarily even if the data collection included measures to prevent them.
In short: you cannot 100% trust any dataset you did not make. Not even as part of the team that makes it.
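To make the multiple-comparison point concrete, here is a rough sketch of what a correction does to a fishing expedition. The p-values are made up for illustration (imagine ten different analyses tried on one old dataset), and the function names `bonferroni` and `holm` are my own, not from any particular library. It is plain Python, just the standard Bonferroni and Holm step-down rules.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject only if p <= alpha / (number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p to alpha / (m - k)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Hypothetical p-values from ten analyses of the same dataset (invented numbers).
p_values = [0.003, 0.012, 0.021, 0.04, 0.049, 0.11, 0.25, 0.38, 0.62, 0.91]

naive = [p <= 0.05 for p in p_values]
print(sum(naive), sum(bonferroni(p_values)), sum(holm(p_values)))
# prints: 5 1 1
```

With these invented numbers, five "effects" pass the uncorrected 0.05 threshold and only one survives either correction, which is roughly why the corrected version is harder to publish.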