| I'll start by saying that if you're going to roast a paper that an econ Nobel winner and one of the most famous and respected working statisticians put their names on, you probably want to turn down the volume and double-check your claims a little more before hitting "post." A z-score is not at all morally equivalent to a p-value. It's just a standardized measure. Converting measures to z-scores aids in interpretation. They can also aid estimation in some cases: using non-standard parameterization in Bayesian analysis is often crucial to get MCMC to accurately sample from the posterior distribution. Sure, you can take a z-score and look at the area under the curve and come up with a p-value. But you don't have to. In the referenced paper, they use z-scores to standardize the measures in the papers they draw from, so they're comparable. The author's other critiques of the paper seem reasonable. It's a problem with all meta-analyses: the amount of work it takes to correctly interpret published papers and then take those results and aggregate them is herculean. Doing it for over 20,000 is inevitably going to lead to some mistakes. That said, those mistakes may not be fatal to the analysis. Moreover, saying "no one knows what happened to those 11,285 studies" without checking in with the authors is completely unfair. The first author responded with the code showing exactly how they achieved that figure. Nothing mysterious. Andrew Gelman responded in the comments to him, as did the first author. I find their responses convincing. |
But that is the whole point! How they dropped the 11,285 studies was not even described in the original paper, and even in the comments the author doesn't explain "why", just "how". Hence, I think it's completely fair to call it "irreproducible".
The point of doing "reproducible science" is not that I write a paper, you email me asking how I came up with my number, and I email you back an explanation. No! The important details should be in the paper already. You may do some magic on your dataset, and that's fair _as long as you detail what magic you did and why_, and provided you can defend that practice in front of your peers. Otherwise what is the point of preaching "reproducible science" at all?