Hacker News
by ims 2109 days ago
There were some stunning claims being made on Twitter last month based on a recently published study. Instantly skeptical, I dug into the methodology section and found this gem:

"It should be noted that the results cannot be estimated using a physician fixed effect due to a numeric overflow problem in Stata 15 which cannot be overcome without changing the assumptions of the logit model."

... The sad part was they didn't even choose a reasonable model in the first place.

3 comments

(Ignore my previous reply; I found it myself.) To be fair to the authors, the logit model is not their primary specification; that was a linear probability model. The logit model is just a robustness check to make sure the linearity assumption isn't driving the results.
Yes, the primary specification was a linear probability model for the likelihood of a binary dependent variable conditioned on two binary input variables. As far as I could tell, the fit was maximum likelihood without regularization, and the paper's bombshell conclusion was based on the regression coefficients' p-values.

The Stata thing was just one of many, many red flags.
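
For anyone unfamiliar with the term, a linear probability model is just a linear regression with a 0/1 outcome. Here is a minimal sketch, assuming an ordinary least squares fit (the usual way to fit one, not necessarily what the paper did) and made-up toy data; the function name `fit_lpm` and the arrays are hypothetical.

```python
import numpy as np

def fit_lpm(x1, x2, y):
    """Fit y ~ 1 + x1 + x2 by ordinary least squares.

    Returns [intercept, coef_x1, coef_x2]. With a 0/1 outcome, the fitted
    values are read as probabilities, which is why nothing stops them from
    falling outside [0, 1].
    """
    X = np.column_stack([np.ones(len(y)), x1, x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Toy data, invented for illustration: two binary inputs, one binary outcome.
x1 = np.array([0, 0, 1, 1, 0, 0, 1, 1])
x2 = np.array([0, 1, 0, 1, 0, 1, 0, 1])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

coef = fit_lpm(x1, x2, y)
print(coef)
```

Each slope is read as the change in the probability of y = 1 when that input flips from 0 to 1.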

When your robustness check fails because of numerical overflow...
I mean, they couldn’t just do it in Matlab? Or Python? So incredibly lazy.
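
As a rough illustration of the Python route: one common source of overflow in a logit likelihood is evaluating exp(z) for a large linear predictor z, and that particular problem goes away if log(1 + exp(z)) is computed with `np.logaddexp`. This is a sketch under that assumption; nothing in the quote says this is what actually broke in Stata. The function name `logit_loglik` and the data are hypothetical.

```python
import numpy as np

def logit_loglik(beta, X, y):
    """Log-likelihood of a logit model, sum(y*z - log(1 + exp(z))), z = X @ beta.

    np.logaddexp(0, z) evaluates log(exp(0) + exp(z)) without forming exp(z)
    directly, so it stays finite even when z is in the hundreds or thousands.
    """
    z = X @ beta
    return np.sum(y * z - np.logaddexp(0.0, z))

# Invented extreme case: linear predictors of +1000 and -1000.
# A naive np.log(1 + np.exp(1000.0)) would overflow to inf here.
X = np.array([[1.0, 1000.0],
              [1.0, -1000.0]])
y = np.array([1.0, 0.0])
beta = np.array([0.0, 1.0])

ll = logit_loglik(beta, X, y)
print(ll)
```

The same function can be handed to a general-purpose optimizer; adding thousands of fixed-effect dummies is a separate scaling question that this sketch does not address.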
Python, R, Matlab etc. are outside of many people's skillsets. I've tried to evangelize to many fellow researchers, but they simply don't have the time or the interest in novel programming languages when the tools they have (largely) work.
Matlab is quite expensive.
Which paper was this?