Hacker News
by mumblemumble 972 days ago
I would argue it's more a blind spot of big data, which tends to tacitly imply just doing correlational studies on data that happens to be lying around.

Most data scientists work for companies that don't really want to pay for controlled experiments, outside of maybe letting the UI team run A/B tests. Natural experiments can be hard to come by in a business setting. And the wild mathematical gyrations that econometricians and political scientists have developed to try to do causal inference from correlational data tend not to be as popular in business. Outside of some special domains such as politics and consumer finance, it can be rather difficult to get away with dressing your emperor in math that nobody can understand instead of actual clothing.

1 comment

Exactly. This is the primary difference between observational studies and experimental studies (controlled experiments). Experimental studies control for the hypothesized mechanism as part of the experimental design, but observational studies do not, and often cannot. Good data from controlled experiments is generally difficult, costly, and time-consuming to generate, and that often does not mesh with the notion of "big data". I think we are running into this problem more and more as we discover that our data sets really are superficial: collections of a lot of data that was easy to collect rather than a representative sample of everything, let alone one gathered in a controlled manner. Good data isn't cheap.
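
To make the observational vs. experimental distinction concrete, here is a toy simulation. It is only a sketch under made-up assumptions: the "engaged" confounder, the 0.8/0.2 exposure rates, and the outcome scale are all invented for illustration. The true effect of the treatment is zero by construction, so any gap between the groups is an artifact of how they were assigned.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0, randomized=False):
    """Return mean(outcome | treated) - mean(outcome | untreated).

    All numbers here are hypothetical. The treatment has NO causal
    effect on the outcome; only the hidden 'engaged' trait does.
    """
    rng = random.Random(seed)
    treated_outcomes, control_outcomes = [], []
    for _ in range(n):
        # Hidden confounder: half of the users are "engaged".
        engaged = rng.random() < 0.5
        if randomized:
            # Controlled experiment: a coin flip decides treatment,
            # so it is independent of the confounder.
            treated = rng.random() < 0.5
        else:
            # Observational data: engaged users are far more likely
            # to end up in the treated group on their own.
            treated = rng.random() < (0.8 if engaged else 0.2)
        # Outcome depends on the confounder plus noise, not on treatment.
        outcome = (10.0 if engaged else 0.0) + rng.gauss(0, 1)
        (treated_outcomes if treated else control_outcomes).append(outcome)
    return mean(treated_outcomes) - mean(control_outcomes)


observational_gap = simulate(randomized=False)
experimental_gap = simulate(randomized=True)
print(f"observational gap: {observational_gap:.2f}")
print(f"experimental gap:  {experimental_gap:.2f}")
```

With these assumed rates, the observational gap should land near 10 × (0.8 − 0.2) = 6 even though the treatment does nothing, while the randomized gap should hover around zero. Collecting more of the same observational data does not shrink the first number; only changing how the groups are assigned does.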