by fonnesbeck 3533 days ago
The statement "More wisdom potentially gets extracted when you apply Statistics to more (and better) data, but the analysis itself doesn’t improve with better data." simply isn't true. A hierarchical model, for example, is increasingly able to model subgroups and additional levels of hierarchy as more data are added. Penalized regression (or Bayesian regression) is another example -- the model is structurally different as you change the quantity of data.
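To make the hierarchical-model point concrete, here is a minimal sketch, assuming the simplest normal-normal case with known variances. The function names and the values of `sigma2` (within-group variance) and `tau2` (between-group variance) are made up for illustration and do not come from any particular library. The weight the model puts on a subgroup's own mean grows with that subgroup's sample size, so the fitted estimate changes in kind, not just in precision, as data arrive.

```python
def shrinkage_weight(n, sigma2=1.0, tau2=0.25):
    """Weight on a group's own sample mean in a normal-normal
    hierarchical model with known variances (illustrative values)."""
    return n / (n + sigma2 / tau2)


def partial_pool(group_mean, grand_mean, n, sigma2=1.0, tau2=0.25):
    """Posterior mean for one group: a blend of its own mean and the
    grand mean, weighted by how much data the group has."""
    w = shrinkage_weight(n, sigma2, tau2)
    return w * group_mean + (1 - w) * grand_mean


if __name__ == "__main__":
    # Hypothetical subgroup with sample mean 2.0, grand mean 0.0.
    for n in (1, 10, 100, 1000):
        print(n, round(shrinkage_weight(n), 3), round(partial_pool(2.0, 0.0, n), 3))
```

With very few observations the subgroup estimate is pulled almost entirely toward the grand mean; with many it is essentially the subgroup's own mean, which is one sense in which the analysis itself shifts with more data.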

The difference between ML and statistics is entirely semantic. Is logistic regression a ML method or a statistical method? It is both!

1 comments

Machine Learning is generically defined as a method of data analysis that automates analytical model building.

That's the part that's going to have the impact---automated improvement of the way Statistics is applied to data analysis.

Is that not major enough to be considered and discussed separately?

I think the kicker is that most machine learning iterates on a single model, typically one already known from statistics. The weak-learner track of combining models almost goes against this, but I think even then it is usually the same shape of model.

So, I actually agree to an extent, much as computers can be seen as the "next logic". Only, it is such a "builds on" relationship that I think calling it "next" is dubious.