by gaius
3114 days ago
> I've been seeing nothing but negative, dismissive comments about data science on HN lately, which is really disappointing. There's definitely a lot of hype right now about DL, but almost all of my job does not deal with Big Data or Deep Learning, 'just' machine learning + stats + calc + scripting + data cleaning + deploying models.
But people did all those things in the '90s or even earlier. It was called "data warehousing" or "decision support" back then. The fundamental techniques - linear regression, logistic regression, k-means clustering - go back even earlier, to the OR community post-WW2. Banks have been doing credit scoring with these techniques for a loooong time. The manufacturing industry has been using these techniques for even longer. Engineering for even longer than that. So you can see why people are quite cynical about the way old, established techniques are being presented as the hot new thing - and you can see why people who have been doing this stuff for 20+ years might be annoyed at 20-somethings who claim to have invented this new thing. What's wrong with someone calling themselves a "statistician" or an "applied mathematician"? But this is by no means purely a DS thing; it seems no one is a programmer anymore either, they're all "senior certified enterprise solution architects" or some grandiose thing.
I would say data warehousing is more concerned with things like OLAP, star schemas, ETL, etc. than with what people are calling 'data science' right now. The same goes for 'decision support', since data warehousing grew out of decision support systems. The most overlap here is with 'data mining' algorithms like association rules and clustering.
> The fundamental techniques - linear regression, logistic regression, k-means clustering - go back even earlier, to the OR community post-WW2.
Here I think you've got a stronger argument. OR has a long, proud history of using applied math for business objectives. But again, I would say most of OR deals with different problems and different techniques - it's more about prescriptive analytics, constrained optimization, linear programming, simulations, etc. than the type of predictive modeling in most data science.
I see data science as a separate field even though it's stitched together from a bunch of others. It's certainly not entirely new, and certainly overhyped in some annoyingly-breathless news reports. I could say the same thing about CS - was it entirely "new" when it started as a discipline? Isn't CS "just" applied math?