by bearmf 5164 days ago
If it is that hard, the bar is probably set too high. Most of the skills are learned on the job, after all. Most smart PhDs who can program well and have sound knowledge of statistics can learn to do this stuff.

Given enough time, anyone smart enough to finish a PhD can acquire a set of skills. :)

But it's more than just solid statistics. We're talking about having enough mathematical fluency to develop models rigorously (not just "oh, we'll minimize MSE!!"), test those models, then implement those models--possibly using a distributed algorithm.

From what I hear, these skills take years to develop. Choosing to groom the wrong person is an extremely costly mistake, so making the choice is difficult.

All mathematics consists of rigorous models. But choosing and tweaking a model is more of an art. Most data scientists apply existing models to new data, they do not develop new ones.

I am sure it takes much less than "years" for any smart PhD in applied mathematics to learn most of data analysis tricks. It is not theoretical physics after all.

> Most data scientists apply existing models to new data, they do not develop new ones.

I meant "develop" in the software sense. Data scientists use off-the-shelf libraries during initial research, but those libraries usually lack an important feature preventing them from going into production (typically, no support for concurrency).

> I am sure it takes much less than "years" ... to learn most of data analysis tricks.

I used to be cynical about "data science," too. After four months of working on a data science team, though, I'm a believer.

A data scientist is really a "full-stack data developer." He or she needs the ability to work with advanced models, use them to analyze large amounts of data, and modify those models to work concurrently or in a distributed system if desired (and it's often desired). It's more than just "analysis tricks."