by nmkridler
3741 days ago
I completely disagree. Data scientists who cannot create the data they need are at a significant disadvantage to those who can. Our job is more than analyzing and interpreting data. If someone in your organization spends no time thinking about how they get the data, you need to fire them or reduce their salary.
|
The data we use comes from relational databases and document stores operated by different departments, external APIs and third-party services, Salesforce, server log files, etc. A stats PhD does not have the training to gather this data themselves.
As for a hybrid scientist/engineer role, I don't know many software engineers who are also good at stochastic calculus or ensemble learning. Likewise, I don't know many data scientists who are also comfortable writing cron jobs to retrieve external API data or diagnosing server problems.