>If data science just becomes a code word for brogramming your way through a set of black-box ML algorithms, then I will welcome the inevitable crash of data science. A fundamental challenge I see here is how bottom-heavy data science feels now. There are tons of people out there trying to "get into data science" from other fields, but the number of people with substantive domain knowledge, strong programming skills, and the math background to be able to understand the ML black boxes is quite small relative to the number of people calling themselves data scientists. In other words, real insight definitely is (or should be) the goal, but real insight is really hard, and scikit-learn is so easy. My hope is that this improves over the next 5-10 years - the more mature data science becomes as a discipline/career, the better the education will be and the more experienced people there will be. There is a risk in the meantime, though, that a flood of relatively inexperienced people causes a collapse in expectations for data science, making businesses less eager to hire them in the future.

Furthermore, a number of practitioners expect their data to be ready for them in some perfect state. Probably the majority of the task is creating a pipeline for acquiring data and, if necessary, labeling it appropriately, which may require developing some ontology or classification with guidelines rigid enough that someone in India can delegate the task to a large team. Then the practitioner spends an inordinate amount of time optimizing some heuristic whose meaning drifts over time, or which is completely inconsistent with the goals of the product. These are both problems outside the realm of domain knowledge or experience.