by elliott34
3790 days ago
I think the best part of this article is the approach to learning new things he presents--going beneath the surface, learning the fundamentals below the “API” layer (abstractly speaking). Really great explanation of that insight!

The part I am having trouble with, with ~5 years of real world data science experience in industry, is the implicit assumption that we all need or want to become good at deep learning. In my experience, most businesses, at best, are still struggling with overfitting logistic regression in Excel, let alone implementing/integrating it with a production code base. And we all know that toy ML models that sit on laptops create ZERO value beyond fodder for Board presentations or moving the CMO's agenda forward.

The fact of the matter is that the vast majority of businesses, with respect to statistics/ML, aren’t even doing super duper basic shit (like a random forest microservice that scores some sort of transaction) that might increase some metric 10%. This is due to lack of sophisticated analytics infrastructure, bureaucracy, lack of talent, or being too scared of statistics.

Ultimately, when you’re rolling out a machine learning product internally (I’m not talking about Azure or other AWS-model-training-as-a-service type things), the hardest part isn’t: “We need to increase our accuracy by 2% by using Restricted Convolutionally Recurrent Bayesian Machines!” The hardest part is convincing people you need to integrate a new process into a “production” workflow, and then maintaining that process.
|
Completely agreed. My timeline for a project usually goes like:
A. 2-4 weeks: deeply understand the problem, talk to stakeholders, gather requirements, plan out the project.
B. 2 weeks: explore the problem and the data. Build and tweak models, build a functional prototype.
C. 8-24 weeks: put the system into production on top of the company's tech stack, either myself or working closely with engineers.
D. 4-12 weeks: sell the system internally, prove that it's a superior solution, get buy-in that it should replace existing processes.
So yeah, in a typical 6-month project I only spend about 5-10% of my time on actual data and modeling. That percentage has gone down sharply as my career has progressed.
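For anyone who wants to check the arithmetic, here is a rough sketch of how the phase lengths above translate into a modeling share. It assumes that phase B is the only time counted as "actual data and modeling" and that a 6-month project is about 26 weeks; the phase labels and the `modeling_share` helper are made up for illustration.

```python
# Phase durations in weeks as (low, high), copied from the timeline above.
# The dictionary keys are illustrative labels, not terms from the comment.
phases = {
    "A_understand": (2, 4),
    "B_explore_and_model": (2, 2),
    "C_productionize": (8, 24),
    "D_sell_internally": (4, 12),
}

# Assumption: only phase B counts as data and modeling time.
MODELING_WEEKS = phases["B_explore_and_model"][0]


def modeling_share(total_weeks, modeling_weeks=MODELING_WEEKS):
    """Fraction of a project of total_weeks spent on data and modeling."""
    return modeling_weeks / total_weeks


# Shortest and longest projects implied by the per-phase ranges.
shortest = sum(low for low, _ in phases.values())
longest = sum(high for _, high in phases.values())

share_short = modeling_share(shortest)
share_long = modeling_share(longest)

# Assumption: "6 months" is roughly 26 weeks.
share_six_months = modeling_share(26)

print(f"{shortest}-week project: {share_short:.1%} modeling")
print(f"{longest}-week project: {share_long:.1%} modeling")
print(f"26-week project: {share_six_months:.1%} modeling")
```

Under those assumptions, the 26-week case lands inside the quoted 5-10% range, while the shortest and longest projects implied by the ranges fall slightly above and slightly below it.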