by mrwebmaster
2640 days ago
There is some discussion about whether data scientists are going to be replaced by automated tools in the near future. Can this be considered an example of a tool that partially replaces the work done by a data scientist? At the very least, it can save a lot of time.
|
Creating simple predictive models where your problem is already easily narrowed down to a "given x predict y" definition is pretty trivial. Having it automated is nice, but not exactly a hard thing to do.
Genuine question: how many people have jobs where those kinds of problems form any significant part of their workload?
I also often see a response to this sentiment along the lines of, "Yeah, but there's also data cleaning..." etc. My reaction to this is mixed. I mean, sure, there is also data cleaning involved, but is this really where people spend most of their time?
My team spends most of our time doing the following:
1. Formulating problems. Figuring out the different ways a real-world problem can be expressed mathematically and feasibly attacked computationally.
2. Engineering software to implement the solutions to these problems, sometimes using some of the (amazing) frameworks out there for ML or probabilistic programming, but often having to develop our own approaches from scratch.
3. Doing all the management, stakeholder relationship stuff, business cases, etc. that make your work relevant and possible.
4. Getting data. Always an issue.
I'm genuinely curious here: are we total snowflakes, and do most data scientists really spend their time cleaning data and building "given x predict y" models?
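For a sense of what I mean by "given x predict y" being trivial once the problem is already framed, here is a minimal sketch in plain Python. The data is made up and the names `fit_line` and `predict` are just for illustration; it is a one-feature least-squares fit, not what any particular automated tool does under the hood.

```python
# Toy "given x, predict y" model: ordinary least squares on a single feature.
# The data and the function names are hypothetical, for illustration only.

def fit_line(xs, ys):
    """Return (slope, intercept) that minimise squared error for y ~ slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and the x-y co-deviation sum.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x."""
    slope, intercept = model
    return slope * x + intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # made-up points lying on y = 2x + 1
model = fit_line(xs, ys)
print(predict(model, 5.0))
```

Everything on my list above (deciding what x and y should even be, building the surrounding software, making the business case, getting the data) happens before or around those few lines.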