by stevehiehn
3472 days ago
After a year of experiments I realized that machine learning and big data pipelines are inseparable. At first I thought R/Python was the greatest, and it might be, until you need to do more than a few isolated models. At that point I reverted to building the pipeline parts with Spring Java + in-memory data grids, because there are so many options.
I believe the majority of data science jobs today involve doing only the first (to gather one-off insights) and dropping the ball on the second, since it involves a lot more software engineering, and those jobs are currently being filled by people without this skill.
I foresee this being a source of frustration in the coming years for companies that fell for the data science hype, once they figure out that it takes significant investment and commitment to build intelligence into their systems, or even to curate high-quality data to do it right in the first place.