by kristjansson 2258 days ago
Teams being small, data being crummy, infra being hard, and yet expectations being high aren't so much complaints as they are the job description.

The point of data scientists and the related roles listed in the article is not just to churn out the fun stuff, but to wade through the institutional and technical muck and mire it takes to bring the fun stuff to bear on a relevant business problem, and to communicate the results in a way that people of all walks can understand.

2 comments

Yeah this guy seems to think Data Science work should be like doing a problem set for CS class. Sorry that you have to deal with messy data, fragile infra, and limited resources - I know it's not "fun", but frankly that's what the money is for.
That's the whole point of the article. Expectation (in this case, his own coming out of the bootcamp) vs. reality (what data science is actually like).
The author wants to be an MLE but doesn't know it.
As somebody in an ML Engineering role, i.e. somebody who could be asked to either fix the logging infrastructure or build some models, I would have agreed with this.

But even in this day and age with ML being the new hotness, you will find people who are quite happy to work on infrastructure and don't have a huge amount of interest in training models themselves, and it is probably a lot easier to hire them than people who can do both, and you may get better results from actual specialists.

I wrestle with this too; which skillset is better depends a lot on context.

I suspect, if there are lots of relatively simple ML problems, then a generalist with integration chops will be more effective in getting them out quickly and "good enough". The specialist may take too long on models that are too heavy and impractical.

If there's one big ML problem (Google search, Netflix recommender, Amazon search, etc.), where an additional 1% makes a difference, then yes, a specialist DS/modeler is probably preferred.

Larger, older org/heavier existing infra/more specialized culture will also tilt the scale towards specialists.

It's obviously a spectrum, but I feel like any org that is considering hiring a data scientist probably needs a data engineering team to begin with, since you can do a lot of the analysis people want by just counting.

I also think it's unfair to specialists to say they will always overcomplicate things more than others. I've seen plenty of generalists with researcher envy do the same thing.