by peatmoss 2957 days ago
My goodness, are you me? I've been having exactly the same thoughts. Provided you find a data science role that roughly aligns with your training / interests, the actual work is comically easy compared to what you go through in academia.

A boot camp can easily teach someone to, say, estimate a linear model or run k-means. I dread the future when the industry decides the right way to put up barriers is by creating ever less-realistic interview loops that are even more coin-flippier, dice-rollier, card-shufflier.


A boot camp probably won't tell you the process to make sure a linear model is the right choice. It probably also won't tell you when you should and shouldn't use k-means. And in most cases you probably won't come away with very good answers about how uncertain your models are.
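To make the linear-model point concrete, here is a minimal sketch of one such check, assuming NumPy and made-up data: fit a straight line, then look at whether the residuals still carry curvature. The data, the helper names, and the choice of a correlation with x squared as the diagnostic are all my own assumptions for illustration. It is one check among many, not a complete model-selection process.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: the true relationship is quadratic, not linear.
x = np.linspace(-3, 3, 200)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(scale=1.0, size=x.size)


def fit_and_residuals(x, y, degree):
    """Least-squares polynomial fit of the given degree; returns residuals."""
    coeffs = np.polyfit(x, y, degree)
    return y - np.polyval(coeffs, x)


def curvature_left_in_residuals(x, residuals):
    """Correlation between the residuals and centered x squared.

    A value far from 0 suggests the model is missing curvature.
    """
    x_sq = (x - x.mean()) ** 2
    return float(np.corrcoef(x_sq, residuals)[0, 1])


linear_check = curvature_left_in_residuals(x, fit_and_residuals(x, y, 1))
quad_check = curvature_left_in_residuals(x, fit_and_residuals(x, y, 2))

print(f"straight line: residual curvature correlation = {linear_check:.3f}")
print(f"quadratic:     residual curvature correlation = {quad_check:.3f}")
```

The straight-line fit runs without complaint and returns coefficients either way. Only the residual check shows that it was the wrong model for this data.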

Imagine you're hiring someone to build a house for you. Would you feel comfortable with someone who has only been drilled on how to use individual tools? I would want someone who had been taught a step-by-step process for putting a house together.

> A boot camp probably won't tell you the process to make sure a linear model is the right choice.

Very true. On the other hand, it's a pleasant rarity when I see positions that appear to index more heavily on "how well is this person able to conceptualize the problem and choose an appropriate method?" than on "can this person do X?"

Lots of folks can do X; fewer can conceptualize a research question and choose the appropriate X; even fewer can carry out the X and communicate robustly what it means.

The latter two start to get into squishy territory, but they are also where the value is. They also seem to get the least focus in advertising / recruiting / interviewing data scientists.

It reminds me of studying evaluation methods in planning. One that people are really familiar with (at least anecdotally) is cost-benefit analysis. Conceptually, it's very simple. The problem is that the costs and benefits that are hardest to measure are very often NOT measured. And they're very often the sorts of things that people find the most important. So, you end up with an answer that encodes a ratio of easily measured things rather than important things.
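A toy sketch of that failure mode, with every figure and item name invented purely for illustration: anything nobody managed to put a number on simply drops out of the ratio.

```python
# Hypothetical project appraisal. None marks an item that was never measured.
benefits = {"travel time saved": 120.0, "neighborhood cohesion": None}
costs = {"construction": 80.0, "lost green space": None}


def benefit_cost_ratio(benefits, costs):
    """Ratio of measured benefits to measured costs, plus the omitted items."""
    measured_benefits = sum(v for v in benefits.values() if v is not None)
    measured_costs = sum(v for v in costs.values() if v is not None)
    omitted = [
        name
        for items in (benefits, costs)
        for name, value in items.items()
        if value is None
    ]
    return measured_benefits / measured_costs, omitted


ratio, omitted = benefit_cost_ratio(benefits, costs)
print(f"benefit-cost ratio: {ratio:.2f}")
print(f"left out of the ratio: {omitted}")
```

The ratio comes out as a clean number, and nothing in it signals that two of the four items were never counted.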

So too with data science. It's easier to check whether someone can remember basic probability rules and carry out a linear regression than to diagnose whether someone can reason carefully about an amorphous business problem.