by tfehring 2484 days ago
I think some candidates would be open to performing a half-day exercise. But the best candidates wouldn't, which is what drives the anti-selection I mentioned in my previous comment. More broadly, I don't think it's realistic to create an assessment that's representative of real-world data science workflows without being onerous enough to exclude good candidates.

If representative isn't an option, highly correlated is the next best thing. In practice, for my team specifically, this means screening for math aptitude and general business acumen during a phone screen, data manipulation (moderately complex SQL + tidyverse/data.table/pandas) during a "take-home", and delving more into problem-solving approach, model selection and validation, etc. during an onsite. Broad business questions (e.g., "How does a life insurance company make money?") and communication skills generally weed out the candidates who picked up the bare minimum math and programming background through Kaggle + MOOCs.

As an aside, I absolutely think that the sort of assessment in the OP kills creativity. I care a lot about whether a candidate would think to include covariates like Internet usage and segmented urban population when predicting mortality rates; I don't care at all whether they're able to write the trivial amount of code that's needed to include those covariates in a model, given a data set that already contains them.
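
To make the "trivial amount of code" point concrete, here's a rough sketch in plain Python. Everything in it is an assumption for illustration: the rows are synthetic (generated from a made-up linear formula, not real mortality data), the column names `median_age`, `internet_usage` and `urban_share` are hypothetical, and `fit_ols` / `solve` are throwaway helpers standing in for whatever modeling library you'd actually use. The only line that changes between the two models is the feature list.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, target, features):
    """Ordinary least squares via the normal equations; returns {name: coefficient}."""
    X = [[1.0] + [row[f] for f in features] for row in rows]  # leading 1.0 = intercept
    y = [row[target] for row in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return dict(zip(["intercept"] + features, solve(xtx, xty)))


# Synthetic rows: (median_age, internet_usage, urban_share). Purely made up.
raw = [(30, 20, 40), (35, 60, 70), (40, 80, 55),
       (45, 50, 90), (38, 30, 25), (42, 90, 80)]
rows = [
    {
        "median_age": age,
        "internet_usage": net,
        "urban_share": urb,
        # Made-up, noise-free generating formula so the fit is easy to check.
        "mortality": 2.0 + 0.1 * age - 0.02 * net - 0.01 * urb,
    }
    for age, net, urb in raw
]

baseline_features = ["median_age"]
# The "hard" part is thinking of these covariates; adding them is one line.
extended_features = baseline_features + ["internet_usage", "urban_share"]

baseline = fit_ols(rows, "mortality", baseline_features)
extended = fit_ols(rows, "mortality", extended_features)
```

Because the synthetic target was built from all three columns, `extended` recovers the generating coefficients while `baseline` can't. That's obviously rigged, but it shows where the effort goes: knowing which covariates to ask for, not typing them into a list.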