by thisiswrongggg 1397 days ago
From the paper you link:

"This article summarizes the practical and theoretical implications of 85 years of research in personnel selection. On the basis of meta-analytic findings, this article presents the validity of 19 selection procedures for predicting job performance and training performance and the validity of paired combinations of general mental ability (GMA) and the 18 other selection procedures. Overall, the 3 combinations with the highest multivariate validity and utility for job performance were GMA plus a work sample test (mean validity of .63), GMA plus an integrity test (mean validity of .65), and GMA plus a structured interview (mean validity of .63)"

1. The research is dated (1998), long before many of the current best practices in software engineering were established.

2. It says it is based on 85 years of research. Obviously not IT-related, then.

3. Even if we get past that, it gives three almost equally good methods of hiring, and the highest one is GMA plus an integrity test, not GMA plus a work sample test.

4. Even if we get past all that, a work sample means a work sample. It does not have to be produced under pressure over a weekend as unpaid work, which, as any professional knows, is very hard to bring yourself to do well (being a professional means getting paid for my services, since I make my living selling them). It could just as well be some past work on GitHub, etc.

So, unless there is better, more IT-focused, more up-to-date research showing that take-home tests (which, by the way, can be offshored or gamed very easily) lead to better hires, I remain highly skeptical of all that and a big fan of whiteboard/pair programming.


Re: 3 - if you read the full paper, you'll see that on its own a work sample test has (barely) the highest predictive validity (though with lower confidence than GMA, which is more heavily studied). I think this quote does more to demonstrate that GMA and integrity are less correlated with each other than GMA is with work sample tests or structured interviews, which is intuitive.

Re: 4 - I'm not sure why you would consider a take-home test less valid as a work sample test and yet prefer a whiteboard test. Certainly the latter is less representative of real working conditions. (I probably would not want to work in a place where whiteboarding is more representative, at least!)

Your other two points are reasonable threats to validity, but I don't think they are especially strong ones. The research covers many different professions, so I think the onus would be on you to explain why software engineering is so different.
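To make the Re: 3 point a bit more concrete, here is a rough sketch using the standard formula for the multiple correlation of a criterion with two predictors. The function name and all input numbers are made up for illustration and are not taken from the paper; the only point is that a weaker predictor that overlaps less with GMA can add more than a stronger one that overlaps a lot.

```python
import math


def combined_validity(r1: float, r2: float, r12: float) -> float:
    """Multiple correlation R between a criterion and two predictors.

    r1, r2: each predictor's own correlation with the criterion
    r12:    correlation between the two predictors
    """
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return math.sqrt(r_squared)


# Hypothetical inputs, chosen only to illustrate the effect of overlap.
# Case A: second predictor is as strong as the first but overlaps heavily with it.
strong_but_overlapping = combined_validity(0.50, 0.50, 0.60)
# Case B: second predictor is weaker on its own but uncorrelated with the first.
weaker_but_independent = combined_validity(0.50, 0.40, 0.00)

print(round(strong_but_overlapping, 2))  # 0.56
print(round(weaker_but_independent, 2))  # 0.64
```

With these invented inputs, the weaker but independent second predictor yields the higher combined figure, which is the pattern the quoted abstract is consistent with.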

Yeah, I think #1 and #2 are valid criticisms. My issue with the parent comment's line of reasoning is that it doesn't apply the same level of rigor to evaluating whiteboarding as it does to work sample tests. Our goal should be to identify which evaluation approach is most predictive based on the evidence available.