by kasey_junk 1686 days ago
I don't know if my current company does, but when I first implemented them for a company I worked for ~15 years ago we definitely did.

At that company (which was a ~200 engineer, privately held, software company) we found a few things:
- in person tests were less predictive than take home tests.
- tests that did not provide automated test cases as examples were less predictive than those that did.
- there was virtually no predictive power to 'secret test cases' that we ran without providing to the candidate.
- no other part of the interview pipeline was predictive at all. Not whiteboarding, not presenting, not personality interviews, not culture fit testing, not credentials, or where experience came from, nothing. That was across all interviewers and candidates.

A few caveats about this:
- this was before take home testing had become widespread and many companies screwed it up. At the time we were doing this it was seen as novel and interesting by candidates, not as just one more painful hoop they had to jump through.
- we never interviewed enough candidates to get true statistical relevance.
- false negatives were our biggest concern; they are extremely hard to measure (and potentially open you up to a lawsuit). The best we ended up doing was opening up our pipeline to become less selective to account for it. This did not seem to reduce employee quality.

In a more meta-sense, that experience led me to believe that strict hiring pipelines are largely not useful. Bad candidates still get through and good candidates don't. Also, many other things have a much bigger impact on productivity than whether a candidate was 'good'. It turns out humans do not produce at consistent levels all the time, and things outside of what you can interview for matter more (company process, employee health, life events, etc. all have way more impact on employee productivity than their 'score' at interview time).

6 comments

> no other part of the interview pipeline was predictive at all. Not whiteboarding, not presenting, not personality interviews, not culture fit testing, not credentials, or where experience came from, nothing

Did you test predictive power from individual interviewers? At a company I worked at previously we did, and this was by far the best overall predictor: some interviewers just did a much better job at identifying those likely to succeed than others. That could be another reason why you didn't see much predictive power if you looked at those other items across all interviewers - the variance between interviewers essentially "swamps" any smaller differences between those interview techniques.

Note this didn't surprise me that much, as you see this dynamic in lots of other "person-to-person" endeavors. For example, when looking at whether one type of psychotherapy intervention is better than another, most of the data that I've seen shows that by far the most important factor is the skill and "match" between therapist and client, far more important than any individual modality.

We did. There were small differences between them, same with what questions they asked. But nowhere near as predictive as the code work. I suggested removing interviews entirely, but was never able to get that approved.

Again, we didn’t have enough data points for real statistical validity so it could be that, but I became convinced that it didn’t matter who was interviewing or the format of the interview. Some candidates are good at interviewing and some aren’t, but that didn’t carry over to the job.

> there was virtually no predictive power to 'secret test cases' that we ran without providing to the candidate.

this brings back some unpleasant memories of a take-home i got from a FAANG.

basically i was given a loose spec to implement, with no real data or test cases (and was told that none would be provided when i asked). after submitting my work i received a terse rejection with 0 constructive feedback for my 6hrs of work. uncool.

In one interview, I was given a timed HackerRank problem with a screen share with 2 interviewers. The sample tests passed and the real tests passed except for 2 (from what I remember) that timed out on large data sets. Before the tests were run, I had already highlighted the part of the code that was the bottleneck and asked if I could copy the code to Visual Studio (the test was in C#), because the standard lib has a data structure for this use case that I hadn't used in a long time, and I couldn't get the code to compile on HackerRank. I wasn't allowed to use the IDE and I was also denied access to the standard lib documentation (in front of them through the screen share). I couldn't implement the data structure within the time limit. I failed the interview. I still wonder what the point of that test was.
I always feel like that type of coding interview is a sort of engineering hazing. I know I am often consulting documentation, especially when working in a new problem-space or less familiar programming language!

I always try to give candidates the benefit of the doubt with silly things like syntax or whatever since it's not like I'm interviewing for a live coding performer!

Same experience. This is why I will no longer do take home tests that take more than 90 minutes or look like they'll take more than 90 minutes (even if the company misjudges it).

The only exception I've made is if the company pays for the time.

Fwiw that job we had an explicit goal of 60 minutes or less and tested that against engineers we’d already hired.

I’ve heard guidance that said up to 4 hours was fine. That might have been true back then, before employers abused the system and made code tests just another hoop in addition to, not a replacement for, interviewing.

I've recently encountered similar assessments. I asked for feedback or the test cases but got none. What do you think the best option is to learn from the projects?
post it on stack overflow or reddit for feedback :D

typically they tell you not to post your solutions publicly, though i don't know why you'd be inclined to respect their wishes after such an experience unless you're dying to work there in the future; the main thing i learned from the project is that i didn't.

The problem with keeping these stats is that it only tracks engineers that were hired. I don't think coding interviews are a good predictor of performance, and that's not why I use them.

The point of a coding interview is to eliminate, as fast as possible, people who simply can't code. I'm being completely serious here. They can even have a CS degree (or will claim to but if you look closely they were in an easier program to get into and took CS electives) but cannot write a simple program on the board in an hour.

It's also why I don't like take-homes. First, it's trivial to cheat (I don't mean look up stuff online, just flat out have someone else do the work), and because of that the final stage would still have to be an in-person whiteboard (or pair programming over Slack, but that still has an engineer spend 40+ minutes with the candidate).

That was the purpose of the original fizzbuzz but for whatever reason it seems to have morphed into “spend all your spare time on leetcode so that you can answer whatever arbitrary problem is thrown at you” and they have the audacity to call that a meritocracy.
We used the same take-home for years, and eventually there were a few solutions online that were easy to find. But for some reason, they all sucked, so we never had to worry about unqualified candidates copying them.
The clear requirements of take home tests make them my favorite. They allow me to express how I work: get a list of reqs, walk away and think about them, make some decisions on directions, then let the code lead me.

I strike the style required. I capitalize on opportunities to make decisions I can discuss. "I used tape instead of jest because this example product will be distributed to many developers. The reduced API-surface area keeps us focused on the how's not the what's."

I tone that down if the role seems more rote-work like, at which point I try to highlight my ability to solve problems and learn quickly. For example, a comment above some network call: "// I was getting a cors error and found out I can run my own proxy for this"

Trouble is, unless more of the industry starts doing them so they're unavoidable, I'm going to skip companies that put these anywhere other than at the tail end of their process.

I'm not putting in half a day of work for zero pay to help you with your first-pass weed-out phase before we bother to make sure we align otherwise and this looks like a good fit. Thanks, bye, next (employer) candidate.

I like them because I have a far higher success rate with them. Take home tests cater to my strengths. Of course, I'm selective about the companies I apply for, and sometimes the test itself reveals something about the company, and I've rejected assignments after receiving them.
I agree and generally say I'm willing to do an alternative, live coding approach instead. I'm not putting in hours and hours on some random take home that may or may not be discussed down the line. Been there and done that so many times. Most of the time it didn't even align with the job.
sweet, don't let the door hit you. More opportunities for me.
I think we should use interviews for basic screening purposes only and skip determining "who is awesome from just signals". Instead, shift to a flexible hours paid trial period where potential colleagues get a better assessment. Measure by doing and interacting around doing, not by guessing, hazing, trivia, interrogating, or whiteboard hand-waving.
The issues with paid trials are twofold:

If you are talking about short term trials many devs are bound by anti-moonlighting employment agreements that either outright bar working for someone else or require notification.

For long term trials you severely limit your hiring pool because that is in effect temp-to-hire, which many devs simply will not do.

The first issue could be fixed legally. Just like California makes non-competes unenforceable, it could pass a law that says bans on short-term moonlighting can’t be in employment agreements. This way, you could take a week off from your current job, and actually work for a week for an employer that you are interested in. Fully paid.
This would mean you would have to work a ton extra, waste significantly more time than a day with the current interview approaches, or be "interview hopping" with no steady job for an extended period if nobody hired you. There could be gaps between "moonlight" sessions, which could mean you end up broke.
I’m not a legal expert in this but am fairly certain it already is illegal in California.

I was not working in an environment that could only hire California developers. If I were I might more seriously consider the option.

That said, I’m guessing you’d still get people who would balk at moonlighting even if it was allowed.

This would maybe work if every employer did it and it was easy to pick up a new trial quickly, but the reality is that the time from application to hire can be weeks if not months at most companies!

No way I could risk having to find another job if the trial went poorly.

Why would I go for a job with a trial period when there are plenty of others that give me certainty immediately?
Very interesting. Question:

How did you measure the candidate once hired?

What factors were indicative of a "good" hire vs. a "bad" hire?

We compared their performance review scores. I was always leery of this given how fraught performance reviews are, but that’s how the company judged employee ‘worth’ so it made sense to align there.
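For anyone wanting to try this kind of comparison, here is a minimal sketch of one way it could be done: correlate each interview signal with the later review score. This is not the analysis that company actually ran. The `pearson` helper and all of the numbers are made up for illustration, and with this few data points the result would not be statistically meaningful. It also only covers people who were hired.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per hired engineer (invented for this sketch).
take_home_scores = [62, 75, 80, 88, 91, 70]
whiteboard_scores = [3, 5, 2, 4, 3, 4]
review_scores = [2.5, 3.0, 3.5, 4.5, 4.0, 3.0]  # e.g. averaged 5-point scores

signals = {"take-home": take_home_scores, "whiteboard": whiteboard_scores}
for name, scores in signals.items():
    print(f"{name}: r = {pearson(scores, review_scores):.2f}")
```

A value near 1 or -1 would suggest the signal tracks review scores; a value near 0 would suggest it does not.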
And just to pull on this thread a bit more ... what factored into your employer's (or anyone reading my post's) rating of an engineer's performance?

Is it largely based on soft skills assessment? Or is code quality somehow being judged?

I honestly don't remember. It’s been a long time and I wasn’t privy to the assessments, just the scores (and they were anonymized before I saw them). I do remember them being several 5 point scores.

It wouldn’t surprise me if they were just manager assessments.