Ask HN: Can you use AI/ML to match software engineering resumes with jobs?
14 points by pradeep_m 3336 days ago
There are quite a few startups that purport to match candidates with jobs based on their resumes (LinkedIn, Hired, etc.). None of them seem to work very well.

The current state of the hiring process is to have recruiters curate the resumes submitted for a job and then have the candidates go through an interview process (typically 1-2 phone screens + 4-5 onsite interviews). Most companies in the Bay Area have about a 5% conversion rate (phone screen to offer).

Do you believe there is some latent information at the top of the funnel that can be added to make hiring more efficient? Can we increase the conversion rate beyond 5%?

Can the resume screening process by a recruiter be automated? If not, why not? It can be argued that we can build algorithms that remove bias and have a better conversion rate than 5%.
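To put the 5% figure in concrete terms, here is a rough back-of-the-envelope sketch. The 5% rate is from the post; the 10% comparison and the function name `screens_per_offer` are hypothetical, chosen only for illustration.

```python
def screens_per_offer(conversion_rate: float) -> float:
    """Expected number of phone screens needed to produce one offer."""
    if not 0 < conversion_rate <= 1:
        raise ValueError("conversion_rate must be in (0, 1]")
    return 1 / conversion_rate


# 5% is the phone-screen-to-offer rate quoted in the post.
current = screens_per_offer(0.05)
# 10% is an assumed target, not a number from the post.
improved = screens_per_offer(0.10)

print(f"screens per offer at 5%: {current:.0f}")
print(f"screens per offer at 10%: {improved:.0f}")
```

Under these assumptions, doubling the conversion rate would halve the number of phone screens spent per offer, which is the kind of efficiency gain the question is asking about.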

7 comments

This is just the tip of the iceberg, my friend :) The small conversion rate is mostly due to a broken interview process, not a wrong matching mechanism. Nowadays you'll often end up with this kind of email after spending weeks interviewing for a job:

"Unfortunately, we think you're not a perfect match for this position, blabla..."

You are a perfect match because you're applying for something you do every day at a different company. You might also have an impressive GitHub page or even products of your own out in the wild. The problem: you couldn't crack a stupid leetcode-like problem on a whiteboard in 30 minutes. So they think you're not qualified for a frontend, mobile, or backend job. Again, you have years of experience in the domain and a huge track record.

One single algorithm problem on a whiteboard is all they need to make a decision nowadays. The evaluation process is completely broken. Does it mean we all know how to build products and the goal is to actually separate the best from the very best? Well, if I'm fresh out of college, I should crack your leetcode problem in a minute because that's exactly what I've done for the last few months getting ready for my final CS exam. If I'm a lead engineer who deals with complex architecture and people problems every day at work while building products for the world, I am miles away from school. Reading the job description, they're looking for a lead web engineer. Going through their interview process, 90% of the evaluation revolves around writing pseudocode on a whiteboard for hours, talking about trees and linked lists. Do you see the problem? They are looking for an experienced engineer with a huge track record, but they use college material to evaluate their candidates. If you're fresh out of school you could crack the interview even though you're not qualified for the job, and vice versa. That is the broken piece.

I think companies are moving towards eliminating the resume. All they need is to find someone who can crack a stupid question, which is typically what Google did when they first started hiring a lot of people in the early 2000s. Today, everyone is going back to the same process of hiring "generalists", whatever the heck that means. We need to stop that to improve the conversion rate.

I think this is where the perceived "ageism" comes in...

Matching based on keywords is already common.
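For readers unfamiliar with what keyword matching looks like in practice, here is a minimal sketch. It is not how any particular product works; the function names, the tokenizer, and the sample resume are all assumptions made up for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word-like tokens."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def keyword_overlap(resume: str, job_keywords: set[str]) -> float:
    """Fraction of the job's keywords that appear in the resume text."""
    if not job_keywords:
        return 0.0
    wanted = {k.lower() for k in job_keywords}
    return len(wanted & tokenize(resume)) / len(wanted)


# Hypothetical resume snippet and job keywords.
resume = "Built React frontends and Python services; some Go."
score = keyword_overlap(resume, {"python", "react", "kubernetes"})
print(f"keyword overlap: {score:.2f}")
```

A score like this only says which buzzwords are present. It says nothing about how well the person used them, which is the limitation the rest of this comment describes.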

Success in a job is determined far more by interpersonal dynamics and history than a list of buzzwords or "skill sets" that match. Hard to express those things in a resume or CV, harder to detect and measure them because they are human qualities. That's what's supposed to happen in interviews and internships.

There are a few problems with resumes:

- Job candidates are often not good at building resumes that market themselves to a specific role.

- Job candidates have an incentive to write the boldest resume, while still being technically honest.

- Resumes are typically 1 or 2 pages and are missing a lot of data about work histories that span years to decades.

On the other hand, the academic world has a similar, but much more interesting dataset:

- There is a very narrow culture that defines what a good CV is.

- CVs are filled with verifiable accomplishments like publications. The existence of a publication can be verified with a Google search, and its usefulness can be approximated by its number of citations relative to similar papers.

- CVs are much longer than an industry resume, giving more data for an algorithm to parse.

I think there is potential for academic jobs to programmatically find candidates based on analyzing CVs and publications for qualified job candidates... but the academic world is also small enough that automation may not be necessary.
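As a rough illustration of the "citations relative to similar papers" idea from the list above, one could divide a paper's citation count by the median count of comparable papers. This is only a sketch: the function name, the choice of median as the baseline, and the sample numbers are assumptions, not an established metric.

```python
from statistics import median


def relative_citations(citations: int, similar_paper_citations: list[int]) -> float:
    """Citation count divided by the median count of comparable papers."""
    if not similar_paper_citations:
        raise ValueError("need at least one comparable paper")
    # Floor the baseline at 1 so a field with mostly uncited papers
    # does not cause a division by zero.
    baseline = max(median(similar_paper_citations), 1)
    return citations / baseline


# Hypothetical numbers: a paper with 30 citations in a niche where
# comparable papers have 5, 10, and 20.
ratio = relative_citations(30, [5, 10, 20])
print(f"relative citations: {ratio:.1f}")
```

How to pick the set of "similar" papers (same venue, same year, same subfield) is the hard part and is left open here.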

For industry jobs, I think programmatically-administered work-sample tests are the future, even if candidates hate them. Senior candidates have the bargaining power to avoid work-sample tests, but everybody starts off as a junior candidate. Everybody may end up taking work-sample tests for their first job in industry, much like (nearly) everybody takes a standardized test for undergraduate college admissions.

I think from the ML point of view it would be hard because of the amount of data you need to train an algorithm. ML algorithms in general need a large amount of data to produce good predictions. Even if you have, say, a thousand cases where you know whether the person got the offer or not, it still wouldn't be enough to train an algorithm.
Couldn't we train it on the resumes themselves? With every resume, you know each job the person held and the work history they had up to that point. Granted, it would be muddled with noisy data (like skills they list now but didn't have in the past). It wouldn't be very hard to collect enough resumes as training data.
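One way to read this suggestion: each job on a resume becomes a training example whose input is the history before it. A minimal sketch, assuming the resume has already been parsed into a chronological list of jobs (the function name and the sample jobs are hypothetical):

```python
def next_job_examples(job_history: list[str]) -> list[tuple[tuple[str, ...], str]]:
    """Turn one resume's chronological job list into (history, next_job) pairs."""
    return [
        (tuple(job_history[:i]), job_history[i])
        for i in range(1, len(job_history))
    ]


# Hypothetical parsed resume, oldest job first.
examples = next_job_examples(
    ["Intern at A", "Engineer at B", "Senior Engineer at C"]
)
for history, next_job in examples:
    print(history, "->", next_job)
```

A resume with n jobs yields n - 1 examples, so the dataset grows faster than the number of resumes collected. Note that the label here is "was hired", not "was a good hire", which is a different thing from the offer outcome discussed above.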
The conversion rate is so low probably because a resume contains only about 5% of the information needed to determine whether a potential candidate is a good fit for the job (not to mention the information the candidate needs to determine whether a job is a good fit for them).

Screening resumes is the easy and quick part of the hiring process. By hand I can go through a week's worth of applicants in an hour. Sure, automation would save me a little time if I trusted it to do as well as I do. But that hour to go through a hundred resumes pales in comparison to the fifty hours involved in getting a dozen resume-screened candidates through our interview process.
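Putting the numbers from this comment side by side (one hour of screening versus fifty hours of interviewing) gives a rough upper bound on what automating the screen could save. The function name is hypothetical; the hours are the ones quoted above.

```python
def screening_share(screening_hours: float, interview_hours: float) -> float:
    """Fraction of total hiring time spent on resume screening."""
    total = screening_hours + interview_hours
    if total <= 0:
        raise ValueError("total hours must be positive")
    return screening_hours / total


# 1 hour for ~100 resumes and 50 hours for a dozen candidates,
# both taken from the comment.
share = screening_share(1, 50)
print(f"screening share of total time: {share:.1%}")
```

Even a perfect automated screen would recover only about 2% of the total time under these figures.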

It already is largely automated. The vast majority of resumes received by large companies will never be read by a human.

https://www.themuse.com/advice/beat-the-robots-how-to-get-yo...

The conversion rate will tend to be low in any process that is designed to protect against false positives. Protecting against false positives is generally the preferred approach for early-stage and late-stage companies (e.g. small startups and Google/Facebook, etc.). A company that is staffing up fast can often accept more churn in exchange for expansion, in part because even with a selective process the rapid change induced by expansion will cause churn organically as people's roles and responsibilities evolve, the hierarchy adds layers, and the org chart adds links.