Hacker News
by devalier 3890 days ago
> Fortunately there's a way to measure bias that's much more reliable, when it can be used. ... A couple months ago, one VC firm (almost certainly unintentionally) published a study showing bias of this type. First Round Capital found that among its portfolio companies, startups with female founders outperformed those without by 63%.

Except if you want to use statistics to measure bias, you need a statistically significant sample. And actually, if you are studying complex human affairs, with a hundred different variables, you need more than statistical significance, you need a sensitivity analysis. It is similar to nutrition studies. There are so many variables at play that something can always be found to increase or decrease your risk of cancer by 50%. You really only need to pay attention when statistics show an order-of-magnitude correlation, as with the link between smoking and lung cancer.

With the First Round Capital data, they excluded Uber from their calculations, because it would skew everything. If a single data point can flip your findings, then you just have to admit that you do not have enough data to make a determination one way or another. In science it is sometimes OK to exclude an outlier, since it often indicates a measurement error. But in venture capital, you make most of your money off of the Uber-like outliers. So if you are trying to study the data to be the best venture capitalist possible, throwing out outliers is not valid.
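To make the single-data-point problem concrete, here is a minimal sketch. Every number in it is invented for illustration (it is not First Round's data), and the names `group_a`, `group_b` and `advantage` are hypothetical. It only shows how one extreme outcome can reverse the sign of a comparison of means.

```python
from statistics import mean

# Hypothetical return multiples, made up purely for illustration.
group_a = [1.2, 0.8, 3.0, 2.5, 1.5]   # portfolio companies with some trait
group_b = [1.0, 0.5, 2.0, 1.5, 1.0]   # portfolio companies without it
outlier = 500.0                        # one Uber-like outcome that lands in group B


def advantage(a, b):
    """Relative outperformance of group a over group b, measured by the mean."""
    return mean(a) / mean(b) - 1


without_outlier = advantage(group_a, group_b)
with_outlier = advantage(group_a, group_b + [outlier])

print(f"A vs B, outlier excluded: {without_outlier:+.0%}")
print(f"A vs B, outlier included: {with_outlier:+.0%}")
```

With the outlier excluded, group A looks comfortably ahead; with it included, group A looks far behind. A result that depends on whether one company is in the sample is telling you about that one company, not about the groups.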

Also, the initial premise is incorrect too. You cannot measure bias by comparing average results, because the average is not the marginal. Consider PG's footnote: "Although I used female founders as an example because that is a kind of bias people often talk about, the most striking thing was the degree to which First Round undervalue founders who went to elite colleges." Does he honestly believe that First Round is biased against founders from elite colleges?

At my last company my sense was that the MIT grads were better than the average programmer. So were we biased against MIT grads? Should we have hired more MIT grads until the average performance of MIT grads overall equaled the average performance of an employee overall? Should we have done more outreach to MIT? Should the industry as a whole have hired more MIT grads?

If a talent distribution has a cluster of elite performers, and then a steep drop-off filled with "pretenders", then you can get this type of effect without being biased.

When we got an elite MIT grad, we hired them. When we got a "pretender", someone who was trading on the name but did not put in the work, we rejected them. And yes, I personally saw MIT grads who did terribly on simple coding exercises.

So even though the average MIT grad we hired was better than the average programmer at our company, there was no way to alter our hiring process to get more MIT grads. If we had hired the marginal MIT grad that we rejected, we would have been worse off. Now we could do more outreach to MIT, and we did, but that is a highly competitive process. There were diminishing marginal returns to how much outreach we could do to get more applicants.
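Here is a toy version of the hiring story, as a sketch under stated assumptions: the scores are invented, `BAR` is a hypothetical single hiring threshold applied identically to both pools, and the "elite plus steep drop-off" shape of the MIT pool is hard-coded rather than estimated from anything.

```python
from statistics import mean

BAR = 70  # hypothetical hiring bar, applied identically to every applicant

# Invented interview scores. The MIT pool has an elite cluster and then a
# steep drop-off; the other pool declines gradually.
mit_applicants = [95, 92, 90, 40, 35]
other_applicants = [85, 80, 75, 72, 70, 65, 60, 55]


def hired(pool):
    """Everyone at or above the bar gets an offer."""
    return [score for score in pool if score >= BAR]


def best_rejected(pool):
    """The marginal candidate: the strongest applicant who missed the bar."""
    return max(score for score in pool if score < BAR)


avg_mit_hired = mean(hired(mit_applicants))
avg_other_hired = mean(hired(other_applicants))
marginal_mit = best_rejected(mit_applicants)
marginal_other = best_rejected(other_applicants)

print(f"average hired: MIT {avg_mit_hired:.1f}, others {avg_other_hired:.1f}")
print(f"best rejected: MIT {marginal_mit}, others {marginal_other}")
```

The hired MIT grads average well above the other hires even though the bar is identical for both pools, and the best rejected MIT applicant is worse than the best rejected applicant from the other pool. Comparing averages among the hired tells you nothing about what the next hire from either pool would look like.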

The statistical illiteracy of PG's post is simply stunning. Imagine a YC company gets a 100% ROI from PPC ads, and a 50% ROI from banner ads. Are they biased against PPC ads? Should they buy more PPC ads? Such an analysis is ridiculous. You look at what you are spending on the marginal PPC ad, and you stop spending when the ROI on the marginal ad is at zero, regardless of what the average is. That one advertising channel has a higher ROI on average does not mean that the company is biased against that channel.
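The ad example can be worked through numerically. This is a sketch, assuming each channel's revenue follows a power law `revenue = c * spend**a` with `0 < a < 1` (diminishing returns); that curve, the constants, and the function names are my own assumptions for illustration, not anything from the essay.

```python
def optimal_spend(c, a):
    """Spend at which marginal ROI is zero, i.e. d(revenue)/d(spend) == 1.

    For revenue = c * spend**a, solving a * c * spend**(a - 1) = 1 gives
    spend = (a * c) ** (1 / (1 - a)).
    """
    return (a * c) ** (1 / (1 - a))


def average_roi(c, a, spend):
    """Total revenue over total spend, minus one."""
    return c * spend**a / spend - 1


def marginal_roi(c, a, spend):
    """Return on the next dollar spent, minus one."""
    return a * c * spend ** (a - 1) - 1


# Hypothetical channels: (c, a) chosen only to illustrate the point.
ppc = (100.0, 0.5)
banner = (30.0, 2 / 3)

ppc_spend = optimal_spend(*ppc)
banner_spend = optimal_spend(*banner)

ppc_avg = average_roi(*ppc, ppc_spend)
banner_avg = average_roi(*banner, banner_spend)
ppc_marginal = marginal_roi(*ppc, ppc_spend)
banner_marginal = marginal_roi(*banner, banner_spend)

print(f"PPC:    average ROI {ppc_avg:.0%}, marginal ROI {ppc_marginal:.0%}")
print(f"Banner: average ROI {banner_avg:.0%}, marginal ROI {banner_marginal:.0%}")
```

Under these assumptions both channels are funded exactly to the point where the next dollar breaks even, yet PPC shows roughly a 100% average ROI and banners roughly 50%. The gap in averages comes entirely from the shape of each curve, so it cannot be read as evidence that the company is underspending on PPC.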

2 comments

So true.

PG's articles are generally filled with good intuitive insight. Unfortunately, statistics can be very tricky to turn into folksy wisdom. Rules of thumb like "you need 30 samples before you can say anything" that are derived from the CLT are a good example of ones that work well enough in practice, even if they obscure some underlying subtleties. This article is an example of a rule that sounds simple, but actually has so many asterisks that one would expect it to be mostly useless in practice.

If women are performing better on average, it doesn't necessarily mean that you should invest in more women. What if all the remaining candidates would have a negative mean return? If they included Uber and all of a sudden the women now underperform men, does that mean they're biased against men and they need to invest in fewer women?

There are just so many statistical fallacies at play here that it's a shame that Jessica, Sam, or Geoff didn't point out that maybe someone with a stats background should read the article before publishing it.

> Does he honestly believe that First Round is biased against founders from elite colleges?

Maybe FR is. Imagine that elite college is highly predictive of success, so you prefer to pick elite college grads, all other available evidence being equal. You're biased toward elite college grads, right?

But what if elite college grads really are phenomenally more successful, and you can't see the detailed reason (high school experience, network, whatever), to the point that they are all better than all non-elite college grads? Then even selecting 90% of your pool from elites, and 10% from the rest, is biased against the actual merit of the applicants.

[these numbers are totally made up. I'm not saying elite college grads really have these characteristics.]

The trickiness is that you can't see everything when you evaluate, so you have to assign weights to the factors you have, and leverage correlations with hidden important factors.