by bglusman 3972 days ago
But isn't this the point of removing outliers, to avoid a single data point overly clouding the significance? To be fair, by similar logic they should arguably remove one or more of the least successful businesses too, but failures are generally all 'equally unsuccessful', whereas no two successes are equivalent.
2 comments

The whole point of startup investing is to search for outliers. The way returns are distributed in the tech industry, it's not unusual for 1-2 companies to be responsible for 90%+ of a fund's financial returns.

http://www.paulgraham.com/swan.html

Including Uber would probably have made most of the data meaningless - since their conclusions are valuation-weighted, their data would show that the ideal startup founder is...Garrett Camp. But then, that's how the startup investing business actually works - your data is useless unless you find the one outlier that everyone else missed.

Edit: It occurs to me that this effect could be overcome by taking the log of valuation (or whatever metric is of interest) and then running your statistics over that. That's standard procedure when trying to do statistics over a Zipfian or other power-law distribution; it lets the outliers count, but prevents them from distorting the averages too much.
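A rough sketch of what I mean, in Python. The valuations below are made-up numbers (in $M), not First Round's data, and the variable names are just for illustration. Averaging the logs and exponentiating back gives the geometric mean, which is one common way to summarize heavy-tailed data:

```python
import math
import statistics

# Hypothetical valuations in $M (invented for illustration, not real data).
# The last entry plays the role of the single Uber-sized outlier.
valuations = [5, 8, 12, 20, 35, 60, 150, 40000]
without_outlier = valuations[:-1]

def log_average(values):
    """Average the base-10 logs, then convert back to the original units.

    This is the geometric mean of the values.
    """
    logs = [math.log10(v) for v in values]
    return 10 ** statistics.mean(logs)

arithmetic_with = statistics.mean(valuations)
arithmetic_without = statistics.mean(without_outlier)
log_with = log_average(valuations)
log_without = log_average(without_outlier)

# How much does the one outlier move each summary?
arithmetic_shift = arithmetic_with / arithmetic_without
log_shift = log_with / log_without

print(f"arithmetic mean: {arithmetic_without:.1f} -> {arithmetic_with:.1f}")
print(f"log-based mean:  {log_without:.1f} -> {log_with:.1f}")
```

With these invented numbers the outlier multiplies the arithmetic mean by more than a hundred, but moves the log-based average by only a small factor, so it still counts without swamping everything else.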

The mean (or average) is a good choice for data with a normal distribution. However, if your data has extreme values, such as the gap between an Uber and everyone else, you should look at the median or the 90th percentile, because they are much more representative of your sample.
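To make that concrete, here is a small Python sketch with invented valuations (in $M, not real data). The `percentile` helper is my own and uses the simple nearest-rank definition, which is only one of several ways to define a percentile:

```python
import math
import statistics

# Hypothetical cohort of 20 valuations in $M (made up for illustration).
# One company is several orders of magnitude larger than the rest.
cohort = [1, 2, 2, 3, 4, 5, 6, 8, 10, 12,
          15, 20, 25, 30, 40, 60, 90, 150, 400, 40000]

def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the
    data at or below it. Illustrative helper, not a library function."""
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]

mean = statistics.mean(cohort)
median = statistics.median(cohort)
p90 = percentile(cohort, 90)

# Count how many companies are actually worth more than the mean.
above_mean = sum(1 for v in cohort if v > mean)

print(f"mean={mean:.2f} median={median} p90={p90} above_mean={above_mean}")
```

In this toy cohort only the single outlier sits above the mean, while the median and 90th percentile both land on values that ordinary companies in the sample actually reach.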
Median and 90th percentile are still pretty meaningless for the question that First Round is asking, namely "If I want to maximize my financial returns, what qualities should I look for in founders?" Miss that one company at the 99th percentile, and your return could be 10x lower.
It's still relevant to founders who want to know what it takes to be in the top decile of their cohort.
That's why they should have used other metrics than average.
If you remove a handful of outliers (Jesus, Buddha, Mohammed, Joseph Smith, maybe two dozen in total), then nobody has ever formed a religion that got any significant number of followers.

In other words, if you remove the outliers you're now looking at something basically meaningless, like evaluating a McDonald's meal by drinking the soft drink only - everyone other than the outliers is the soft drink, and the outliers are the main meal.