by bglusman
3972 days ago
But isn't this the point of removing outliers: to keep a single data point from skewing the results? To be fair, by similar logic they should arguably also remove one or more of the least successful businesses. But failures are generally all 'equally unsuccessful', whereas no two successes are equivalent.
http://www.paulgraham.com/swan.html
Including Uber would probably have made most of the data meaningless - since their conclusions are valuation-weighted, their data would show that the ideal startup founder is...Garrett Camp. But then, that's how the startup investing business actually works - your data is useless unless you find the one outlier that everyone else missed.
Edit: It occurs to me that this effect could be overcome by taking the log of valuation (or whatever metric is of interest) and then running your statistics over that. That's standard procedure when trying to do statistics over a Zipfian or other power-law distribution; it lets the outliers count, but prevents them from distorting the averages too much.
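A rough sketch of that log idea in Python, using made-up numbers (the valuations below are hypothetical, not from the study, and `log_mean` is just a name I picked). Averaging the logs and exponentiating back is the same thing as taking the geometric mean, so this assumes every valuation is strictly positive:

```python
import math
import statistics

# Hypothetical valuations in $M; the last entry plays the Uber-style outlier.
valuations = [1.0, 2.0, 4.0, 8.0, 40000.0]


def log_mean(values):
    """Average the natural logs, then map back to the original scale.

    Equivalent to the geometric mean. Requires strictly positive values,
    since log(0) is undefined.
    """
    logs = [math.log(v) for v in values]
    return math.exp(statistics.fmean(logs))


arithmetic = statistics.fmean(valuations)   # dominated by the outlier
geometric = log_mean(valuations)            # outlier still counts, but far less

print(f"arithmetic mean: {arithmetic:.1f}")
print(f"log-scale mean:  {geometric:.1f}")
```

With these numbers the plain average lands in the thousands because of the single huge value, while the log-scale average stays within an order of magnitude of the typical company. The outlier still pulls the result up, it just can't swamp everything else.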