|
Thanks for the write-up, Chris. Now I understand why I couldn't follow the logic you were laying out in our original discussion in the comments on PG's article. The main problem I was having is that you are assuming the variable we observe is the latent skill or potential-value variable (which you're calling x here). However, PG's article was talking solely about the average of returns (let's call it y). So, assuming that the outcome of a startup depends only on x, what we are really observing is y ~ f, with f = \int_0^1 g(x) h(x) dx, where h(x) is your cut-off criterion on x, g(x) is some unknown payoff distribution for a given skill level, and I'm assuming x is in [0,1] without loss of generality. So in essence, the real problem here, even if you could see all of the individual returns for a given portfolio, is that you have to solve a very difficult deconvolution problem. And I'm pretty sure it's non-identifiable without some other information or additional parametric assumptions. Thinking out loud a bit, let's assume that y is actually log(return), where a return of 1 is breaking even and 0 is losing everything. Since log(0) is undefined, most startups return 0, and very few exit for less than 1, I would think we could model this as a point-inflated normal distribution: p(y) = c * \delta_0 + (1-c) * N(\mu, \sigma^2), where \delta_0 is a point mass standing in for the total losses and the normal component covers log(return) for everything else. Given this, we could then model our latent parameters (c, \mu, \sigma) as functions of x. Since the model is separable, we can even look at the zeros and non-zeros in isolation. Then we can come up with a test from there, but I'm not really sure what that test would be at this point. Anyway, that's a completely different line of thinking, but it seems much more tractable in practice. |
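
To make the separability point concrete, here is a minimal sketch of how the point-inflated model could be simulated and fit, assuming for now that (c, \mu, \sigma) are constants rather than functions of x. The function names `simulate_returns` and `fit_point_inflated` and the parameter values (c = 0.7, \mu = 0.5, \sigma = 1.0) are made up purely for illustration and are not estimates from any real portfolio.

```python
import math
import random
import statistics


def simulate_returns(n, c, mu, sigma, rng):
    """Draw n returns: 0 with probability c, otherwise exp(N(mu, sigma^2))."""
    return [
        0.0 if rng.random() < c else math.exp(rng.gauss(mu, sigma))
        for _ in range(n)
    ]


def fit_point_inflated(returns):
    """Estimate (c, mu, sigma), treating zeros and non-zeros in isolation."""
    # Non-zero part: log(return) is modelled as normal.
    logs = [math.log(r) for r in returns if r > 0]
    # Zero part: c is just the observed fraction of total losses.
    c_hat = 1 - len(logs) / len(returns)
    mu_hat = statistics.fmean(logs)
    sigma_hat = statistics.stdev(logs)
    return c_hat, mu_hat, sigma_hat


# Illustrative parameters only (assumed, not taken from data).
rng = random.Random(0)
sample = simulate_returns(20000, c=0.7, mu=0.5, sigma=1.0, rng=rng)
c_hat, mu_hat, sigma_hat = fit_point_inflated(sample)
print(c_hat, mu_hat, sigma_hat)
```

Because the two pieces do not share parameters, the estimate of c uses only the count of zeros and the estimates of \mu and \sigma use only the non-zero returns. Letting the three parameters depend on x would turn each piece into its own regression problem, which this sketch does not attempt.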