by colanderman 4155 days ago
> According to the U.S. Bureau of Labor Statistics, about half of all businesses fail within five years.

> He [...] thinks that even a model that’s only right about 50 percent of the time could help investors and entrepreneurs avoid particularly bad ideas

...does he have a bridge to sell me too? What am I missing?

(One can simply predict "always succeeds" and will be right half the time.)

5 comments

Perhaps what has been missed is the difference between a single coin flip and a combination of coin flips?

Consider one startup.

    f(x) = #fail
succeeds better than 50%, and

    f(x) = #succeed
succeeds less than 50%. This is due to the nature of startups.

Sure, it's easy to get about 50% accuracy for one startup by flipping a coin:

    f(x) = if rand(1) > 0.5 then #fail else #succeed
But consider the case of two companies A and B. There are now four outcomes:

    A = #fail, B = #fail
    A = #succeed, B = #fail
    A = #fail, B = #succeed
    A = #succeed, B = #succeed
If we flip a coin, we have to flip it twice. Our probability that two coin flips match the correct tuple is 25%, and bumping that up to 50% is a massive improvement.

Investors diversify their portfolios. In a portfolio of 100 startups there's probably a winner. Improving the selection of companies means reducing the number of portfolio companies a fund needs for a reasonable probability of a winner. More funds that are smaller yet still successful make capital more efficient.

Better pruning of boolean search spaces has real value. Hence:

    When predicting that a company will fail,
    he adds, they’re right 88 percent of the time.
That's... not accurate. There are two success conditions (Fail, Fail) and (Succeed, Succeed), and two failure conditions (Succeed, Fail), (Fail, Succeed). If you flip two coins, the chance that they come up both heads is 25%, but the chance that they come up the same is 50%.
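
For anyone who wants to check the two-coin arithmetic, here is a minimal sketch that enumerates the four equally likely outcomes. It assumes fair, independent coins; the variable names are only illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely results of flipping two fair coins.
outcomes = list(product(["heads", "tails"], repeat=2))
p_each = Fraction(1, len(outcomes))

# Both coins land heads: one outcome out of four.
p_both_heads = sum(p_each for a, b in outcomes if a == "heads" and b == "heads")

# Both coins land the same way: two outcomes out of four.
p_same = sum(p_each for a, b in outcomes if a == b)

print(p_both_heads, p_same)  # 1/4 1/2
```
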
The odds of (#fail, #fail) for two startups (A, B) are much greater than 50%. When investing in two companies, (#fail, #fail) is not #success.

At a 1% #success probability for each company, the chance that both fail is ~98%. The 1% #success rate is based on a knowledgeable person choosing A and independently choosing B. If that person can obtain information that lets them improve their selections to a 2% #success probability, they can reduce the total number of investments necessary to achieve any particular expected return on investment.
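
A rough sketch of that arithmetic, assuming every investment succeeds independently with the same probability. The 50% target for "at least one winner" is an arbitrary number picked for illustration, and both function names are made up here.

```python
import math

def p_all_fail(p_success, n):
    """Probability that all n independent investments fail."""
    return (1 - p_success) ** n

def investments_needed(p_success, target):
    """Smallest n such that P(at least one winner) >= target."""
    return math.ceil(math.log(1 - target) / math.log(1 - p_success))

# Two companies at a 1% success rate each: both fail about 98% of the time.
print(round(p_all_fail(0.01, 2), 4))  # 0.9801

# Portfolio size needed for a 50% chance (assumed target) of at least one winner.
print(investments_needed(0.01, 0.5))  # 69
print(investments_needed(0.02, 0.5))  # 35
```

Under these assumptions, doubling the per-company success rate roughly halves the portfolio size needed for the same chance of a winner.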

Reducing the number of investments may improve the investor's ability to influence the outcome of each company in their portfolio, because the investor can allocate more time, energy, and resources to each company [presuming the investor brings business expertise to the table].

The article is claiming that the guy can predict which things are going to succeed and which are going to fail, not make them succeed or fail. His evidence for this is that 50% of the time, he's right (e.g. [success, success], [fail, fail] are both success conditions for him). For him to be adding any information to the system, he has to get it right more often than either choosing randomly or using a fixed zero-information strategy (always bet fail/always bet succeed). You explicitly said that the joint probability matters, but then miscalculated the joint probability of him guessing correctly by random chance.
I apologize for not making myself clear enough to communicate as effectively as might be hoped.
I wrote this below, but several things are clear here:

- This isn't a quote and should be taken with a grain of salt. Oversimplification, poor wording, and basic misunderstanding on the part of the author are at fault.

- We don't know what the model's outputs are. If they are simply SUCCEED / FAIL, then yes, 50% correct is not very helpful (unless of course it is right more than 50% of the time on big winners). If the outputs are more granular (likelihood of success, expected ROI, etc), then being "right" means a lot less and, to the extent that it does mean something, being right 50% of the time is much more helpful.

Imagine being right 50% of the time guessing about getting through airport security. If your guesses are "WILL" or "WON'T", then 50% is terrible. If your guesses are like "through in 23 min 53 sec" then 50% is incredible. If your guesses are like "70% chance of being through in 15-20 minutes", what does "right" mean?

To be fair, "wrong" is not describing everything. If it has a lot of false negatives (says "fail" when startup succeeds), but very few false positives, that would really be worth something.
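
To make the false negative / false positive point concrete, here is a small sketch with entirely made-up counts (100 startups, half of which succeed, matching the BLS figure quoted at the top). None of these numbers come from the article.

```python
# Hypothetical confusion-matrix counts, invented for illustration only.
true_pos = 5    # model said "succeed", startup succeeded
false_pos = 0   # model said "succeed", startup failed
true_neg = 50   # model said "fail", startup failed
false_neg = 45  # model said "fail", startup succeeded

total = true_pos + false_pos + true_neg + false_neg

# Overall accuracy: fraction of all predictions that were correct.
accuracy = (true_pos + true_neg) / total

# Precision of the "succeed" calls: how often a predicted winner really won.
precision = true_pos / (true_pos + false_pos)

print(accuracy, precision)  # 0.55 1.0
```

With these invented counts the model is barely better than a coin flip overall, yet every company it flags as a winner actually wins, which is the part an investor would care about.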
Because the current rule of thumb is that ~9/10 startups fail. If you can reduce that to a 50/50 bet, you've made quite an improvement.
Not really. What the article says is that the model would correctly predict whether a company will fail only 50% of the time, which doesn't make sense, because 50% accuracy on a binary prediction (i.e. fail or not) conveys exactly nothing.

So maybe it's just bad or confusing wording in the article, the guy actually meant to say something else.

Yeah I'm going with confusing wording, I think he meant what I said above. Not to mention he could be referring to eliminating false positives, which is slightly different to finding true positives versus true negatives.
I think it's easier to relate to a coin flip if we use an "unfair coin".

90% of the time the coin flip returns tails (aka fail).

10% of the time it returns heads (aka win).

For a given coin flip, their algorithm can predict the results 50% of the time. At this point I don't remember the calculations off the top of my head, but it involves a Binomial distribution.

Hmm, I can predict that 90% of the time, so I don't follow.
That prediction doesn't improve over the prior. So it's no better than the (biased) coin flip.
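
A quick sketch of how the zero-information strategies score on the unfair coin above, assuming the 90/10 split from that comment and guesses made independently of the outcome. The strategy labels are just names chosen for this example.

```python
from fractions import Fraction

p_fail = Fraction(9, 10)  # the coin comes up tails (fail) 90% of the time
p_win = 1 - p_fail

# Accuracy of each strategy = P(guess matches the outcome).
accuracy = {
    "always fail": p_fail,
    "always win": p_win,
    # Guess with a fair coin, independent of the outcome.
    "fair coin": Fraction(1, 2) * p_fail + Fraction(1, 2) * p_win,
    # Guess "fail" 90% of the time, independent of the outcome.
    "biased coin": p_fail * p_fail + p_win * p_win,
}

for name, acc in accuracy.items():
    print(f"{name}: {float(acc):.2f}")
```

Under these assumptions "always fail" is the best zero-information strategy at 90%, so a model has to beat that number, not 50%, before its accuracy alone says anything.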
> (One can simply predict "always succeeds" and will be right half the time.)

Considering most businesses fail I don't think that's quite right but I get your point.

Then one could simply predict "always fails" and be right more than half the time, which is an improvement on the 50% claimed.