This is terrible research. The results are meaningless.
At best, it's a case of survivorship bias. The most highly differentiated startups don't raise anything and no one ever hears about them. Crunchbase doesn't capture the majority of these, so the data used here isn't representative.
Then they use NLP to determine how different the products are.
Those of us who get pitched by SaaS vendors all the time know that two companies can do the exact same thing and have totally different product descriptions. Salesforce started talking about IoT for a while, and they still bill themselves more as an ERP when most people still use them as a CRM.
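To make that concrete, here's a rough sketch of how a text-distance score reacts to wording and not to what the product does. Caveats: this is not the article's pipeline. It swaps doc2vec for a plain bag-of-words cosine distance (a much cruder stand-in), and the three product descriptions are invented for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity of the two word-count vectors (0 = same wording)."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return 1.0 - dot / norm if norm else 1.0


# Hypothetical descriptions: all three are meant to be the same kind of CRM.
crm_a = "Track leads, contacts and deals in one sales pipeline"
crm_b = "An intelligent customer success platform that unifies your revenue team"
crm_c = "Track leads, contacts and deals in one simple sales pipeline"

print("a vs b:", round(cosine_distance(crm_a, crm_b), 3))  # no shared words
print("a vs c:", round(cosine_distance(crm_a, crm_c), 3))  # nearly identical copy
```

In this toy version, a and b come out as maximally "differentiated" purely because the marketing copy shares no words, even though the products are assumed to be identical. A learned embedding like doc2vec would soften that, but it is still scoring the description, not the product.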
There's also the issue of those who pivoted. Most startups pivot, and the author admits to ignoring pivots altogether!
> "2. Found the websites of all these companies, in the year each startup was launched (using WaybackMachine, a historical archive of web pages)."
> "3. Scraped the text on these websites and ran them through a natural language processing (NLP) machine learning algorithm (doc2vec)."
> "4. Measured how different the startups’ value propositions were to those of competitors (e.g. focused on a niche product feature vs others that talked about price)."
> Startups can have a more or less differentiated value proposition from their competitors. For example, a company that sells iPhone cases is not very differentiated. One that sells iPhone cases with a built-in earbud holder is more differentiated.
I mean, this makes sense, although I worry the definition is much too broad to get useful data from, especially in hindsight, when it's no longer obvious how "differentiated" the company was relative to the rest of the market when it first pitched.
Pet peeve: does "117% more" mean 1.17 times the expected amount, or 2.17 times the expected amount?
I avoid using this kind of terminology in my writing because I'm never completely sure that my readers will share the same exact definition of how those numbers should be interpreted.
I completely agree with you, but I don't think it's ever ambiguous. I think it always means 2.17 times the expected amount. If a normal startup raises $100, then a highly differentiated startup will raise $217. That's because the article could have been "X raises 20% more", in which case they'd get $120 over the $100 baseline. Basically, I think "more" is the signifier that you don't multiply by 1.17. If that's missing, the number is meaningless.
Yeah, sorry, I meant to say I totally agree with you. I mentally explain this to myself every time I see it; 1 hour to understand what the % means, 5 seconds to actually read the article. It's terrible.
The latter is the standard meaning. If the author means something different, that means the author is being unclear and it shouldn't be on the reader to parse the author's nonstandard intentions.
From a common sense perspective: If I say something contains "100% more whatever", you would expect it to contain double what it normally contains.
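For anyone who wants the two readings side by side, here's a minimal sketch of the arithmetic using the $100 baseline from upthread. The function names are made up for illustration.

```python
def percent_more(baseline, percent):
    """'X% more' reading: the baseline plus X% of the baseline."""
    return baseline + baseline * percent / 100


def percent_of(baseline, percent):
    """'X% of' reading: just X% of the baseline."""
    return baseline * percent / 100


print(percent_more(100, 117))  # 217.0 -> 2.17 times the baseline
print(percent_more(100, 20))   # 120.0 -> 1.2 times the baseline
print(percent_of(100, 117))    # 117.0 -> 1.17 times the baseline
print(percent_more(100, 100))  # 200.0 -> "100% more" is double
```

So "117% more" is 2.17 times the baseline, and the 1.17 figure only shows up if you read it as "117% of".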