by daniel_levine 5580 days ago
I worked at TechCrunch for a year focusing mainly on CrunchBase data analysis.

There are a number of problems with your analysis:

First, I suspect you are using founding year. Unfortunately, there is a tremendous lag between when a company is founded and when it is entered into CrunchBase. The same is true of funding data in particular, where we saw only around 20% of fundings recorded within a quarter of happening and only about 70% a year out. That is due to CrunchBase's continued growth (it's much better known now) as well as a natural reporting lag.
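To make the lag concrete, here is a rough sketch of how recent counts could be scaled up before comparing them to older periods. Only the 20% (one quarter out) and 70% (one year out) figures come from the numbers above; the linear interpolation between them, the cap at 70% beyond a year, and the function names are assumptions for illustration, not how CrunchBase or TechCrunch actually corrected anything.

```python
# (quarters since the funding happened, fraction of fundings recorded by then)
# The two points are the rough figures quoted above; everything between
# and beyond them is an assumption made for this sketch.
KNOWN_POINTS = [(1, 0.20), (4, 0.70)]


def completeness(quarters_elapsed):
    """Estimated fraction of fundings already recorded after a given delay."""
    (q0, c0), (q1, c1) = KNOWN_POINTS
    if quarters_elapsed <= q0:
        return c0
    if quarters_elapsed >= q1:
        # Assumption: no further improvement after a year.
        return c1
    # Assumption: linear growth between the two known points.
    return c0 + (c1 - c0) * (quarters_elapsed - q0) / (q1 - q0)


def lag_adjusted_count(observed, quarters_elapsed):
    """Scale an observed count up by the estimated completeness."""
    return observed / completeness(quarters_elapsed)
```

For example, `lag_adjusted_count(20, 1)` scales 20 observed fundings from last quarter up to roughly 100. An apparent drop in the most recent periods can be nothing more than this correction being left out.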

Second, CrunchBase is a very new product. As it turns out, data is only reliable as far back as 2007, and even that took a lot of work. Some time has been spent pushing to get more accurate data further back, but it is scattershot at best.

(Possibly, can't tell) CrunchBase investments are stored in a number of currencies. Did you make sure to convert them to a single currency? Yen can really cause problems :)
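A minimal sketch of what that conversion step might look like. The rate table and the `to_usd` helper are hypothetical, and the rates are placeholders rather than real exchange rates; a real analysis would want the rate as of each funding date.

```python
# Placeholder rates (USD per one unit of the currency). Illustrative only,
# not real market rates.
USD_PER_UNIT = {
    "USD": 1.0,
    "EUR": 1.3,
    "JPY": 0.012,
}


def to_usd(amount, currency):
    """Convert a funding amount to USD, failing loudly on unknown currencies."""
    try:
        rate = USD_PER_UNIT[currency]
    except KeyError:
        raise ValueError(f"no exchange rate for currency {currency!r}")
    return amount * rate
```

With these placeholder rates, a 1,000,000,000 yen round comes out to about 12 million dollars. Summed unconverted, it would be counted as a billion-dollar round, which is the kind of outlier that can dominate a yearly total.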

Lastly, your NASDAQ chart covers 1994 to 2005, which never overlaps with reliable CrunchBase data, even by your own admission. I suspect this graph will be a bit more telling, and potentially more worrisome: http://www.google.com//finance?chdnp=1&chdd=1&chds=1....

I do not necessarily think we are in a bubble, and I am happy to see people diving in on data. I just wanted to point these things out, as it would be irresponsible not to.

1 comments

Wouldn't it also be the case that well-funded startups and super angels are much more likely to disclose sources and levels of funding to TechCrunch at an early stage, for the PR, than they were back when TC was a popular blog rather than a big-name publisher?

Presumably there is also some sort of editorial policy over what sort of startups merit inclusion in CrunchBase? The 2010 drop in startups covered could easily be a reflection of reduced interest in covering smaller startups that don't effectively court TC and don't disclose relevant funding data.

I'm not sure the number or willingness of sources has changed that much since the AOL acquisition, though it is possible.

But there is definitely a huge selection bias of contributors. People love to disclose when they invested in the hot startup and neglect to mention their big mistakes retroactively.

I suspect the biggest reason for the drop is just a smaller team and less commitment, like the OP said. There has been a lot of headcount flux at TC for a while, and even more so since the AOL deal.