| HN Mirror


by jedwhite 1579 days ago

Nice work on the write-up, and thank you for sharing this. The post is interesting, but I think the problem with the YCRank approach as it stands is that the labels reflect subjective opinion, at least if I understand correctly.

Based on the post, you've trained the classifier on a handful of pairs of company descriptions, labelling which one in each pair you preferred. Those preferences rest on subjective assessments like "harder to execute" or revenue growth that aren't part of the data you're running the classifier against.

If so, you've done a nice job of training a classifier to predict which companies you personally are more likely to be interested in. To improve this, you could train on past YC batch company descriptions labelled with success data, which would give you more useful and less subjective examples. That might produce some interesting predictions that are more generalizable (although I think you may need more data points than the description and basic metadata).

If I've misunderstood, it would be interesting to know a little more detail about how the data was labelled.

I've based this on the following: "To investigate this, I made a neural network, YCRank, trained it on a handful of hand-labeled pairwise comparisons, and then used the learned comparator to sort the companies in the most recent W’22 batch."

And then: "I biased my ranking towards what was “harder to execute” on" and "I also tended to rank favorably companies that were already making monthly recurring revenue with double-digit growth rates".

Those may or may not be good criteria.

Based on that, this is essentially what you could call a "DudeRank Classifier" because as The Dude in The Big Lebowski says, "Yeah, well, that's just like, your opinion, man" :)

As I suggested above, it might be more interesting to label the example pairs and train the classifier based on the original company descriptions of known past successful and unsuccessful YC companies.
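To make that concrete, here is a minimal sketch of how outcome-based pairs could be generated. It assumes each past company can be reduced to a description plus a single binary outcome, which is my simplification and not something from the post. The records and the `outcome_pairs` helper are hypothetical placeholders.

```python
from itertools import combinations

# Hypothetical placeholder records: (description, outcome).
# outcome is 1 for a known past success and 0 otherwise (an assumed encoding).
past_companies = [
    ("description of company A", 1),
    ("description of company B", 0),
    ("description of company C", 1),
]

def outcome_pairs(companies):
    """Build (desc_a, desc_b, label) training pairs from known outcomes.

    label is 1 if the first company did better than the second, else 0.
    Pairs with equal outcomes carry no preference and are skipped.
    """
    pairs = []
    for (desc_a, out_a), (desc_b, out_b) in combinations(companies, 2):
        if out_a == out_b:
            continue
        pairs.append((desc_a, desc_b, 1 if out_a > out_b else 0))
    return pairs

pairs = outcome_pairs(past_companies)
for desc_a, desc_b, label in pairs:
    print(desc_a, "vs", desc_b, "->", label)
```

The labels here come from what actually happened to each company rather than from anyone's preference, so the same pairwise comparator could in principle be trained on them unchanged.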

It's possible that the company descriptions and limited metadata from Demo Day alone carry enough signal to predict the successful companies in a batch.

Good luck!

Disclaimer: I am in the W22 batch. Our startup (Andi) ranks pretty well here. And this also is just, like, my opinion :)

[Edit: You could also test the classifier against historical batches and use the results to improve it!]
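One way such a historical test might be scored is pairwise accuracy: of all pairs of companies in a held-out batch with different outcomes, how often does the classifier score the more successful one higher? This is only a sketch under my own assumptions. The `pairwise_accuracy` function, the scores, and the outcomes are made up for illustration and are not from the post.

```python
from itertools import combinations

def pairwise_accuracy(scores, outcomes):
    """Fraction of non-tied outcome pairs that the scores order correctly."""
    correct = 0
    total = 0
    for i, j in combinations(range(len(scores)), 2):
        if outcomes[i] == outcomes[j]:
            continue  # no ground-truth preference for this pair
        total += 1
        if (scores[i] > scores[j]) == (outcomes[i] > outcomes[j]):
            correct += 1
    return correct / total if total else float("nan")

# Made-up model scores and binary outcomes for one hypothetical past batch.
scores = [0.9, 0.2, 0.6, 0.4]
outcomes = [1, 0, 0, 1]

accuracy = pairwise_accuracy(scores, outcomes)
print(accuracy)
```

A value near 0.5 would suggest the ranking is no better than chance on that batch.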

1 comments

ericjang 1579 days ago

> Based on that, this is essentially what you could call a "DudeRank Classifier" because as The Dude in The Big Lebowski says, "Yeah, well, that's just like, your opinion, man" :)

Yes, but isn't human VC investing already just a big DudeRank classifier?


yowlingcat 1579 days ago

Yes, in the same way that institutional investing and society are one big "DudeRank" filter. It doesn't mean that there's no structure, it means that you're not looking in the right place.

The difference between a layman investing and a skilled top 10% VC investing is that the latter already stands in a very strong position within human and social capital networks, and moreover (if they're actually skilled) has experience turning that capital into strong 0-1 outcomes. Or they're just really good at benefiting from survivorship bias.
