by carbonatedmilk 2053 days ago

I don't understand why the author took the approach he did.

He creates a classifier with 5 labels and gets 50% accuracy on the cross-val (this may be OK, may be terrible, we don't know), but then only uses the labels for one of those classes. What's the precision/recall on that class?
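To make the precision/recall point concrete, here's a rough sketch of how you'd check a single class. The labels and predictions are made up for illustration (they are not the author's data), and `precision_recall` is just a hypothetical helper name:

```python
def precision_recall(y_true, y_pred, target):
    """Precision and recall for one class, treating it as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up example: three buckets instead of five, to keep it short.
y_true = ["low", "low", "mid", "high", "high", "mid", "low", "high"]
y_pred = ["low", "mid", "mid", "high", "low", "mid", "low", "mid"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred, "high")

print(f"overall accuracy: {accuracy:.3f}")
print(f"'high' precision: {precision:.3f}, recall: {recall:.3f}")
```

In this toy data the overall accuracy looks decent, but the classifier misses most of the "high" posts, which is exactly what a single accuracy number hides.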

Also, he's taken a multi-class classification approach to what is essentially a regression task. He's trying to predict the # of upvotes a post will get, and there are approaches that'll do that, but this isn't one of them.
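As a minimal sketch of the regression framing (not what the author did, and not the only way to do it): fit a model to log(1 + upvotes) and invert the transform at prediction time. The single numeric feature and the upvote counts below are invented for illustration; in practice the features would come from the text.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_upvotes(x, slope, intercept):
    """Predict in log space, then map back to a non-negative upvote count."""
    return max(0.0, math.expm1(slope * x + intercept))


# Hypothetical data: one made-up feature score per post, plus its upvotes.
feature = [0.1, 0.4, 0.5, 0.8, 0.9]
upvotes = [1, 3, 10, 40, 150]

# Upvote counts are heavy-tailed, so regress on log1p(upvotes).
slope, intercept = fit_line(feature, [math.log1p(u) for u in upvotes])

for x in (0.2, 0.6, 0.95):
    print(x, round(predict_upvotes(x, slope, intercept), 1))
```

The output is a number rather than a bucket, so being off by 5 upvotes and being off by 500 are no longer counted as the same kind of mistake.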

It's fun to see NLP applied to these sorts of problems, but this isn't a good example of how to do it well.