| HN Mirror


by Der_Einzige 2060 days ago

2 things

1. Why was this framed as classification instead of regression? It seems much more like a regression problem (predict how many upvotes a given Reddit post will get).

2. It seems like this could have been more effective with better pre-training. I'd love to see the author rerun the experiments using GPT-2 or another pre-trained model to vectorize the text first.
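To make point 1 concrete, here is a minimal sketch of how the same upvote counts turn into targets under each framing. The upvote counts, the bucket edges, and the helper names are all assumptions made for illustration; nothing here comes from the original experiments.

```python
import math
from bisect import bisect_right

# Hypothetical upvote counts for five posts (illustrative, not real data).
upvotes = [0, 3, 12, 150, 4200]

# Classification framing: collapse counts into coarse buckets.
# These bucket edges are an assumption made for this sketch.
BUCKET_EDGES = [1, 10, 100, 1000]

def to_class(count):
    """Return the index of the bucket that `count` falls into."""
    return bisect_right(BUCKET_EDGES, count)

# Regression framing: keep the count, but compress the heavy tail
# with log(1 + x) so a few viral posts do not dominate the loss.
def to_regression_target(count):
    return math.log1p(count)

def from_regression_target(value):
    """Invert the log1p transform to recover an upvote estimate."""
    return math.expm1(value)

class_labels = [to_class(u) for u in upvotes]
regression_targets = [to_regression_target(u) for u in upvotes]
```

The classification labels throw away the difference between, say, 150 and 900 upvotes, while the regression target keeps it. That is roughly the trade-off the question is pointing at.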

1 comment

scollet 2060 days ago

Wouldn't you have to bucket and rank content and subreddits, respectively, anyways?

Otherwise you will probably end up with a one-size-fits-all algorithm.

Seems like an incremental decision tree problem with weights around karma and time of day.
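As a rough sketch of the bucketing and ranking idea, the snippet below turns raw upvotes into a rank within each subreddit and maps posting hour to a coarse time-of-day bucket. The sample posts, the bucket boundaries, and the function names are hypothetical, and this only prepares features; it is not an incremental decision tree.

```python
from collections import defaultdict

# Hypothetical posts: (subreddit, hour_posted, upvotes). Illustrative only.
posts = [
    ("aww", 9, 1200),
    ("aww", 21, 300),
    ("aww", 3, 50),
    ("rust", 9, 40),
    ("rust", 15, 10),
]

def hour_bucket(hour):
    """Map an hour (0-23) to a coarse time-of-day bucket (assumed boundaries)."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"

def rank_within_subreddit(rows):
    """Return (subreddit, time bucket, rank fraction in [0, 1]) per post.

    The rank fraction compares a post only against posts from the same
    subreddit, so small and large communities end up on the same scale.
    """
    by_sub = defaultdict(list)
    for sub, _, ups in rows:
        by_sub[sub].append(ups)
    ranked = []
    for sub, hour, ups in rows:
        scores = by_sub[sub]
        below = sum(1 for s in scores if s < ups)
        frac = below / (len(scores) - 1) if len(scores) > 1 else 0.0
        ranked.append((sub, hour_bucket(hour), frac))
    return ranked

features = rank_within_subreddit(posts)
```

With this normalization, a 40-upvote post in a small subreddit and a 1200-upvote post in a large one both count as the top post of their community, which is the kind of per-subreddit treatment the comment argues for.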
