Hacker News
by repsilat 5399 days ago
The downside is that if a particularly good post in your favourite subreddit hits the front page, the Visigoths are going to come in and leave stupid comments one way or another. /r/AskReddit is incredible because of its judicious moderation, but the same can't be said for /r/programming or /r/technology, for example.

What reddit really needs is a ranking engine recommending users, submissions and comments to you based on the users you've friended and the comments and submissions you've upvoted. Just take a quick approximation of PCA/SVD/eigenvalue/tachyon-polarity and let the users see if they like it.

EDIT: Another simple idea is ranking comment trees by their first few comments instead of ranking them just by their first comment. I'm going to read the immediate replies of the top post either way, and I'd rather get three or four good comments than one great comment and two or three crappy replies to it.
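
A rough sketch of that comment-tree idea, purely for illustration: score each top-level thread by its root comment plus its best few direct replies, rather than by the root alone. The `Comment` class, the `thread_score` and `rank_threads` names, the choice of summing scores, and the default `k = 3` are all assumptions made for this example, not anything reddit is known to do.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    score: int                                  # net upvotes (hypothetical field)
    replies: List["Comment"] = field(default_factory=list)


def thread_score(root: Comment, k: int = 3) -> int:
    """Sum the root's score and the scores of its top (k - 1) direct replies."""
    best_replies = sorted((r.score for r in root.replies), reverse=True)[: k - 1]
    return root.score + sum(best_replies)


def rank_threads(roots: List[Comment], k: int = 3) -> List[Comment]:
    """Order top-level threads by the combined score of their first k comments."""
    return sorted(roots, key=lambda c: thread_score(c, k), reverse=True)


# One great comment with weak replies vs. three solid comments.
great = Comment(100, [Comment(1), Comment(2)])
solid = Comment(40, [Comment(35), Comment(30), Comment(5)])

by_first_comment = sorted([great, solid], key=lambda c: c.score, reverse=True)
by_first_few = rank_threads([great, solid])
```

In this toy data, ranking by the first comment alone puts `great` on top, while ranking by the first three comments puts `solid` on top. A sum still lets a very high-scoring root dominate, so a real ranking would probably need some normalisation.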

If you have yet to see /r/AskScience, check it out ASAP. One of my favourite discoveries this year. The mods are generally pretty merciless when it comes to memepushers.
Ack, I meant to say /r/AskScience when I said "/r/AskReddit is incredible".
Reddit is now a tree with only one level of nesting. What would help is making the subreddits a graph, rather than having every subreddit branch directly off the front page.

    What reddit really needs is a ranking engine
    recommending users, submissions and comments 
    to you based on the users you've friended and 
    the comments and submissions you've upvoted.
Do you realize the insane number of calculations that would take?

For each pageview you would have to find all stories in that subreddit, find all the authors, find all stories you ever voted on, find all the users you are friends with, and somehow rank all this shit.

It will cost millions of dollars a month in CPU time just to run something on the scale of reddit.

Sounds like a challenge :)

First, this doesn't have to be for each pageview. Crudely, you want every story to have a vector in some giant hyperspace characterizing it, and every user to have a vector precomputed offline based on their previously shown affinities, and take dot products.
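
As a minimal sketch of that dot-product step, assuming the user and story vectors have already been computed offline by some other process: the three "topic" dimensions, the story names, and the `recommend` helper are all made up for illustration.

```python
from typing import Dict, List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend(user_vec: Sequence[float],
              story_vecs: Dict[str, Sequence[float]],
              top_n: int = 2) -> List[str]:
    """Return the ids of the top_n stories with the largest dot product."""
    ranked = sorted(story_vecs,
                    key=lambda story_id: dot(user_vec, story_vecs[story_id]),
                    reverse=True)
    return ranked[:top_n]


# Hypothetical dimensions: (programming, science, memes).
user_vec = [0.9, 0.1, 0.0]          # precomputed offline from past votes
story_vecs = {
    "compiler_writeup": [1.0, 0.2, 0.0],
    "black_hole_ama":   [0.1, 1.0, 0.0],
    "cat_macro":        [0.0, 0.0, 1.0],
}

picks = recommend(user_vec, story_vecs)
```

At request time the work is one dot product per candidate story. The expensive part, producing the vectors, happens in the offline job.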

This is not necessarily easy, or computationally very cheap, but the payoff can be pretty big.

I believe LinkedIn runs something similar to this every night.

I'm not sure what you mean exactly, but hey, if you can pull it off, it might be successful.

Just think of the number of comments up/down voted every millisecond, and that's just one of the dimensions. The scale of that matrix would be enormous.

How is it much different from the ranking engines Netflix and Amazon use?
In those cases, users purely receive recommendations for media/products. In this case, users receive recommendations for other users to follow as well.
You can calculate the "weight" of the movie for a given user once. The number of movies is very low, compared to the number of stories on reddit.

Whatever that guy is proposing is an insanely complex dynamic system.

I'm not saying it's not doable, I'm just saying it's very expensive.

It might be less complicated than you'd expect (at least one way of doing it): http://en.wikipedia.org/wiki/Collaborative_filtering

The core idea is linear regression.

Here's Google's service: http://code.google.com/apis/predict/. Here's a commercial API service: http://www.directededge.com/.
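
For a feel of how small the core of collaborative filtering can be, here is a toy sketch of one common flavour: a user-based neighbourhood method with cosine similarity. This is not a regression, and it is not how either of the services above works. The vote data, user names, and the `cosine` and `predict` helpers are invented for the example.

```python
from math import sqrt
from typing import Dict

Votes = Dict[str, int]  # story id -> +1 (upvote) or -1 (downvote)


def cosine(a: Votes, b: Votes) -> float:
    """Cosine similarity between two sparse vote vectors."""
    shared = set(a) & set(b)
    num = sum(a[i] * b[i] for i in shared)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0


def predict(votes: Dict[str, Votes], user: str, item: str) -> float:
    """Similarity-weighted average of other users' votes on `item`."""
    num = den = 0.0
    for other, their_votes in votes.items():
        if other == user or item not in their_votes:
            continue
        sim = cosine(votes[user], their_votes)
        num += sim * their_votes[item]
        den += abs(sim)
    return num / den if den else 0.0


votes = {
    "alice": {"a": 1, "b": 1, "c": -1},
    "bob":   {"a": 1, "b": 1, "d": 1},     # tends to agree with alice
    "carol": {"a": -1, "c": 1, "d": -1},   # tends to disagree with alice
}

guess = predict(votes, "alice", "d")
```

Here `guess` comes out positive: bob agrees with alice and liked story `d`, while carol disagrees with alice and disliked it, so both point the same way. At reddit's scale the similarity computation would need to be approximated or precomputed, which is where the cost concerns raised earlier in the thread come in.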