by djoldman 1165 days ago
That may be a much easier question to answer than discovery.

How do you discover relevant new domains?

1 comments

I've actually sort of solved this recently. Marginalia's ranking algorithm is a modified PageRank that uses website adjacencies instead of links[1].

It can rank websites even if they aren't indexed, based on who is linking to them.
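To make that concrete, here's a rough sketch of how a domain that was never crawled can still get a score. It assumes, purely for illustration and not necessarily matching what Marginalia does, that the "adjacency" between two domains is the cosine similarity of the sets of domains linking to them. The domain names and the `rank_by_adjacency` helper are made up.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def rank_by_adjacency(inbound, damping=0.85, iterations=50):
    """Power iteration over a similarity graph built from inbound-link sets.

    `inbound` maps a domain to the set of domains known to link to it.
    Assumption: similarity weights stand in for the usual link edges.
    """
    domains = sorted(inbound)
    weights = {d: {} for d in domains}
    for a, b in combinations(domains, 2):
        s = cosine(inbound[a], inbound[b])
        if s > 0:
            weights[a][b] = s
            weights[b][a] = s

    n = len(domains)
    rank = {d: 1.0 / n for d in domains}
    for _ in range(iterations):
        new = {d: (1 - damping) / n for d in domains}
        for d in domains:
            total = sum(weights[d].values())
            if total == 0:
                # No neighbours at all: spread this domain's rank evenly.
                for t in domains:
                    new[t] += damping * rank[d] / n
            else:
                for t, w in weights[d].items():
                    new[t] += damping * rank[d] * w / total
        rank = new
    return rank


inbound = {
    "a.example": {"hub1.example", "hub2.example"},
    "b.example": {"hub1.example", "hub2.example", "hub3.example"},
    # Never crawled, so no outgoing links are known, only who links to it.
    "unindexed.example": {"hub1.example", "hub2.example"},
    "loner.example": {"other.example"},
}
ranks = rank_by_adjacency(inbound)
for domain, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{domain:20s} {score:.4f}")
```

The point is that nothing here needs the outgoing links of `unindexed.example`: it is scored only from who links to it, and it lands next to `a.example` because they share the same linkers.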

Vanilla PageRank can't do this very well. Domains that aren't indexed don't have (known) outgoing links, so they end up in the periphery of the ranking. There are some tricks to keep these from messing up the algorithm completely, but they basically all rank poorly. That's even without considering all the well-known tricks for manipulating vanilla PageRank. The modified version seems very robust with regard to both problems.
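For reference, here's a minimal sketch of vanilla PageRank with the most common of those tricks: the rank held by nodes with no known outgoing links ("dangling" nodes) is redistributed evenly on every iteration, so it doesn't leak out of the system. The graph and the `pagerank` helper are invented for the example. This is the textbook formulation, not Marginalia's code.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    `links` maps a domain to the list of distinct domains it links to.
    Domains that only appear as targets have no known outgoing links.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Total rank currently held by dangling nodes.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        # Teleport term plus an even share of the dangling rank.
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in links.items():
            for t in targets:
                new[t] += damping * rank[src] / len(targets)
        rank = new
    return rank


links = {
    "a.example": ["b.example", "new.example"],
    "b.example": ["a.example"],
    "c.example": ["a.example"],
    # "new.example" was never crawled, so it has no entry of its own.
}
pr = pagerank(links)
for domain, score in sorted(pr.items(), key=lambda kv: -kv[1]):
    print(f"{domain:15s} {score:.4f}")
```

Here `new.example` only ever receives rank. Whatever it collects is smeared uniformly over the whole graph rather than passed on along real links, which is why such domains sit at the edge of the ranking.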

[1] https://memex.marginalia.nu/log/73-new-approach-to-ranking.g...