by pault 2404 days ago
You could treat a link to the post as a proxy for an upvote, and then rank posts based on how many other posts link to them. Of course you wouldn't be able to charge for this service so you'd have to sell ad space on the front page.
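A minimal sketch of that idea, assuming each post comes with a list of the posts it links to. The post names and the `rank_by_inlinks` helper are made up for illustration; self-links and repeated links from the same post are ignored here, which is a choice of this sketch rather than part of the proposal.

```python
from collections import Counter


def rank_by_inlinks(outlinks):
    """Rank posts by how many *other* posts link to them.

    outlinks maps a post id to the list of post ids it links to
    (a hypothetical input format, assumed for this sketch).
    """
    counts = Counter()
    for src, targets in outlinks.items():
        for dst in set(targets):      # count each linking post at most once
            if dst != src:            # a post cannot upvote itself
                counts[dst] += 1
    # Most-linked first; ties broken alphabetically so the order is stable.
    return sorted(outlinks, key=lambda post: (-counts[post], post))


outlinks = {
    "post-a": ["post-c"],
    "post-b": ["post-c", "post-a"],
    "post-c": [],
    "post-d": ["post-d"],             # only links to itself
}
ranking = rank_by_inlinks(outlinks)
print(ranking)
```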

That would be an exceedingly thin signal. Very few posts will have incoming links, so you'll get very little scoring data.

The signal will be hugely susceptible to outlier bias: The Post That Goes Viral, generating a huge number of incoming links, will dominate the rankings.

Because the signal is so thin, distortion and manipulation (through link farming) will be cheap and difficult to detect (a small number of links across a large number of sites).

Don't get me wrong: looking at incidental behaviour is useful, and can often be much more beneficial than direct actions. But remember that all of these signals are actually proxies for some ineffable quantity you're trying to measure: quality.

(The very definition of which should leave you crying on the floor after a few hours. Or days. Or weeks. Or months. Or years....)

Many minds have attempted this task. All have failed.

Your correspondent included.

(Small site, many moons ago, since surrendered its electrons back to the Great Disk in the Sky.)

If I may go on a tangent, I’ve thought a bit about the problem of link farms and how it might be addressed.

The problem is essentially this: since PageRank is basically the probability that a random walk through the link graph will end up on your site, linking back to yourself, and to no other websites, gives a big boost to your PageRank because a random walk will get stuck on your site. Of course it’s easy to just ignore self-links, but you can get essentially the same effect through clique-like groups of websites, and this can be more difficult to detect.
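To make the trap concrete, here is a rough sketch of PageRank by power iteration on a toy graph. It assumes the usual damping factor of 0.85 and treats pages without outgoing links as linking everywhere; the three graphs and the `pagerank` helper are invented for illustration and are not anyone's production algorithm.

```python
def pagerank(links, damping=0.85, iters=200):
    """Plain power iteration; links maps a page to the pages it links to."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Every page gets the "random jump" share first.
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            outs = links[v] or nodes          # dangling page: spread evenly
            share = damping * rank[v] / len(outs)
            for w in outs:
                new[w] += share
        rank = new
    return rank


# Three pages that all link to each other.
honest = {"a": ["b", "s"], "b": ["a", "s"], "s": ["a", "b"]}
# Same pages, but "s" now links only to itself.
farmed = {"a": ["b", "s"], "b": ["a", "s"], "s": ["s"]}
# Self-links removed, but "s" and "t" form a two-page clique instead.
clique = {"a": ["b", "s"], "b": ["a", "s"], "s": ["t"], "t": ["s"]}

r_honest = pagerank(honest)
r_farmed = pagerank(farmed)
r_clique = pagerank(clique)
print(round(r_honest["s"], 3), round(r_farmed["s"], 3))
print(round(r_clique["s"] + r_clique["t"], 3))
```

In the honest graph the rank is split evenly. Once "s" stops linking out, it ends up holding well over half of the total rank, and the two-page clique does just as well without a single self-link.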

What’s interesting is that an algorithm based on how electrical current flows (so a link is a one-way resistor, i.e. a resistor in series with a diode) would not have this problem. Attaching a conductive loop to some point in a circuit does not change how current flows. Electrons don’t get stuck in loops because they don’t drift around randomly, they move from lower voltage to higher voltage.
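A small sketch of the loop claim, with one loud simplification: links are modelled here as ordinary two-way unit resistors, not one-way ones, because diodes make the circuit nonlinear and harder to solve in a few lines. The node names, the `potentials` and `throughput` helpers, and the "score = current passing through a node" idea are all assumptions made for the illustration.

```python
def potentials(edges, source, sink, sweeps=5000):
    """Hold source at 1 V and sink at 0 V, then relax every other node.

    With unit resistors, Kirchhoff's current law says each free node
    sits at the average potential of its neighbours.
    """
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, []).append(v)
        nbrs.setdefault(v, []).append(u)
    phi = {v: 0.0 for v in nbrs}
    phi[source] = 1.0
    for _ in range(sweeps):
        for v in nbrs:
            if v not in (source, sink):
                phi[v] = sum(phi[u] for u in nbrs[v]) / len(nbrs[v])
    return phi


def throughput(edges, phi, node):
    """Current passing through a node: half the total |current| on its edges."""
    total = sum(abs(phi[u] - phi[v]) for u, v in edges if node in (u, v))
    return total / 2.0


base = [("src", "a"), ("a", "sink"), ("src", "b"), ("b", "sink")]
# Page "a" attaches a closed loop of farm pages to itself.
farmed = base + [("a", "f1"), ("f1", "f2"), ("f2", "a")]

phi_base = potentials(base, "src", "sink")
phi_farm = potentials(farmed, "src", "sink")
print(throughput(base, phi_base, "a"), throughput(farmed, phi_farm, "a"))
print(throughput(farmed, phi_farm, "f1"))
```

The loop settles at the same potential as the node it hangs off, so no current flows around it: "a" scores the same with or without the farm, and the farm pages themselves score zero.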

Tangents are disallowed. But I'll grant you a hyperbolic trajectory.

Link graphs remind me of lightning descending leaders. If you can have a sense of charge potential between cloud and ground, there might be a circuit equivalent which drains largely self-referential link-farms.

Or is that rephrasing your description? My circuit physics / EE-fu is exceedingly weak.

"Or is that rephrasing your description?"

I think so? There is an obvious circuit equivalent where links are interpreted as one-way resistors. You get a charge potential automatically given the circuit and a choice of source / sink nodes (which you need to decide on anyway to apply PageRank).

The notion of a potential & sink seems to be key, and the idea that if you identify a collectively chained circuit, it cannot be both source and sink to itself. Which is what link farms (self-referral) or "mutual admiration societies" are. The question is whether or not you can determine the interconnectedness. Since DNS obscures true ownership relations, that's a challenge.

Cluster analysis generally shows such relations though.

(I'm pretty sure these questions have generated multiple PhDs at Google.)

“The question is whether or not you can determine the interconnectedness.” I’m not sure what you mean by this. The algorithm I'm describing doesn't require any data beyond what is already used by PageRank.

I have a feeling that you think the electrical potential must be defined by some ad-hoc method before applying an electrical algorithm, and this requires fancy techniques like cluster analysis and the like? This is not the case. Let me re-emphasize that you only need to specify the network of resistors (which is the link graph) and the source and sink, and then the potential is defined automatically in terms of those things (the same way as in physical circuits).
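For what it's worth, "defined automatically" can be written down. For ordinary two-way resistors (a simplification: with one-way resistors, only the links that are actually conducting contribute), fixing the source s and the sink t and applying Kirchhoff's current law at every other node pins down the potential. The symbols below are my notation, not anything standard to PageRank:

```latex
\[
\varphi_s = 1, \qquad \varphi_t = 0, \qquad
\sum_{u \sim v} \frac{\varphi_u - \varphi_v}{R_{uv}} = 0
\quad \text{for every node } v \notin \{s, t\},
\]
```

where the sum runs over the neighbours u of v in the link graph and R_{uv} is the resistance assigned to the link between them. Nothing beyond the graph, the resistances, and the two boundary values goes in.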

My comment was a tongue in cheek reference to early Google. I can't tell if you missed the joke or just one upped me on a heroic scale.
The possibility had occurred.

And I'll claim my prize to the second ;-)

Isn't that PageRank? https://en.wikipedia.org/wiki/PageRank Or did I miss the subtle joke.