by fa 4137 days ago
This seems like a dramatic illustration of cumulative advantage, as studied in the famous MusicLab experiment [1, 2]. The argument is that in cultural and social markets, random effects govern which products or artifacts get the initial few "upvotes" (or their analogs), at which point the rich-get-richer dynamic takes over. Very, very awesome.

[1] Salganik, Dodds, Watts, "Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market", 2006: https://www.princeton.edu/~mjs3/salganik_dodds_watts06_full....

[2] A popular article by one of the authors, the inestimable Duncan Watts: http://www.nytimes.com/2007/04/15/magazine/15wwlnidealab.t.h...
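A minimal sketch of the rich-get-richer dynamic described above, for illustration only. It is a toy urn-style model, not the MusicLab design: every item is assumed to have identical appeal, and each new vote goes to an item with probability proportional to its current votes plus one. The function name and parameters are made up for this example.

```python
import random


def simulate_market(n_items=10, n_votes=1000, seed=0):
    """Toy cumulative-advantage model (hypothetical, not from the paper).

    All items are equally "good"; the only thing that differs between
    runs is the random order in which early votes happen to land.
    """
    rng = random.Random(seed)
    votes = [0] * n_items
    for _ in range(n_votes):
        # Weight each item by votes-so-far + 1 so unvoted items can still win.
        weights = [v + 1 for v in votes]
        chosen = rng.choices(range(n_items), weights=weights, k=1)[0]
        votes[chosen] += 1
    return votes


if __name__ == "__main__":
    # Re-run the same "market" with different random seeds and compare
    # which item ends up on top and how uneven the final counts are.
    for seed in range(3):
        votes = simulate_market(seed=seed)
        print("seed", seed, "top item:", votes.index(max(votes)),
              "largest counts:", sorted(votes, reverse=True)[:3])
```

Running it with several seeds lets you compare how the final ranking changes when nothing but the early random draws differs.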


At the risk of plugging my own work, I did a follow-up to the (really awesome) Watts experiment using data from reddit and Hacker News:

http://arxiv.org/abs/1501.07860

The work isn't complete yet (always more to do) but the TL;DR is:

1. Yes, randomness governs a lot of article outcomes. Whether something hits the front page or not is pretty arbitrary.

2. However, conditioned on making the front page, popularity is actually a good reflection of "intrinsic quality". I think the ultimate relationship between popularity and quality is stronger than the MusicLab experiment suggests.

I like this a lot! If I allowed myself an explanation, I'd say your findings make a lot of sense, but I will instead heed the title of Duncan Watts' latest, "Everything Is Obvious*: *Once You Know the Answer" :)

I wonder how much engagement you'd get if you made a browser plugin or even an alternative website that showed users a random selection from the top three pages of reddit/HN, then intercepted and logged their upvotes, to get a direct measure of intrinsic quality, rather than estimating this statistically. I for one would use such an interface.
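A rough sketch of the measurement idea, under loud assumptions: each item has a hidden upvote probability (its "intrinsic quality"), the interface shows one uniformly random item per impression, and the estimate is simply upvotes divided by impressions. The names are hypothetical, and the sketch ignores who chooses to use such an interface.

```python
import random


def estimate_quality(true_quality, n_impressions=20000, seed=0):
    """Estimate per-item upvote rates under uniformly random exposure.

    true_quality: list of hidden upvote probabilities (an assumption of
    this toy model; a real site never observes these directly).
    """
    rng = random.Random(seed)
    n = len(true_quality)
    shown = [0] * n
    upvoted = [0] * n
    for _ in range(n_impressions):
        i = rng.randrange(n)          # random exposure: position plays no role
        shown[i] += 1
        if rng.random() < true_quality[i]:
            upvoted[i] += 1           # the logged upvote
    # Items that were never shown get NaN instead of a division error.
    return [u / s if s else float("nan") for u, s in zip(upvoted, shown)]


if __name__ == "__main__":
    hidden = [0.1, 0.3, 0.5, 0.7]
    for q, est in zip(hidden, estimate_quality(hidden)):
        print(f"hidden={q:.2f} estimated={est:.3f}")
```

Because exposure is random rather than rank-driven, the upvote rate per impression is not confounded by an item's current position on the page.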

On a side note, have you seen this work on predicting the growth of ongoing cascades on Facebook [1]? I'm fixing to see if their findings apply to the MusicLab data.

[1] https://research.facebook.com/publications/680551081983090/c...

Thanks for the feedback (and the Watts reference). Building a plugin is definitely an interesting idea; I hadn't thought of that before. I guess the problems would be three-fold. First, I don't know how to do that :-). Second, there are probably a bunch of ethical concerns/IRB issues that would stand in the way of academic publishing (but that's not huge). Third, and the only fundamental issue, is that the self-selection into using that plug-in would bias the estimates of intrinsic quality. Still, it's a pretty good idea, but I'm currently trying to get access to more fine-grained data in other ways, so we'll see.

In terms of creating your own site, a few researchers [1,2] have already done this and have some interesting work. But even with that, you still have the problem of accounting for position bias within the site (for example, HN doesn't really know whether you skimmed the title of an article and decided to ignore it, or never read the title at all). But the experimental power you get with that is pretty cool.

And I have totally read that Facebook cascades paper and have more than a few thoughts about it. In fact, I have adapted their prediction-style results to the MusicLab data and you get really strong predictive accuracy (like 90-95% in terms of predicting whether a song will eventually be above the median popularity). However, the accuracy you would achieve on Reddit or Hacker News data is considerably lower. I didn't really include those results in my paper because I'm not sure how they fit yet.

If I had one critique (which is not really a critique but a comment), it is that the Facebook study doesn't really contradict Watts' point that popularity is hard to predict. The Facebook study shows that if you can observe the "initial conditions", then you can predict eventual outcomes pretty well, but that's directly in line with the rich-get-richer effect that Watts et al. demonstrate. To put it pithily, it's easy to predict who gets richer if you observe who is rich.
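To make that last point concrete, here is a toy sketch. It is neither the Facebook paper's method nor my MusicLab analysis: it assumes a simple rich-get-richer market where all items are equally good, takes a snapshot after an early number of votes, and predicts "above the final median" for exactly the items that are above the early median. All names and parameters are hypothetical.

```python
import random
from statistics import median


def run_market(n_items, n_votes, early, rng):
    """Simulate votes; return (snapshot after `early` votes, final counts)."""
    if not 0 <= early < n_votes:
        raise ValueError("early must satisfy 0 <= early < n_votes")
    votes = [0] * n_items
    snapshot = None
    for t in range(n_votes):
        if t == early:
            snapshot = votes[:]       # the observed "initial conditions"
        weights = [v + 1 for v in votes]
        chosen = rng.choices(range(n_items), weights=weights, k=1)[0]
        votes[chosen] += 1
    return snapshot, votes


def early_prediction_accuracy(n_items=50, n_votes=5000, early=500, seed=0):
    """Fraction of items whose early above-median status matches the final one."""
    rng = random.Random(seed)
    early_votes, final_votes = run_market(n_items, n_votes, early, rng)
    early_med = median(early_votes)
    final_med = median(final_votes)
    hits = sum((e > early_med) == (f > final_med)
               for e, f in zip(early_votes, final_votes))
    return hits / n_items


if __name__ == "__main__":
    # Compare how accuracy changes as the early snapshot is taken later.
    for early in (0, 50, 500):
        print("early =", early, "accuracy =", early_prediction_accuracy(early=early))
```

With `early=0` the snapshot contains no information at all, so comparing it against later snapshots shows how much of the "prediction" comes purely from watching who is already ahead.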

Anyway, I could geek out about this for a long time but feel free to drop me an email at stoddardg [at] gmail.com if you're interested in chatting some more.

[1]: http://arxiv.org/pdf/1410.6744.pdf

[2]: http://journals.plos.org/plosone/article?id=10.1371/journal....