I'm working on an app to connect content creators with advertisers, and in my search for content creators I've been pulling a lot of Reddit data. I thought this was fairly interesting, so I dropped it into a table. Unsurprisingly, ~75% of top posts are from image/video hosting domains.
Cool idea. Have you considered applying it to a HN data set?
With several of these sites being image hosts, I'm also curious to see how this will change as Reddit rolls out its own photo/video hosting, which I believe is still in beta on a limited set of subreddits today.
This essentially shows the top URLs posted each week (the table refreshes weekly, pulling from Reddit's 'top monthly' listing, which itself refreshes daily).
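For anyone curious how a per-domain tally like this could be put together, here is a minimal sketch. It is not the code behind the table: the function name `domain_counts` and the sample URLs are made up for illustration, and it assumes you already have a list of post URLs from wherever you pull them.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_counts(urls):
    """Count how many URLs come from each domain (a leading 'www.' is stripped)."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host:  # skip strings that do not parse to a host
            counts[host] += 1
    return counts


# Hypothetical sample URLs, for illustration only (not real data from the table)
sample_urls = [
    "https://i.imgur.com/abc.jpg",
    "https://www.youtube.com/watch?v=xyz",
    "https://i.imgur.com/def.png",
    "https://example.com/article",
]

counts = domain_counts(sample_urls)
for domain, n in counts.most_common():
    # domain, number of posts, share of all sampled posts
    print(f"{domain}\t{n}\t{n / len(sample_urls):.0%}")
```

Subdomains are kept separate here (so `i.imgur.com` and `imgur.com` would count as two domains); grouping them would need an extra normalization step.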