by pork 5346 days ago
At 500+ mil users with an average of 150 connections, you're looking at 500,000,000 * 150 / 2 = 3.75e10 edges. Assuming generously that each edge can be stored with a 4 byte unsigned int, you're looking at about 150 GB (roughly 140 GiB). I haven't seen any scrapes that even come CLOSE to that, and that's ignoring throttling and privacy controls.

Edit: more likely, you'd get the data as (id1, id2) pairs with 8 byte longs for each id. That's about 600 GB.
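
For anyone who wants to check the arithmetic, here is a quick sketch of both estimates. The user count and average degree are the rough figures from above, not measured values, and the function names are just made up for illustration. It prints both decimal GB and binary GiB, since the 4 byte figure only lands near 140 if you count in GiB.

```python
# Back-of-envelope storage estimate for an undirected friendship graph.
# USERS and AVG_CONNECTIONS are the rough figures from the comment, not measurements.
USERS = 500_000_000
AVG_CONNECTIONS = 150


def edge_count(users, avg_connections):
    # Each friendship shows up in two users' friend lists, so halve the total.
    return users * avg_connections // 2


def storage_bytes(edges, bytes_per_edge):
    return edges * bytes_per_edge


edges = edge_count(USERS, AVG_CONNECTIONS)
compact = storage_bytes(edges, 4)    # generous case: one 4 byte uint per edge
pairs = storage_bytes(edges, 2 * 8)  # (id1, id2) pairs of 8 byte longs

print(f"edges: {edges:.2e}")
print(f"4 B/edge:  {compact / 1e9:.0f} GB ({compact / 2**30:.0f} GiB)")
print(f"16 B/edge: {pairs / 1e9:.0f} GB ({pairs / 2**30:.0f} GiB)")
```

Swap in your own guesses for the two constants to see how sensitive the total is.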

1 comments

This is an ignorant question, but do apps have access to the social graph? If so, I'd assume the top game applications have a substantial portion.
You can get the names and Facebook IDs of a user's friends for anyone who signs in with your app. I don't believe you can fetch those friends' own friends through the API, so the game companies should only have access to edges in which at least one person uses a game.
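
To make that concrete, here is a toy sketch of which edges an app would see under that assumption (friend lists are visible only for users who signed in). The names, the tiny graph, and the `visible_edges` helper are all hypothetical; this does not call any real Facebook API.

```python
def visible_edges(edges, app_users):
    """Return the friendships where at least one endpoint uses the app."""
    # frozenset makes each edge undirected, so (a, b) and (b, a) are the same edge.
    return {frozenset(e) for e in edges if e[0] in app_users or e[1] in app_users}


# Hypothetical five-person chain: ann - bob - cy - dee - eve
friendships = [("ann", "bob"), ("bob", "cy"), ("cy", "dee"), ("dee", "eve")]
players = {"bob"}  # only bob has signed in with the app

seen = visible_edges(friendships, players)
print(f"{len(seen)} of {len(friendships)} edges visible")
```

With only bob signed in, the app sees his two friendships and nothing about cy-dee or dee-eve, even though it learns that cy exists.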