by georgestephanis
2501 days ago
I have no idea precisely what we're going to do or how, but if I were spitballing, I'd think something like ... Currently I think Tumblr stores all posts across all sites in one big table? WP.com uses different tables per site. I also think Tumblr's post ids are often above PHP's int max on 32-bit systems (2,147,483,647). I know I've seen issues on some old servers, years ago, when trying to parse Tumblr's post ids as integers rather than strings. For an overview of how our systems are run, here's Barry, our head sysadmin, talking about six years ago about how the WP.com infrastructure is structured: https://www.youtube.com/watch?v=57EJ8KDDBH0 It's changed somewhat, but not much conceptually. It's a really fascinating talk, and I'd encourage anyone curious about massive-scale data to give it a look and see precisely what can be managed with determination and MySQL and coffee.
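To make the 32-bit concern concrete, here is a rough sketch in Python (used only for illustration, since Python integers never overflow). The post id below is made up, and `wrap_int32` models plain two's-complement truncation as you would see in C; the exact failure mode differs by language and runtime, so treat this as an illustration of why ids get kept as strings, not as a description of what any particular PHP build does.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1  # 2,147,483,647


def fits_int32(raw_id: str) -> bool:
    """Return True if the id can be stored in a signed 32-bit integer."""
    return INT32_MIN <= int(raw_id) <= INT32_MAX


def wrap_int32(value: int) -> int:
    """Simulate two's-complement truncation to a signed 32-bit integer."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


# Hypothetical post id, chosen only because it is larger than INT32_MAX.
post_id = "190000000000"

if fits_int32(post_id):
    print("safe to parse as a 32-bit int:", int(post_id))
else:
    # The truncated value no longer identifies the same post.
    print("does not fit; truncation would give", wrap_int32(int(post_id)))
    print("keep it as a string instead:", post_id)
```

On a 32-bit runtime the defensive option is simply never to cast the id and to pass it around as an opaque string.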
|
Thankfully nothing is 32-bit, so no worries about integer overflows. That would cause huge headaches everywhere on the PHP side. In MySQL, a regular INT column does have that limitation (a max of roughly 4.3 billion unsigned, 2.1 billion signed), so BIGINT must be used there (Twitter had to do the same). Where it gets interesting is that PHP doesn't support unsigned integers, so on 64-bit your max integer in PHP is 9,223,372,036,854,775,807, whereas in MySQL an unsigned 64-bit int goes up to roughly double that (18,446,744,073,709,551,615). I think it's safe to say though that neither Tumblr nor WordPress, even combined, would ever have more posts than atoms on Earth =)
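For anyone who wants the numbers side by side, a quick sketch in Python (again just for illustration) of the ranges involved. The column ranges are the documented MySQL ones; `smallest_column` and `fits_php_int` are hypothetical helper names made up for this example, and the PHP limit assumes a 64-bit build.

```python
# Inclusive value ranges for the MySQL integer column types discussed here.
MYSQL_RANGES = {
    "INT": (-2**31, 2**31 - 1),         # up to 2,147,483,647
    "INT UNSIGNED": (0, 2**32 - 1),     # up to 4,294,967,295
    "BIGINT": (-2**63, 2**63 - 1),      # up to 9,223,372,036,854,775,807
    "BIGINT UNSIGNED": (0, 2**64 - 1),  # up to 18,446,744,073,709,551,615
}

# PHP integers are always signed; this is PHP_INT_MAX on a 64-bit build.
PHP_INT_MAX_64 = 2**63 - 1


def smallest_column(value: int) -> str:
    """Return the first (narrowest) listed column type that can hold value."""
    for name, (low, high) in MYSQL_RANGES.items():
        if low <= value <= high:
            return name
    raise ValueError(f"{value} does not fit any listed column type")


def fits_php_int(value: int) -> bool:
    """Return True if value fits a signed 64-bit PHP integer."""
    return -2**63 <= value <= PHP_INT_MAX_64


for sample in (2**31 - 1, 2**32, 2**63):
    print(sample, smallest_column(sample), fits_php_int(sample))
```

The last sample is the interesting case: it fits a BIGINT UNSIGNED column but is one past what a 64-bit PHP integer can represent.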