by fourspace 4781 days ago
As mentioned in another comment, I managed a Google SRE team that helped run Borg for 5 years. I'm good friends with several of the folks who are rebuilding it using Mesos at Twitter. They are all insanely awesome people.

Let's say you have a Rails stack (app server + DB) that you want to deploy for testing in 3 different datacenters. If that works out, you then immediately need to deploy 10,000 instances each to 10 different datacenters. Oh yeah, and the storage needs to be distributed and universally available, in case any of the servers crash. Performance testing reveals that you can't have more than 10 servers per rack, otherwise you saturate the rack switch. You also need to account for power distribution redundancy, shifting traffic loads, etc.

Oh, and you want to do this with a single configuration that's manageable by a team of 3-4 people and have deployment be entirely automated and monitored.
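To make just one of those constraints concrete, here is a minimal sketch of the "no more than 10 servers per rack" rule as a greedy placement loop. This is not how Borg or Mesos does it; the function name, the rack inventory format, and the round-robin strategy are all assumptions made up for illustration, and it ignores power, storage, and traffic entirely.

```python
MAX_PER_RACK = 10  # from the scenario above: more than 10 per rack saturates the rack switch


def place_instances(racks, instances_needed, max_per_rack=MAX_PER_RACK):
    """Spread instances across racks round-robin without exceeding the per-rack cap.

    racks: dict of rack name -> free machines (hypothetical inventory format).
    Returns a dict of rack name -> instances placed on that rack.
    Raises ValueError if the cap makes the request impossible.
    """
    # Usable capacity per rack is limited by both free machines and the switch cap.
    capacity = {rack: min(free, max_per_rack) for rack, free in racks.items()}
    if sum(capacity.values()) < instances_needed:
        raise ValueError("not enough rack capacity under the per-rack cap")

    placement = {rack: 0 for rack in racks}
    remaining = instances_needed
    while remaining > 0:
        # One pass hands out at most one instance per rack, which keeps the spread even.
        for rack in capacity:
            if remaining == 0:
                break
            if placement[rack] < capacity[rack]:
                placement[rack] += 1
                remaining -= 1
    return placement


if __name__ == "__main__":
    # Hypothetical inventory: two large racks and one nearly full rack.
    print(place_instances({"r1": 40, "r2": 40, "r3": 5}, 25))
```

Even this toy version has to reject requests it cannot satisfy, and it only handles a single constraint in a single datacenter.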

The complexities behind this problem are simply enormous, almost too much to even comprehend. I am proud of the work we did at Google to attempt (yes, attempt) to solve this problem, and I know my friends at Twitter are doing great work as well. I'm mostly just happy that I can finally talk about it and that the badasses that do this work can get some credit.

1 comment

I think that in the end, Google's greatest gift to the world will be all the experience and knowledge it put into the minds of the world's best engineers, who worked on big stuff there over the last ten years or so.

Those people are now building truly amazing things on top of what they learned in that (for lack of a better term) incubator for the wider industry.

The stuff that has been built directly by Google is tremendous, to be sure. But the fundamental understandings that have come out of their R&D to do it will benefit everyone across the entire industry. It's magical to watch.