Pretty interesting. I love how bespoke data centers are converging along very similar lines. I'm sure if you were inside a Microsoft, Google, Amazon, or Facebook data center you would recognize the new design touch points. The old data center is dead, long live the new data center.
This was the second big thing Google had learned early on: "The equipment is reduced to its basics so it runs cooler. It can also be easily accessed and repaired quickly." -- slide 12/17
The whole sheet metal box around a server was a real waste of time if your employees are the only ones accessing the area, and the only reason they want to touch the machine is to repair it. This is in contrast to NetApp (where I had worked before), which was busily designing impressive cabinets that would "stand tall" on the raised flooring of the data center.
I think the lesson came earlier, with the NUMA and MPP machines, where they kept trying to cram more stuff onto boards that were themselves pluggable into the larger system. This convergence has happened from several directions. It's not all that different from the earlier one that started in the 1960s, where they fought cost and inefficiency by putting as few components in each box as possible and sharing as much as possible. Moore's Law temporarily reversed it (transistors and memory are free!), then the reality check hit: this seems to be a fundamental principle.
My design a while back was to put it all on PCI cards on a PCI backplane. I saw backplanes that basically look like motherboards full of PCI slots that load into racks. I wanted to make the cards nothing but CPU and memory whose software communicated over efficient networking (not TCP/IP) through PCI DMA. My design had IO/MMU functionality in the backplane or PCI cards. At least one card would have a full-featured stack for management, and at least one I/O card would provide the external interface. I figured the backplane itself could be extended for that, too, with a dedicated port like motherboards do with integrated GigE. Management and I/O could come through remote DMA over dedicated wires, like many servers do with Ethernet, so all the PCI slots could be dedicated to compute.
Dumbest thing about Facebook's model is them destroying drives. The first thing to notice, per Ross Anderson's Security Engineering, is that those pieces still contain a lot of data if they weren't degaussed first. Next is to remember the fastest way to destroy data: use clustered, encrypting filesystems so that secrets never touch the drive in the clear. Then you just have to delete the keys to lose the secrets. No need to trash the drives at all. The crypto can happen at the storage manager or at the hardware interface, with HW acceleration available for both types. I'm surprised they haven't already built this with all the smart people they have working on big-data stacks.
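A rough sketch of the "delete the keys" idea, in case it helps. Everything here is an assumption for illustration: the KeyStore class is a hypothetical stand-in for a key manager that lives off the drive, and the cipher is a toy HMAC-SHA256 keystream built from the standard library so the example runs anywhere. A real deployment would use an audited cipher and real key-management hardware.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with HMAC-SHA256(key, nonce || counter) blocks.

    Illustration only, not production crypto. Applying it twice with the
    same key and nonce returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class KeyStore:
    """Hypothetical key manager; keys are held here, never on the drive."""

    def __init__(self):
        self._keys = {}

    def create(self, volume_id: str) -> bytes:
        self._keys[volume_id] = secrets.token_bytes(32)
        return self._keys[volume_id]

    def get(self, volume_id: str) -> bytes:
        return self._keys[volume_id]  # raises KeyError once the key is erased

    def erase(self, volume_id: str) -> None:
        del self._keys[volume_id]


store = KeyStore()
key = store.create("vol-1")
nonce = secrets.token_bytes(16)
secret = b"user data that should never hit the platter in the clear"

on_disk = keystream_xor(key, nonce, secret)                    # only ciphertext is written
recovered = keystream_xor(store.get("vol-1"), nonce, on_disk)  # normal read path

store.erase("vol-1")  # crypto-erase: the ciphertext remains, but nothing can decrypt it
```

After erase(), the bytes on the drive are still there, but any read attempt fails at the key lookup. The whole scheme is only as trustworthy as the place the keys live.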
To your last paragraph: relying only on forgetting the keys works great, as long as you have absolute 100% confidence in the mechanism used to do that. I read your posts on HN often, so I know you're quite familiar with defense in depth--I feel that user data is one of those areas where it's ok to do more than one thing to protect the data.
That said, there are a number of systems at FB where deleting a crypto key loses the linked data forever--but they still crunch the hard drives just to be really sure. The drive crunching is an incredibly tiny expenditure compared to the massive CapEx and OpEx required to build, stock, and run the datacenters. It's worth it if only for the peace of mind.
"as long as you have absolute 100% confidence in the mechanism used to do that"
It's true. These mechanisms fail way less than shredders, though. Ideally, the drive encryption would pull KEYMAT from a dedicated system for that somehow on boot (kernel, network, whatever). That system should be medium to high assurance. The easy way is rad-hard ASICs (or antifuse FPGAs) with ECC RAM and ChipKill that implement a safe-coded protocol engine that moves keys around in memory. These are in a high-availability configuration with electrical and optical isolation. A separate box manages things, does backups on encrypted data, etc. A good HSM combo at Level 3 or 4 is already mostly there, though. Remember, even Ross Anderson's people couldn't break IBM's outside some stupid, unevaluated software for banking. My ideal just assures the protocol itself a bit more.
"I feel that user data is one of those areas where it's ok to do more than one thing to protect the data."
It's fine, except to environmentalists, to do it on top of crypto for extra assurance. By itself, crushing is insufficient, since data might still be recovered from the pieces given just how much they cram into tiny spaces. It's why DOD/NSA standards were to suck the magnetism out of the platter with qualified degaussers and then destroy it. Crypto followed by destruction can't be directly compared, but it should also make recovery hard.
"there are a number of systems at FB where deleting a crypto key loses the linked data forever"
Great that they do. Thanks for telling me.
"The drive crunching is an incredibly tiny expenditure compared to the massive CapEx and OpEx required to build, stock, and run the datacenters."
I believe that. What groups like Facebook pull off in datacenter hardware, software, and administration continues to amaze me.
The number of physical connections each machine has (power and numerous data cables, visible in the 5th slide) makes this seem infeasible for now. But considering the article mentions there's "only one technician for every 25,000 servers" (hey, that's up from 20k) and Facebook's FBAR software, it seems like the next step in "the new data center" is having a robot unrack (and rack) entire machines and bring them to a servicing bay.
The hydroelectric dams are taking energy from the water system by slowing its descent. Presumably the friction of faster moving water would heat the riverbeds and surroundings more than the waste heat from the data centers does. The thermal impact should be negligible compared to releasing energy stored in hydrocarbons or atoms. However, pouring a lot of concrete can release enormous amounts of carbon dioxide.
> Presumably the friction of faster moving water would heat the riverbeds and surroundings more than the waste heat from the data centers does
I would be seriously surprised by that. Do you have some data about heat leaked to the surroundings by water friction?
I would say the potential energy of water is mostly carried by the water itself to its final destination, by slowly heating up during descent. The surroundings are not appreciably heated, since water is a good coolant.
So let's picture a waterfall: water practically stops at the bottom, so potential energy has dissipated somehow, but surroundings are not heated, water is. The energy remains in the water, which continues its happy descent to the sea.
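For a sense of scale, here is a back-of-the-envelope sketch. It assumes all of the potential energy ends up as heat in the water itself (nothing lost to the surroundings, no evaporation), so the mass cancels and the temperature rise is g * h / c. The 100 m drop is just an example figure, and the constants are the usual approximate values.

```python
G = 9.81          # gravitational acceleration, m/s^2
C_WATER = 4186.0  # specific heat of liquid water, J/(kg*K), approximate


def temperature_rise(drop_m: float) -> float:
    """Temperature rise in kelvin of water falling drop_m metres.

    Assumes m*g*h is converted entirely into heat in the water (m*c*dT),
    so dT = g*h/c, independent of the mass.
    """
    return G * drop_m / C_WATER


rise_100m = temperature_rise(100.0)  # roughly 0.23 K for a 100 m drop
```

So even a 100 m drop warms the water by only about a quarter of a degree under these assumptions, which is why nobody notices it at the bottom of a waterfall.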
We were thinking along similar lines. Except I just had an amusing observation: I'm seeing some people telling me the Arctic might melt due to technological activity, and others telling me critical datacenters are being built in the Arctic. One group had better hope the other isn't right. ;)
I don't think that Facebook is good for openness of the web and/or society as a whole. The amount of power without oversight that this Zuck has is simply scary.
Also, I'm all for it when taxpayers' money goes to new, struggling but promising startups, but FB is not one of them. I can't really see why they would need governmental financial help other than to support corrupt politicians.
I've always wondered this, but why don't we just scale up some data storage device? Like create a massive hard drive, 10 feet across? It has to be more efficient than tons of tiny little drives.
From what I understand, there are a few issues, including:
1) The head has to move further to read the next sector of interest, which is particularly problematic for fragmented data.
2) It is more difficult to manufacture high data density (AKA high-precision) disks in large formats, as some surface defects are cumulative, and get worse as the disk gets larger.
3) Manufacturing defects which occur pseudo-randomly increase proportionally to the surface area of the disk, so the reject rate increases as a square of the radius.
4) Smaller drives can be spun much faster, allowing for higher data rates: at a given rotational speed the centripetal acceleration at the rim is proportional to the radius, and I believe the stresses in the disk are proportional to the square of the radius.
For these reasons and many others, HDDs have been moving to smaller and smaller form factors.
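To put a rough number on point 3, here is a sketch using a simple Poisson yield model: if defects land uniformly at random with some density per unit area, the chance a platter has zero defects is exp(-density * area). The model choice, the defect density of 10 per square metre, and the platter radii are all assumptions picked for illustration, not measured figures.

```python
import math


def platter_yield(radius_m: float, defects_per_m2: float) -> float:
    """Probability that a platter of the given radius has no random defects.

    Poisson model: expected defects = density * area, and the fraction of
    defect-free platters is exp(-expected defects).
    """
    area = math.pi * radius_m ** 2
    return math.exp(-defects_per_m2 * area)


DEFECT_DENSITY = 10.0  # hypothetical defects per square metre

small = platter_yield(0.0475, DEFECT_DENSITY)  # ~95 mm platter: about 93% usable
huge = platter_yield(1.524, DEFECT_DENSITY)    # 10 ft across: effectively zero
```

Under this model the expected number of defects grows with the square of the radius, but the fraction of defect-free platters falls off exponentially in that number, so the giant platter almost never comes out clean.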
You know how hard it was to make the mirror for the Hubble Space Telescope? Now imagine that you need to make it even more fine-grained in precision and accuracy, as well as being magnetically within tolerance across that whole surface.
The cost of making high-precision surfaces goes up exponentially with the surface area of each unit. It's much cheaper to make lots of smaller ones than one large one.
The same probably applies for things like the actuator for the heads, etc.