by bit-player 3743 days ago
This is an important point. The Google white paper does address it, though only in the abstract: they want to maximize both capacity and IO bandwidth. One can imagine large disks with multiple independently actuated head assemblies. But quite possibly you're right that it's not worth the bother.
1 comment

The interesting thing about Google's storage infrastructure was that teams were optimizing IOPS per drive while talking to thousands of drives over a gigabit link. I had an interesting conversation with Sean about that one day. I asked him: if you get 100 IOPS per drive, and you have 10,000 drives behind a single gigabit Ethernet port, how much of the data on those drives can actually be part of any service provided over that link? Plot that against RPS (generic 'requests per second').
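To put rough numbers on that question, here is a back-of-the-envelope sketch. The 100 IOPS per drive, 10,000 drives, and gigabit link are from the conversation above; the bytes returned per IO are my assumption (the original question did not fix a request size), and protocol overhead is ignored, so treat the results as an upper bound rather than a measurement.

```python
# Back-of-the-envelope: how many drives can one gigabit link keep busy?
# From the comment: 100 IOPS per drive, 10,000 drives, 1 Gbit/s link.
# Assumed for illustration: bytes returned per IO; no protocol overhead.

LINK_BITS_PER_S = 1_000_000_000   # gigabit Ethernet, raw line rate
IOPS_PER_DRIVE = 100
TOTAL_DRIVES = 10_000


def drives_link_can_feed(bytes_per_io: int) -> float:
    """Number of drives whose full IOPS the link can carry, given the IO size."""
    link_bytes_per_s = LINK_BITS_PER_S / 8
    max_requests_per_s = link_bytes_per_s / bytes_per_io
    return max_requests_per_s / IOPS_PER_DRIVE


if __name__ == "__main__":
    for bytes_per_io in (4 * 1024, 64 * 1024, 1024 * 1024):  # hypothetical sizes
        busy = min(TOTAL_DRIVES, drives_link_can_feed(bytes_per_io))
        share = busy / TOTAL_DRIVES
        print(f"{bytes_per_io:>8} B/IO: ~{busy:,.0f} drives busy "
              f"({share:.2%} of the fleet)")
```

Under these assumptions, even small IOs let the link carry the full IOPS of only a few hundred of the 10,000 drives at once, and larger IOs shrink that number further. The aggregate the drives could deliver (100 x 10,000 = 1,000,000 IOPS) is far more than the link can move.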

In my case, I was trying to get him to sign off on saving power by powering down some of the drives that could not be reached. But even with the data staring him in the face he could not go there. Network bandwidth gets better, and that exposes more data to the pipeline, but if you want < 500 ms request response you have to balance the system.