by londons_explore 1017 days ago
Storage prices depend on the reliability you want.

Someone like Google cannot ever lose data of any customer. So if you pay for 1GB of storage, they probably actually store 5GB of data or more for you. It will be redundantly stored within the datacenter across different racks, but also stored (also redundantly) in different datacenters in case of flood/fire. There's probably a copy on tape in case of a catastrophic software bug that wipes all the drives. Or two copies on tape, because if there were a software bug that wiped all the drives, the chances that every single tape is readable for a restore are low, so more redundancy is needed.

However, if you go for a smaller player, they probably still keep multiple copies of your data, but it might be a RAID-5-like setup, requiring only 1.3GB of storage for each GB you store with them. It can survive a single drive failure, but after two drive failures, a datacenter fire, or an engineer fat-fingering an erase-all command, your data is all gone.
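A rough sketch of where a figure like 1.3GB per GB can come from, assuming a parity group of a few data drives plus one parity drive. The function name and the group sizes are illustrative, not any provider's actual layout:

```python
def raw_per_logical(data_drives: int, parity_drives: int) -> float:
    """Raw bytes stored per logical byte for one parity group."""
    if data_drives <= 0 or parity_drives < 0:
        raise ValueError("need at least one data drive and no negative parity")
    return (data_drives + parity_drives) / data_drives


# Single-parity (RAID-5-like) groups of different widths.
for data in (3, 4, 8):
    print(f"{data}+1 group: {raw_per_logical(data, 1):.2f}x raw storage")
```

A 3+1 group works out to about 1.33x, which lines up with the ~1.3GB estimate. Wider groups are cheaper per GB but leave more drives exposed to a second failure during a rebuild.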

That's (part of) why the big players charge so much for storage. I actually wish I could choose a less reliable yet far cheaper storage option with a big player, but they don't want to offer that because of the PR hit when they do lose customer data.

2 comments

That makes sense and is a factor in our calculations. We store everything at least twice (RAID plus actual copies, not including off-site tape, comes to ~2GB stored for every GB), with at least a third replicated to DR off-site backup. That explains the on-paper per-GB price difference (which we would expect to be more expensive in the cloud, the main advantage being that we don't have to coordinate all of that, so there are areas where we would save; it's just very difficult to do a price comparison given we don't know the details of their system).

It doesn't explain a huge percentage increase though. Presumably they (Google) were already doing due diligence there with respect to reliability.

Essentially agreeing with you on all those points though. Especially the bit about the PR hit. It's a constant factor in our budgeting with the understanding that if you "lose the backups", you are probably out of a job.

> So if you pay for 1GB of storage, they probably actually store 5GB of data or more for you

The actual factor is most likely around 1.4-1.5x and for sure can’t be any more than 2.2x in this day and age. The dumbest possible implementation would be “only” 3x, so no, it’s nowhere close to 5GB.

Edit: looks like it’s public, so I can actually tell you that Google uses RS 3,2, which gives a 1.5 replication factor. When I was there a few years ago, storage folks told me they never lost a single stripe of data.

Those numbers are for Colossus. Blobstore, which backs the cloud object store, is different, and used to be a lot higher.
Yeah those aren’t public afaik, iirc they adjusted replication for cold data
Presumably Google also keeps backups, though, right?
And mirrors to at least one extra datacenter as they can lose bandwidth with a fiber getting cut, become unreachable due to networking snafu, or even burn down entirely.
Yes they also store to tape
That means 2X at a minimum. Then RAID overhead. Possibly other hot or warm copies ready to take over instantly.
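As a back-of-the-envelope sketch of that estimate: add up the raw bytes held at each site, then add any full tape copies. All numbers below are made up for illustration, and this counts bytes only, not the cost per byte of each medium:

```python
def total_raw_factor(site_overheads: list[float], tape_copies: int = 0) -> float:
    """Total raw bytes kept per logical byte.

    site_overheads: per-site overhead of each online copy (e.g. 1.5 for an
    erasure-coded copy, 3.0 for triple replication).
    tape_copies: number of full offline copies, each counted as 1.0x.
    """
    if any(o < 1.0 for o in site_overheads) or tape_copies < 0:
        raise ValueError("each online copy needs >= 1.0x; tape copies >= 0")
    return sum(site_overheads) + tape_copies


# Hypothetical: two datacenters, each storing the data at 1.5x.
print(total_raw_factor([1.5, 1.5]))                  # online copies only
print(total_raw_factor([1.5, 1.5], tape_copies=2))   # plus two tape copies
```

Under these assumptions, two 1.5x sites already come to 3x before any tape is counted.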
And tape costs same as ssd/spindles in your calculations?
A common problem would be throughput though. Storage capacity scales much faster than access speed. If you are storing an item only 3 times and, let's say, each storage location gives you 50,000 IOPS max, then you can only ever service 150,000 IOPS for this item, which might not be enough.
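The arithmetic above as a small sketch, assuming reads spread evenly across copies and no cache in front. Both function names are hypothetical:

```python
import math


def max_item_iops(replicas: int, iops_per_location: int) -> int:
    """Upper bound on read IOPS for one item, given evenly spread reads."""
    return replicas * iops_per_location


def replicas_needed(target_iops: int, iops_per_location: int) -> int:
    """Smallest number of copies that can serve target_iops."""
    return math.ceil(target_iops / iops_per_location)


print(max_item_iops(3, 50_000))          # 3 copies at 50,000 IOPS each
print(replicas_needed(400_000, 50_000))  # copies needed for a hot item
```

So a hot item can force more copies than durability alone would call for.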
Vast majority of data rarely (if ever) gets read so you use a cache for that
What does RS 3,2 stand for? Thanks
Reed-Solomon erasure coding. 2 data blocks, one parity. Basically RAID-6 but distributed.
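A toy illustration of the 2 data + 1 parity idea. Real Reed-Solomon uses Galois-field arithmetic; with a single parity block, plain XOR is enough to show the principle, so that is what this sketch swaps in. It is not Google's implementation, and all names are made up:

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode(d0: bytes, d1: bytes) -> list[bytes]:
    """Build a stripe: two data blocks plus one XOR parity block."""
    return [d0, d1, xor_blocks(d0, d1)]


def recover(stripe: list) -> list[bytes]:
    """Rebuild a stripe in which at most one block is missing (None)."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity cannot repair two lost blocks")
    restored = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        # Any one block equals the XOR of the other two.
        restored[missing[0]] = xor_blocks(survivors[0], survivors[1])
    return restored


stripe = encode(b"abcd", b"wxyz")
damaged = [stripe[0], None, stripe[2]]   # lose the second data block
print(recover(damaged)[1])               # rebuilt from the data + parity
```

Three blocks are stored for every two blocks of data, which is where the 1.5 factor comes from. Each block would live on a different machine or rack so that one failure only ever takes out one block of a stripe.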
But that's within one cell... But data will be stored in more than one cell to deal with scheduled and unscheduled downtime of the cell...
You have to pay more for that
Not with GCS, you don't.

https://cloud.google.com/storage/docs/availability-durabilit...

Are you thinking of Persistent Disk (PD) and Replicated PD?

(I work on storage at Google.)

I was thinking of GCP regions, in which case you do have to pay for it. For Colossus cells within a single region you obviously don't, but I don't know enough about how it maps out down there and whether it just moves data around in the event of PCR.