by chickenpotpie
2438 days ago
I've actually been working on a library to help mitigate cloud storage lock-in. The idea is to treat cloud storage providers the way RAID treats disks. For example, say you have 3 separate cloud providers. Providers 1 and 2 have alternating bytes of the data striped across them, and provider 3 holds parity data. To pull a file you only need 2 of the 3 providers. If you don't like how a provider is treating you or what it's charging you, just pull from the other 2 and keep it only as a backup in case one of them goes down. You can also remove it from the equation entirely, but then you have no redundancy if one of the others goes down. It gives you a lot of negotiating power to lower egress costs, because you can drop a provider at any time and reinstate it once you get better pricing.
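To make the scheme concrete, here is a minimal sketch of byte-level striping with XOR parity across three stores. It is not the library's actual code: the function names `stripe` and `recover` are hypothetical, the "providers" are just in-memory byte strings, and zero-padding the shorter stripe is an assumption made so the parity lines up.

```python
def stripe(data: bytes) -> tuple[bytes, bytes, bytes, int]:
    """Split data into two byte-interleaved stripes plus an XOR parity stripe.

    Hypothetical helper: stripe `a` would go to provider 1, `b` to provider 2,
    and `parity` to provider 3. The original length is returned so padding
    can be trimmed on recovery.
    """
    a = data[0::2]                      # even-indexed bytes
    b = data[1::2].ljust(len(a), b"\x00")  # odd-indexed bytes, zero-padded
    parity = bytes(x ^ y for x, y in zip(a, b))
    return a, b, parity, len(data)


def recover(a, b, parity, length: int) -> bytes:
    """Rebuild the original bytes from any two of the three stripes.

    Pass None for the one stripe that is unavailable (at most one).
    """
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, parity))
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, parity))
    out = bytearray(len(a) + len(b))
    out[0::2] = a                       # re-interleave the two data stripes
    out[1::2] = b
    return bytes(out[:length])          # drop any padding byte


a, b, parity, n = stripe(b"hello world")
print(recover(None, b, parity, n))      # provider 1 dropped
print(recover(a, None, parity, n))      # provider 2 dropped
print(recover(a, b, None, n))           # parity provider dropped
```

Each call rebuilds the same bytes, which is the "any 2 of 3" property described above. Losing two stripes at once is not recoverable in this sketch.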
I assume you didn't mean byte-level striping literally, because I can't see how that would ever work out in terms of CPU cost. I think breaking the data up into blocks, the way RAID 4/5/6 do, would be better, but it will still hurt read performance.
Write performance is going to be worse. Not because of the parity calculation, but because every write takes the max latency over all the cloud providers.
I can't see people trading off that much performance for better fault tolerance (in a world where S3 advertises 11 nines of durability) or ease of switching.
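The write-latency point can be shown with a small worked example. This is only a sketch: the provider names and latency figures are made up for illustration, and it assumes the three uploads are issued in parallel and the write counts as done only when all of them finish.

```python
# Hypothetical per-provider write latencies in milliseconds (illustrative only).
latencies_ms = {"provider_1": 40, "provider_2": 55, "provider_3": 120}


def striped_write_latency(latencies: dict[str, int]) -> int:
    """A parallel fan-out write completes when the slowest provider finishes."""
    return max(latencies.values())


print(striped_write_latency(latencies_ms))  # 120
```

With these numbers, writing to provider_1 alone would take 40 ms, while the striped write takes 120 ms, so the slowest provider sets the pace.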