|
|
|
|
|
by hagy
1676 days ago
|
|
I worked at a company that migrated a 100 PB Hadoop cluster to GCP for assorted reasons despite many years of success with colocation. I wasn't involved in any of this, but the team's decision process makes sense. You can read through their decision making in these blog posts: * https://liveramp.com/developers/blog/google-cloud-platform-g...
* https://liveramp.com/developers/blog/migrating-a-big-data-en... One big point was challenges of maintaining multiple colocation sites, with cross replication, for disaster recovery. Since Hadoop triple replicates all data within one DC, this requires 6 times the disk storage capacity of data size for dual DCs. In contrast, cloud object storage pricing includes replication within a region with very high availability such that storing once in cloud storage may be acceptable. Further, you also need double the compute, with one of the DCs always standing by should the other fail. |
|