by hagy 1676 days ago
I worked at a company that migrated a 100 PB Hadoop cluster to GCP for assorted reasons despite many years of success with colocation. I wasn't involved in any of this, but the team's decision process makes sense. You can read through their decision making in these blog posts:

* https://liveramp.com/developers/blog/google-cloud-platform-g...
* https://liveramp.com/developers/blog/migrating-a-big-data-en...

One big point was the challenge of maintaining multiple colocation sites, with cross-replication, for disaster recovery. Since Hadoop triple-replicates all data within one DC, running dual DCs requires disk capacity of 6 times the data size. In contrast, cloud object storage pricing includes replication within a region, with availability high enough that storing the data once in cloud storage may be acceptable. Further, you also need double the compute, with one of the DCs always standing by should the other fail.
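
To make the 6x figure concrete, here is a rough back-of-the-envelope sketch. The function name and the 100 PB logical data size are assumptions for illustration only (the "100 PB" above may well refer to raw cluster capacity rather than logical data).

```python
def raw_capacity_pb(logical_pb: float, replicas_per_dc: int, num_dcs: int) -> float:
    """Raw disk needed when every DC holds its own fully replicated copy."""
    return logical_pb * replicas_per_dc * num_dcs


# Hypothetical: 100 PB of logical data, HDFS default of 3 replicas per DC.
logical_pb = 100

single_dc = raw_capacity_pb(logical_pb, replicas_per_dc=3, num_dcs=1)
dual_dc = raw_capacity_pb(logical_pb, replicas_per_dc=3, num_dcs=2)

print(f"single DC: {single_dc} PB raw ({single_dc / logical_pb:.0f}x)")
print(f"dual DC:   {dual_dc} PB raw ({dual_dc / logical_pb:.0f}x)")
```

This only counts disk for the data itself; headroom, temp space, and the duplicated compute are on top of that.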


HDFS supports RS/XOR erasure coding, which gives you the same fault tolerance guarantees as 3x replication at a much lower storage overhead. This is essentially the same method AWS/GCP use under the hood; there’s no magic involved here.
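
A small sketch of the trade-off, assuming the usual model where a scheme with k data units and m extra units survives the loss of any m units. The `Scheme` class and `xor_bytes` helper are made up for illustration and are not part of any Hadoop API; the (k, m) pairs mirror HDFS's built-in RS(6,3), RS(10,4), and XOR(2,1) policies. Note that XOR(2,1) only survives one lost unit, so the "same as 3x" claim really applies to the RS policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    name: str
    data_units: int   # units holding the original data
    extra_units: int  # additional replicas or parity units

    @property
    def storage_factor(self) -> float:
        """Raw bytes stored per byte of user data."""
        return (self.data_units + self.extra_units) / self.data_units

    @property
    def tolerated_losses(self) -> int:
        """Number of units that can be lost without losing data."""
        return self.extra_units


schemes = [
    Scheme("3x replication", 1, 2),
    Scheme("RS(6,3)", 6, 3),
    Scheme("RS(10,4)", 10, 4),
    Scheme("XOR(2,1)", 2, 1),
]

for s in schemes:
    print(f"{s.name:15} {s.storage_factor:.2f}x storage, "
          f"survives {s.tolerated_losses} lost unit(s)")


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


# XOR parity in miniature: lose one data block, rebuild it from the rest.
d0, d1 = b"hello", b"world"
parity = xor_bytes(d0, d1)
recovered_d0 = xor_bytes(parity, d1)
print(recovered_d0 == d0)
```

Reed-Solomon generalizes the XOR trick to multiple parity units using finite-field arithmetic, which is why it can survive more than one loss.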
Can I use this over NFS or samba or is it just for Java and Hadoop?
Yes, you can*. For new projects I would go with Ceph + NFSv4.

*) https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/ha...

Ceph + NFS was actually what I was going to do.

Probably some learning pains, but at least I don't have to "lose" half my storage because the sizes aren't paired with anything.