| Disclaimer: I work for Google Cloud.
> "Google probably has the best networking technology on the planet." How do we quantify this?
In the article they did a bunch of tests. Quote: GCP does roughly 7x better for the comparison of 4-core machines, but for the largest machine sizes networking performance is roughly equivalent. There is also https://github.com/GoogleCloudPlatform/PerfKitBenchmarker if you want to benchmark things yourself. Seriously, try it yourself. I think you will be pleasantly surprised.
> I would much rather create a service that can tolerate single node outages than relying on "live migrations".
Services should tolerate node failure even on GCP; live migration does not really help with that. It's more about reducing ops. With AWS, you have to manually reboot your machines when an infra upgrade happens. With GCP it is automatic.
> I am not sure what he meant by the SSD comparison, Amazon EBS that can be SSD but still it is a network mounted storage.
I'm not too sure what your question is?
> Discarding Azure was purely arbitrary
Agreed, would love to know more about why they didn't consider Azure. |
1. AWS explicitly tells you up front that smaller instance sizes come with lower network throughput. This is well known and well communicated, even when you browse the instance offerings. Doing 7x better on a 4-core instance is hardly relevant (depending on the actual CPU type, though): saturating your pipe would probably consume much of your CPU time, and you could hardly do anything else on the box. You can prove me wrong on this one. Synthetic benchmarks are not really relevant for production use cases.
A good read on the subject: http://www.brendangregg.com/activebenchmarking.html
2. On reducing ops. You are implying that these ops tasks are not automated. You should ask your SRE co-workers about this one. To run a website at this scale, you absolutely need to automate what happens when a server is rebooted. That means that on shutdown it needs to remove itself from the load balancer or from the resource pool, and when it comes back it has to add itself again. In the worst case you can just terminate the instance and let auto-scaling do its job. In most cases all of this requires no human attention, but I do understand that some smaller customers are not so advanced with automation, and GCP might be optimizing for those clients.
3. I do not have a question. I was pointing out that the author of the article is talking about EBS, while it might appear to the reader that he is talking about some sort of local SSD.
4. Great! I would like to know it too! We should petition together. :)
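To make point 2 a bit more concrete, here is a minimal sketch of the reboot automation I mean. Everything in it is hypothetical: `LoadBalancerPool` is an in-memory stand-in rather than any real AWS or GCP API, and `Instance`, `on_startup` and `on_shutdown` are names made up for illustration. In practice these hooks would be wired into the init system and would call whatever load balancer API you actually use.

```python
class LoadBalancerPool:
    """Hypothetical in-memory stand-in for a load balancer's backend pool."""

    def __init__(self):
        self._backends = set()

    def register(self, instance_id):
        self._backends.add(instance_id)

    def deregister(self, instance_id):
        # discard() makes deregistration idempotent: removing an
        # instance that is not in the pool is not an error.
        self._backends.discard(instance_id)

    def backends(self):
        return sorted(self._backends)


class Instance:
    """Hypothetical server that manages its own pool membership."""

    def __init__(self, instance_id, pool):
        self.instance_id = instance_id
        self.pool = pool

    def on_startup(self):
        # Runs when the machine comes back: start taking traffic again.
        self.pool.register(self.instance_id)

    def on_shutdown(self):
        # Runs before the machine goes down: stop taking traffic first.
        self.pool.deregister(self.instance_id)

    def reboot(self):
        self.on_shutdown()
        # ... the actual restart would happen here ...
        self.on_startup()


if __name__ == "__main__":
    pool = LoadBalancerPool()
    web1 = Instance("web-1", pool)
    web2 = Instance("web-2", pool)
    web1.on_startup()
    web2.on_startup()
    web1.on_shutdown()      # web-1 is being rebooted for maintenance
    print(pool.backends())  # only web-2 is serving
    web1.on_startup()       # web-1 is back
    print(pool.backends())
```

The point is only that the instance handles its own membership, so a provider-initiated reboot does not need a human in the loop.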