Hacker News new | ask | show | jobs
by obulpathi 3214 days ago
Adding more context. Sorry for missing it out in first place. I mostly work in Big Data Space. Google Clouds Big Data stuff is built for Streaming / Storing / Processing / Querying / Machine Learning at Internet Scale data (PubSub / Bigtable / Dataflow / BigQuery / Cloud ML). AWS scales to terabyte level loads. But, beyond that, its hard and super costly. Google's services autoscale to Petabyte levels / millions of users smoothly (for example BigQuery / Load Balancers). On AWS, it requires pre warming / allocating capacity beforehand and that costs tons of money. In companies working at that scale, that usual saying is "to keep scaling, keep throwing cash at AWS". This is not a problem with Google.
2 comments

You need petabyte scale for the NYT Crossword?
Quoting from the article: "This accomplishment would not have been possible for our three-person team of engineers to achieve without the tools and abstractions provided by Google and App Engine."

Talking about the use case from the article, they release the puzzle at 10 and need to have infra ready to serve up all the requests. On AWS, you need to pre warm load balancers, increase the quota of your Dynamo DB, scale up instances so that they can withstand the wall of traffic, ... and then scale down after the traffic. All this takes time, people and money. Adding few other things author mentioned: Monitoring/Alerting, Local Development, Combined Access and App Logging ... will take focus from developing great apps to building out the infrastructure for apps.

I concur.

Currently, I am working on projects that use both Amazon and Google clouds.

In my experience, AWS requires more planning and administration to handle the full workflow: uploading data; organisation in S3; partitioning data sets; compute loads (EMR-bound vs. Redshift-bound vs. Spark (SQL) bound); establishing and monitoring quotas; cost attribution to different internal profit centres; etc.

GCP is - in a few small ways - less fussy to deal with.

Also, GCP console - itself not very great - is much easier to use and operate than AWS console.

Of course, YMMV!

Could you please post the URL for the resource and the number of hits it receives? I'm interested in high load websites and I have a hard time picturing how this could lead to petabytes.
The impression I'm getting is not that GCP scales better, but it scales with less fuss - the anecdotes here all suggest that with AWS, once you hit any meaningful load (i.e. gigabytes), you need to start fiddling with stuff.

I don't know if this is actually true, I've never done any serious work in AWS.

Hold on, please do not say google cloud scales well, yes they do have services that make a ton of claims, but unlike AWS, things don't work as promised which is magnified by the fact that their support is way worse.

Additionally, Big Query is far more expensive than Athena, where you have to pay a huge premium on storage.

The biggest difference is that what amazon provides you in infrastructure, where as google provides you a platform. While app engine is certainly easier to use than elastic bean stalk, you have very little control over what is done in the background once you let google do its thing.

GCP support can sometimes be bad but these other claims don't add up. What isn't working as promised? BigQuery can do a lot more than Athena and it's storage pricing is the same or cheaper than S3.

We've used 5 different providers for a global system and GCP has won by both performance and price. We still use Azure and AWS for some missing functionality but the core services are much more solid and easy to deal with on GCP, which is also far more than just app engine.