Hacker News
by marcinzm 1300 days ago
In my experience there are differences between clouds: all have the same basic problem, but in practice some are better than others. I've never had issues getting GPUs on AWS, but GCP constantly has issues with GPU/TPU capacity.
4 comments

Is this region-dependent? In us-east I can’t get them to approve a quota for GPU instance families (G, P) for anything more than 4 CPUs. At one point they rejected my request citing “unprecedented demand”. Of course, this is small-time, just my personal account.

It is true I can get an instance most of the time, but not if I need >16GiB GPU memory.

We've been having the same problem getting GPU instances in us-east. There have been multiple week-long delays just to escalate and talk to the next person up who can make a decision. It's a mess.
There are probably different occurrence rates. We had to modify how our test suite provisions instances, since we used to regularly run into instance availability constraints on EC2 during the holidays.
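For anyone wondering what that kind of provisioning change can look like, here is a minimal sketch of one common approach: try a list of instance type and zone combinations in order and fall through on capacity errors. This is not the commenter's actual setup. `provision_with_fallback`, `CapacityError`, and `fake_launch` are hypothetical names made up for illustration; on EC2 the real signal would be the `InsufficientInstanceCapacity` error code from the launch call.

```python
from typing import Callable, Iterable, Optional, Tuple


class CapacityError(Exception):
    """Hypothetical stand-in for a provider's 'insufficient capacity' error."""


def provision_with_fallback(
    launch: Callable[[str, str], str],
    candidates: Iterable[Tuple[str, str]],
) -> Optional[str]:
    """Try each (instance_type, zone) pair in order.

    Returns the first instance ID that launches, or None if every
    candidate fails with CapacityError.
    """
    for instance_type, zone in candidates:
        try:
            return launch(instance_type, zone)
        except CapacityError:
            continue  # no capacity here, try the next candidate
    return None


# Fake launcher for illustration: only one combination has capacity.
def fake_launch(instance_type: str, zone: str) -> str:
    if (instance_type, zone) != ("g4dn.xlarge", "us-east-1b"):
        raise CapacityError(f"no {instance_type} capacity in {zone}")
    return f"i-fake-{instance_type}-{zone}"


if __name__ == "__main__":
    candidates = [
        ("p3.2xlarge", "us-east-1a"),
        ("g4dn.xlarge", "us-east-1a"),
        ("g4dn.xlarge", "us-east-1b"),
    ]
    print(provision_with_fallback(fake_launch, candidates))
```

Passing the launcher in as a callable keeps the fallback logic testable without touching a real cloud account; a real version would wrap the provider's SDK call and translate its capacity error into `CapacityError`.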
I’ve occasionally seen some of the internal AWS capacity management dashboards, and some resource types are frequently operating very close to 100%.
I worked on a project about a year ago where we would have a colleague in a different time zone start instances with 4 GPUs, because they were almost always unavailable during regular work hours in us-east.