by highfrequency 633 days ago
Looks like they are using sixteen $13k GPUs [1] (around $210k hardware) for 6 days of training.

Anyone know the recommended cloud provider and equivalent rental price?

[1] https://www.wiredzone.com/shop/product/10025451-supermicro-g...


MI250s definitely aren’t a common card to rent, so I could only find Runpod at $2.10 per hour each. For sixteen cards over the 6 days of training, that works out to $4838, plus $3225 for fine-tuning. However, this doesn’t include the 11TB of storage or the time taken to get the setup actually running the tasks. So you likely wouldn’t see much change from $10k USD, if any.

- https://www.runpod.io/gpu/mi250
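
For anyone checking the arithmetic, here is a rough sketch of how those rental figures fall out. The 16 GPUs, 6 training days and $2.10/hr rate come from the thread; the 4-day fine-tuning duration is an assumption back-derived from the $3225 figure, and the function name is just illustrative.

```python
# Rough rental-cost estimate for the numbers quoted in the comment above.
GPUS = 16                 # from the parent comment
RATE_PER_GPU_HOUR = 2.10  # Runpod MI250 price quoted above, in USD
TRAIN_DAYS = 6            # from the parent comment
FINETUNE_DAYS = 4         # ASSUMPTION: inferred from the $3225 figure


def rental_cost(gpus: int, rate_per_gpu_hour: float, days: float) -> float:
    """Cost in USD of renting `gpus` cards around the clock for `days` days."""
    return gpus * rate_per_gpu_hour * days * 24


train = rental_cost(GPUS, RATE_PER_GPU_HOUR, TRAIN_DAYS)
finetune = rental_cost(GPUS, RATE_PER_GPU_HOUR, FINETUNE_DAYS)
total = train + finetune

print(f"training:    ${train:,.0f}")
print(f"fine-tuning: ${finetune:,.0f}")
print(f"total:       ${total:,.0f} (before storage and setup time)")
```

That puts GPU time alone at roughly $8k, which is why storage and setup overhead push the total toward $10k.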

Runpod.io rents the next-gen MI300Xs for $4/hr, although since they also rent H100s for $3/hr (which are easier to work with and faster for training), it might be more of a novelty.
I thought the whole selling point of AMD GPUs was that they were a lot cheaper than Nvidia GPUs?
Cheaper for the cloud company. But that doesn’t always translate to cheaper for the end user. Maybe they cost more to run, or maybe there are fewer of them so they’re more expensive to book?
At least a couple of years ago, a big advantage of Nvidia cards was how much cheaper they were to run power-wise; often the dies that made it into cloud-level cards would be binned consumer dies.

Not sure if that’s still the case, but I’d say it’s plausible.

Impossible. Power costs for H100-like cards are dwarfed by the cost of the cards themselves. H100 at full load will consume ~$3500 (rough estimate) of power in 5 years at $0.12/kWh.
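
A quick sketch of that power estimate, for reference. The 5-year span and $0.12/kWh come from the comment above; the 700 W draw is an assumption (the rated maximum for the SXM variant of the H100), and cooling overhead is ignored.

```python
# Back-of-the-envelope electricity cost for one card at constant full load.
WATTS = 700           # ASSUMPTION: H100 SXM rated max power; PCIe cards draw less
YEARS = 5             # from the comment above
PRICE_PER_KWH = 0.12  # from the comment above, in USD


def energy_cost(watts: float, years: float, price_per_kwh: float) -> float:
    """Electricity cost in USD for a constant draw of `watts` over `years`."""
    hours = years * 365 * 24
    kwh = watts / 1000 * hours
    return kwh * price_per_kwh


cost = energy_cost(WATTS, YEARS, PRICE_PER_KWH)
print(f"5-year power cost per card: ${cost:,.0f}")
```

Under those assumptions the result lands in the same ballpark as the ~$3500 estimate, well below the purchase price of the card.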
Data centers are more constrained by availability of power, and the matching cooling, than the actual bulk cost of it.

For example, I've been in situations where we had to deploy fewer hard drives per server unit than we otherwise would have, just because we knew we couldn't power & cool the racks if we fully stocked them.

Hot Aisle seems to be the (only?) place to rent AMD. (Ryan, please don't spam this thread. It's not a good look.)