Hacker News new | ask | show | jobs
by tree_of_item 2656 days ago
> Their model used 256 of Google's Cloud TPU v3, though I've not seen training durations. The TPU v3 is only available individually outside of @Google (though @OpenAI likely got special dispensation) which means you'd be paying $8 * 256 = $2048 per hour.

https://twitter.com/Smerity/status/1096189352743301120

> Thanks. So then it was 32 TPUv3s, to be more precise, and sticker-price training costs would then be per Smerity 32 * 24 * 7 * 8 = $43k?

https://www.reddit.com/r/MachineLearning/comments/aqlzde/r_o...

1 comments

It's likely even more since they needed to perform a hyperparameter tuning. Multiply it by 10 or 100 to get a more realistic estimate.