by dawatchusay 854 days ago
I believe that in the article, Huang claims it’s not that expensive.

That claim is wrong. Training even a very basic model like TinyLlama takes 90 days on 16 A100 GPUs [0]. A p4 box containing 8 A100 GPUs costs about $32/hr [1].

So 90 days * 2 p4 boxes * 24 hours * $32/hr = $138,240
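
For anyone who wants to plug in their own numbers, here is the same estimate as a small Python sketch. The function and variable names are made up for illustration, and it assumes on-demand pricing with whole boxes rented for the entire run (no spot or reserved discounts, no storage or networking costs).

```python
# Figures come from the comment above; names are illustrative only.
TRAINING_DAYS = 90
GPUS_NEEDED = 16
GPUS_PER_BOX = 8             # one p4 box holds 8 A100 GPUs
PRICE_PER_BOX_HOUR_USD = 32  # rounded on-demand price used in the comment


def training_cost_usd(days, gpus, gpus_per_box, price_per_box_hour):
    """Rough rental cost: whole boxes, running 24 hours a day for the full run."""
    boxes = -(-gpus // gpus_per_box)  # ceiling division: partial boxes can't be rented
    return days * 24 * boxes * price_per_box_hour


total = training_cost_usd(
    TRAINING_DAYS, GPUS_NEEDED, GPUS_PER_BOX, PRICE_PER_BOX_HOUR_USD
)
print(f"${total:,}")  # prints $138,240
```

Changing the day count, GPU count, or hourly price shows how quickly the total scales for larger models.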

Larger models run into the millions of dollars.

Though I suppose it all depends on your personal definition of expensive.

[0] https://github.com/jzhang38/TinyLlama
[1] https://aws.amazon.com/ec2/instance-types/p4/

Millions of dollars is a rounding error in almost any country's budget.