That claim is wrong. Training even a very basic model like TinyLlama takes 90 days on 16 x A100 GPUs [0]. A p4 box containing 8 x A100 GPUs costs about $32/hr [1].
So 90 days * 2 p4 boxes * 24 hours * $32/hr = $138,240
Larger models run into the millions of dollars.
Though I suppose it all depends on your personal definition of expensive.
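
If you want to plug in your own numbers, the estimate above can be reproduced with a rough sketch like this. The function and variable names are made up for illustration, and it assumes on-demand pricing with no idle time, storage, or failed runs:

```python
import math

# Figures taken from the comment above; names are illustrative only.
GPUS_NEEDED = 16         # TinyLlama training setup [0]
GPUS_PER_BOX = 8         # A100s in one p4 box [1]
TRAINING_DAYS = 90
HOURLY_RATE_USD = 32     # rounded on-demand price per box


def training_cost(gpus, gpus_per_box, days, hourly_rate):
    """Rental cost in USD, assuming every box runs 24 hours a day."""
    boxes = math.ceil(gpus / gpus_per_box)  # you can only rent whole boxes
    return boxes * days * 24 * hourly_rate


cost = training_cost(GPUS_NEEDED, GPUS_PER_BOX, TRAINING_DAYS, HOURLY_RATE_USD)
print(f"${cost:,}")  # $138,240
```

Spot or reserved pricing would lower the total, but not by an order of magnitude.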
[0] https://github.com/jzhang38/TinyLlama
[1] https://aws.amazon.com/ec2/instance-types/p4/