by TechTechTech
1025 days ago
At the time of writing, the cost estimate for a 70B multimodal model trained on 7T tokens with 1000 H100 GPUs is $18,461,354, with 184 days of training time. Is anyone willing to share an estimate of how the cost will come down each year as hardware keeps improving and new methodologies are possibly found? Personally, I would not be surprised if it were possible to train on the same dataset for half the cost 12 months from now.
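For anyone who wants to play with the numbers, here is a rough sketch of the arithmetic behind the figures above. It backs out the GPU-hour rate implied by the quoted estimate and projects the cost forward under a fixed yearly reduction. The function names are made up for illustration, the 0.5 yearly factor is only my "half the cost in 12 months" guess and not a measured trend, and the 184 days is probably rounded, so treat the rate as approximate.

```python
def implied_gpu_hour_rate(total_cost_usd, num_gpus, training_days):
    """Back out the per-GPU-hour price implied by a total cost estimate."""
    gpu_hours = num_gpus * training_days * 24
    return total_cost_usd / gpu_hours


def projected_cost(total_cost_usd, years, annual_factor=0.5):
    """Project the cost forward assuming it shrinks by a fixed factor each year.

    annual_factor=0.5 is an assumption (cost halves every 12 months),
    not an established rate.
    """
    return total_cost_usd * annual_factor ** years


if __name__ == "__main__":
    cost = 18_461_354  # USD, from the estimate quoted above
    rate = implied_gpu_hour_rate(cost, num_gpus=1000, training_days=184)
    print(f"Implied rate: ${rate:.2f} per GPU-hour")
    for year in range(4):
        print(f"Year {year}: ${projected_cost(cost, year):,.0f}")
```

With the quoted inputs, the implied rate works out to a little over $4 per H100-hour. Changing `annual_factor` makes it easy to compare a more conservative assumption (say 0.7) against the halving guess.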