by SJC_Hacker
445 days ago
> They used a giant bunch of [data], a year and a half of GPU time to [train] the final model,

> [train]: "The training runs on 64 A100 GPUs over nine days", that would be around $18k on lambda labs in case you're wondering

How is that a "year and half of GPU time". Maybe on some exoplanet?
|
> How is that a "year and half of GPU time".
64 GPUs × 9 days = 576 GPU-days ≈ 1.577 GPU-years
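
If anyone wants to check the arithmetic, here is a quick sketch in Python. The 365.25-day year is my assumption (it is what gives 1.577 rather than 1.578), and the per-GPU-hour rate is only what the quoted ~$18k figure would imply, not an actual Lambda Labs price. The function and variable names are made up for illustration.

```python
# Assumption: a Julian year of 365.25 days; a 365-day year gives ~1.578 instead.
DAYS_PER_YEAR = 365.25


def gpu_time(num_gpus: int, days: float) -> tuple[float, float]:
    """Return total (GPU-days, GPU-years) for num_gpus running in parallel."""
    gpu_days = num_gpus * days
    return gpu_days, gpu_days / DAYS_PER_YEAR


# Figures quoted in the thread: 64 A100 GPUs over nine days.
gpu_days, gpu_years = gpu_time(64, 9)
gpu_hours = gpu_days * 24

# Hourly rate implied by the ~$18k estimate quoted above (derived, not a real price).
implied_rate = 18_000 / gpu_hours

print(f"{gpu_days} GPU-days = {gpu_years:.3f} GPU-years")
print(f"{gpu_hours} GPU-hours, implied rate ${implied_rate:.2f} per GPU-hour")
```

So "GPU time" counts each GPU separately: nine days of wall-clock time across 64 GPUs adds up to roughly a year and a half of single-GPU time.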