Hacker News
by
layer8
534 days ago
How long does training a 1B or 500M model take approximately on the 4-GPU setup? Or does that dramatically depend on the training data? I didn’t see that info on your pages.
1 comment
sabareesh
534 days ago
It takes roughly 7 days to train a 500M model on 100B tokens.
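For a rough sense of what that figure implies, here is a back-of-the-envelope sketch in Python. It assumes the 7 days refers to the 4-GPU setup mentioned in the question, and it uses the common approximation of 6 FLOPs per parameter per token for training compute. That rule of thumb and the function names are my own assumptions, not something stated in the thread, so treat the results as order-of-magnitude estimates only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_second(total_tokens: float, days: float) -> float:
    """Average training throughput implied by a token count and a wall-clock time."""
    return total_tokens / (days * SECONDS_PER_DAY)


def implied_flops_per_second(params: float, total_tokens: float, days: float) -> float:
    """Sustained compute implied by the run.

    Assumption: training costs about 6 FLOPs per parameter per token
    (a widely used rule of thumb, not a figure from the thread).
    """
    total_flops = 6 * params * total_tokens
    return total_flops / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    params = 500e6   # 500M-parameter model (from the reply)
    tokens = 100e9   # 100B training tokens (from the reply)
    days = 7         # quoted wall-clock time (from the reply)
    gpus = 4         # assumption: the 4-GPU setup from the question

    tps = tokens_per_second(tokens, days)
    fps = implied_flops_per_second(params, tokens, days)

    print(f"throughput: {tps:,.0f} tokens/s total, {tps / gpus:,.0f} per GPU")
    print(f"compute:    {fps:.2e} FLOP/s total, {fps / gpus:.2e} per GPU")
```

Under these assumptions the numbers work out to roughly 165k tokens/s across the machine (about 41k per GPU) and on the order of 5e14 FLOP/s sustained in aggregate. How a 1B model would compare is not stated in the reply, so this sketch does not try to extrapolate to it.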