Hacker News
by groodt 1165 days ago
Thanks for publishing this. I quickly skimmed the paper and saw the impressive linear scaling as you scaled to 16 nodes. How long did it take to train the various models in wall-clock time?