by sabareesh 539 days ago
Roughly, it takes about 7 days to train a 500M-parameter model on 100B tokens.
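
For a rough sense of what those numbers imply, here is a back-of-the-envelope sketch. The throughput figure follows directly from the comment (100B tokens in 7 days, about 165k tokens/s). The compute figure is an assumption layered on top: it uses the common rule of thumb of roughly 6 FLOPs per parameter per token for training, which the comment does not mention, and it says nothing about the hardware, which is not stated. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the comment.
TOKENS = 100e9   # 100B training tokens (from the comment)
PARAMS = 500e6   # 500M parameters (from the comment)
DAYS = 7         # approximate wall-clock time (from the comment)

SECONDS = DAYS * 24 * 60 * 60


def tokens_per_second(tokens: float, seconds: float) -> float:
    """Average throughput needed to finish in the given time."""
    return tokens / seconds


def training_flops(params: float, tokens: float) -> float:
    """Total training compute.

    ASSUMPTION: ~6 FLOPs per parameter per token (forward + backward),
    a common approximation, not something stated in the comment.
    """
    return 6 * params * tokens


tps = tokens_per_second(TOKENS, SECONDS)
total = training_flops(PARAMS, TOKENS)
sustained = total / SECONDS  # average FLOP/s over the whole run

print(f"throughput:      {tps:,.0f} tokens/s")
print(f"total compute:   {total:.2e} FLOPs (assumes 6*N*D)")
print(f"sustained rate:  {sustained / 1e12:,.0f} TFLOP/s (assumes 6*N*D)")
```

Under that assumption the run works out to about 3e20 FLOPs total, or a bit under 500 TFLOP/s sustained. Treat it as an order-of-magnitude estimate only.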