Hacker News
by Chamix, 948 days ago
The little secret is that the training run (meaning, creating the raw autocompleting multimodal token weights) for 5 ran in parallel with 4.