Hacker News
by Chamix 948 days ago
The little secret is that the training run (meaning, creating the raw autocompleting multimodal token weights) for 5 ran in parallel with 4.