by emef 1129 days ago
The model weights (the thing being updated by the training process) stay loaded in GPU memory during training (the slow part). This could be useful for serializing the model weights to disk when checkpointing or when training completes, but that's a drop in the bucket compared to the rest of the time spent training.
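
To make that concrete, here's a rough toy sketch, not anyone's actual training code. The `train` function, the plain Python list standing in for GPU-resident weights, and the `checkpoint_every` parameter are all made up for illustration. The point is only that disk serialization happens a handful of times while the update step runs on every iteration.

```python
import os
import pickle
import tempfile


def train(num_steps, checkpoint_every, checkpoint_dir):
    """Toy loop: weights stay in memory; disk is touched only at checkpoints."""
    # Hypothetical stand-in for parameters that would live in GPU memory.
    weights = [0.0] * 4
    checkpoints_written = 0
    for step in range(1, num_steps + 1):
        # Stand-in for the expensive forward/backward pass and weight update.
        weights = [w + 0.01 for w in weights]
        # Serialize only every `checkpoint_every` steps and at the very end.
        if step % checkpoint_every == 0 or step == num_steps:
            path = os.path.join(checkpoint_dir, f"step_{step}.pkl")
            with open(path, "wb") as f:
                pickle.dump(weights, f)
            checkpoints_written += 1
    return weights, checkpoints_written


with tempfile.TemporaryDirectory() as tmp:
    final_weights, n_ckpt = train(
        num_steps=1000, checkpoint_every=250, checkpoint_dir=tmp
    )
    saved = sorted(os.listdir(tmp))
```

With these made-up numbers the loop does 1000 updates but writes only 4 files, so a faster serializer would only speed up that small fraction of the run.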

I meant it more for the image data