by sixhobbits
3387 days ago
A 'batch' is the chunk of training data you load into memory and push through the NN in a single training step. Training even a small language model means going through thousands of batches, so a small time difference per batch adds up to far more than it sounds. I agree a more practical example would have been nice. Maybe it'll come out in the paper.
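
To make the arithmetic concrete, here's a rough sketch in Python. All the numbers (dataset size, batch size, epochs, seconds saved per batch) are made up for illustration and don't come from the linked work, and `iter_batches` / `num_batches` are just hypothetical helper names.

```python
import math


def iter_batches(data, batch_size):
    """Yield consecutive slices of `data`, each at most `batch_size` items long."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


def num_batches(n_examples, batch_size, epochs=1):
    """Number of batches processed over `epochs` full passes through the data."""
    return math.ceil(n_examples / batch_size) * epochs


# Hypothetical numbers, chosen only to show how per-batch savings add up.
n_examples = 100_000        # training examples in the dataset
batch_size = 32             # examples held in memory per training step
epochs = 3                  # full passes over the dataset
saved_per_batch_s = 0.05    # assumed time saved on each batch, in seconds

total_batches = num_batches(n_examples, batch_size, epochs)
saved_total_s = total_batches * saved_per_batch_s

print(f"{total_batches} batches, about {saved_total_s / 60:.1f} minutes saved")
```

With these assumed numbers that's a few thousand batches, so a 50 ms saving per batch turns into minutes over the whole run, and real training runs are usually much longer than this.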