by sixhobbits 3387 days ago
A 'batch' is how much of the data you put in memory at once while training the NN. To train even a small language model, you'll go through 1000s of batches, so the time difference is way bigger than it sounds. I agree a more practical example would have been nice -- maybe it'll come out in the paper.
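To put rough numbers on that, here is a minimal sketch in plain Python. The dataset size, batch size, and per-batch saving are made-up values for illustration only, and `iter_batches` is a hypothetical helper, not anything from the article.

```python
def iter_batches(examples, batch_size):
    """Yield consecutive slices of `examples`, each at most `batch_size` long."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


# Hypothetical numbers, chosen only for illustration.
examples = list(range(10_000))   # stand-in for 10,000 training examples
batch_size = 32
per_batch_saving_s = 0.05        # assumed saving per batch, not a measured figure

# One pass over the data touches every batch once.
num_batches = sum(1 for _ in iter_batches(examples, batch_size))
total_saving_s = num_batches * per_batch_saving_s

print(f"{num_batches} batches per pass, ~{total_saving_s:.2f}s saved per pass")
```

A small per-batch difference gets multiplied by the number of batches, and again by the number of passes over the data.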

My impression was that this was about the time taken to make each prediction, not to train the model. And yep, looking forward to the paper!
It was based on test-time prediction: given a sentence you have received, how long does it take to compute the prediction with either a bag-of-words model or an LSTM?
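A rough sketch of how that kind of per-sentence timing could be measured, using only the standard library. The two predictors are toy stand-ins I made up (the recurrent one is a simple tanh recurrence, not a real LSTM), and `WEIGHTS` and `mean_latency` are hypothetical names, so this shows the measurement rather than the article's actual setup.

```python
import math
import time

# Hypothetical per-word weights; a real model would learn these.
WEIGHTS = {"good": 1.0, "bad": -1.0}


def bow_predict(tokens):
    # Bag-of-words: an order-independent sum of per-word weights.
    return sum(WEIGHTS.get(t, 0.0) for t in tokens)


def recurrent_predict(tokens):
    # Toy recurrence (not a real LSTM): each step needs the previous
    # hidden state, so the tokens must be processed one after another.
    h = 0.0
    for t in tokens:
        h = math.tanh(0.5 * h + WEIGHTS.get(t, 0.0))
    return h


def mean_latency(predict, tokens, repeats=1000):
    # Average wall-clock seconds per call over `repeats` calls.
    start = time.perf_counter()
    for _ in range(repeats):
        predict(tokens)
    return (time.perf_counter() - start) / repeats


sentence = "the film was good not bad".split()
bow_s = mean_latency(bow_predict, sentence)
rnn_s = mean_latency(recurrent_predict, sentence)
print(f"bag-of-words: {bow_s * 1e6:.2f} us, recurrent: {rnn_s * 1e6:.2f} us")
```

The absolute numbers depend on the machine and say nothing about real models; the point is only that the timing is taken per received sentence at prediction time, with no training involved.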

When you say practical example, do you mean a scenario where you have an API server running, so that costs such as latency, data transfer, and API overhead are taken into account?

Thanks for your feedback!