Hacker News
by
eurekin
39 days ago
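Batching lowers that, since the model weights are read from memory once per batch rather than once per request. Activation accumulation doesn't scale as nicely, because it grows with the number of requests in the batch.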
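
To put rough numbers on it, here is a toy cost model, not a measurement. It assumes one decode step reads the weights once for the whole batch and reads some per-request state (activations, KV cache) once per request. The function name and both byte counts are made up for illustration.

```python
def bytes_per_token(weight_bytes: float, per_request_bytes: float, batch_size: int) -> float:
    """Approximate memory traffic per generated token in one decode step (toy model)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Weights are read once per step and shared by every request in the batch,
    # so their cost per token shrinks as the batch grows.
    shared = weight_bytes / batch_size
    # Per-request state is read for each request, so it does not amortize.
    return shared + per_request_bytes


# Hypothetical figures: 14 GB of weights, 0.5 GB of per-request state.
WEIGHT_BYTES = 14e9
PER_REQUEST_BYTES = 0.5e9

for batch in (1, 8, 64):
    gb = bytes_per_token(WEIGHT_BYTES, PER_REQUEST_BYTES, batch) / 1e9
    print(f"batch={batch:>2}: {gb:.2f} GB read per token")
```

Under these assumptions the weight term keeps shrinking with batch size, while the per-request term sets a floor that batching can't get below.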