Hacker News
by born-jre, 114 days ago
I think this matters more at lower batch sizes (local LLMs and private enterprise deployments, where there won't be enough users at any given time to fill a big batch), since it takes you from being memory-I/O bottlenecked to being compute bottlenecked.
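To put rough numbers on the batch-size point, here is a back-of-the-envelope roofline sketch. Everything in it is an assumption for illustration: the function names are made up, the hardware figures (100 TFLOP/s peak, 1 TB/s memory bandwidth) are placeholders rather than any real device, and the model is simplified to "read every weight once per decode step, 2 FLOPs per parameter per token", ignoring KV cache and activation traffic.

```python
# Rough roofline estimate for one decode step of a dense model.
# All hardware numbers are hypothetical placeholders, not a real GPU.

def decode_step_times(n_params, batch_size, bytes_per_param=2.0,
                      peak_flops=100e12, mem_bandwidth=1e12):
    """Return (compute_seconds, memory_seconds) for one decode step.

    Assumes weights are read from memory once per step regardless of
    batch size, and each token costs about 2 FLOPs per parameter.
    """
    flops = 2.0 * n_params * batch_size
    bytes_moved = n_params * bytes_per_param
    return flops / peak_flops, bytes_moved / mem_bandwidth


def bottleneck(n_params, batch_size, **hw):
    """Name whichever side of the roofline takes longer."""
    compute_s, memory_s = decode_step_times(n_params, batch_size, **hw)
    return "compute" if compute_s > memory_s else "memory"


def crossover_batch(bytes_per_param=2.0, peak_flops=100e12,
                    mem_bandwidth=1e12):
    """Batch size at which compute time equals weight-read time."""
    return peak_flops * bytes_per_param / (2.0 * mem_bandwidth)


if __name__ == "__main__":
    n = 7e9  # hypothetical 7B-parameter model
    for b in (1, 8, 64, 256):
        print(b, bottleneck(n, b))
    print("crossover batch:", crossover_batch())
```

With these placeholder numbers a single-user batch sits far on the memory-bound side, which is why anything that trades spare compute for fewer weight reads would help small deployments more than large batched serving.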