by sophiebits 462 days ago
Batched often means a higher-latency option (minutes or hours instead of seconds), which lets providers schedule the work more efficiently on their GPUs.
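
To make that concrete, here's a toy sketch of the idea. It is not any provider's real scheduler; `schedule`, the slot model, and the numbers are all made up for illustration. Latency-sensitive requests have to run in the slot they arrive in, while batch work can wait and soak up whatever capacity is left over later.

```python
def schedule(interactive_load, capacity, batch_work):
    """Fill spare capacity in each time slot with deferred batch work.

    interactive_load: GPU units needed per slot by latency-sensitive requests
    capacity: total GPU units available in every slot (hypothetical fixed fleet)
    batch_work: total GPU units of batch work that may run in any slot
    Returns (batch units placed per slot, batch units still unscheduled).
    """
    placed = []
    remaining = batch_work
    for load in interactive_load:
        # Interactive traffic is served first; batch only gets what is idle.
        spare = max(capacity - load, 0)
        used = min(spare, remaining)
        placed.append(used)
        remaining -= used
    return placed, remaining


# Two busy slots followed by two quiet ones.
placed, leftover = schedule([9, 10, 4, 2], capacity=10, batch_work=12)
print(placed, leftover)
```

The batch work mostly lands in the quiet slots, so the same fleet does more total work without being sized for the peak.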