Y
Hacker News
new
|
ask
|
show
|
jobs
by
calaphos
414 days ago
There's still a throughput/latency tradeoff curve, at least for any sort of interactive models.
One of the reasons why inference providers sell batch discounts.