Hacker News
by bitL 2645 days ago
Inference isn't that slow on CPU, especially for network requests that already carry quite a bit of latency, so plenty of companies use CPUs in the cloud for lambda/flexible loads where GPUs aren't available.