Hacker News
by bitL 2645 days ago
Inference is not that slow on CPU, especially for network requests that already carry quite a bit of latency, so plenty of companies run it on CPUs in the cloud for lambda/flexible loads where GPUs aren't available.