Hacker News new | ask | show | jobs
by ss7pro 2642 days ago
Have a look here: https://github.com/IntelAI/OpenVINO-model-server/blob/master... You can replace tf-serving with OpenVINO to get even better performance and latency when running on CPU
1 comments

What useful models run at decent speed on a CPU these days?

Even basic image classifiers tend to be 100x faster on a GPU or TPU...

Inference is not that super slow on CPU, especially for network requests that already have quite a bit of latency, so plenty of companies use CPUs on the cloud for lambda/flexible loads where GPUs aren't available.