by 37ef_ced3
1994 days ago
AVX-512 neural net inference on inexpensive, CPU-only cloud compute instances. GPU cloud compute is almost unbelievably expensive: even Linode charges $1000 per month, or $1.50 per hour (see the GPU plans: https://www.linode.com/pricing/#row--compute). An AVX-512 Skylake-X cloud compute instance costs $10 per CPU core per month at Vultr (https://www.vultr.com/products/cloud-compute/), and you can do about 18 DenseNet121 inferences per CPU core per second (in series, not batched) using tools like https://NN-512.com. As AVX-512 becomes better supported by Intel and AMD chips, it becomes more attractive as an alternative to expensive GPU instances for workloads with small amounts of inference mixed with other computation.
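To put rough numbers on that comparison, here is a back-of-envelope sketch in Python using only the prices and throughput quoted above. The 30-day month and 100% utilization are my assumptions, and the function names are made up for illustration; no GPU throughput is assumed, only the break-even rate a GPU instance would need to reach.

```python
# Back-of-envelope cost comparison using the figures quoted in the comment.
# Assumptions (not from the original): a 30-day month and full utilization.

SECONDS_PER_MONTH = 30 * 24 * 3600


def cost_per_million(monthly_price_usd: float, inferences_per_second: float) -> float:
    """USD per one million inferences at a sustained inference rate."""
    inferences_per_month = inferences_per_second * SECONDS_PER_MONTH
    return monthly_price_usd / inferences_per_month * 1_000_000


def break_even_throughput(gpu_monthly_usd: float,
                          cpu_core_monthly_usd: float,
                          cpu_inferences_per_second: float) -> float:
    """Inferences per second a GPU instance must sustain to match
    the per-inference cost of one CPU core."""
    return gpu_monthly_usd / cpu_core_monthly_usd * cpu_inferences_per_second


# $10 per core per month, about 18 DenseNet121 inferences per core per second.
cpu_cost = cost_per_million(10.0, 18.0)

# $1000 per month for the GPU instance.
gpu_break_even = break_even_throughput(1000.0, 10.0, 18.0)

print(f"CPU core: ${cpu_cost:.3f} per million inferences")
print(f"GPU break-even: {gpu_break_even:.0f} inferences per second")
```

Under those assumptions the GPU instance only wins on cost per inference if it is kept busy at the break-even rate or higher, which is the point about workloads with small amounts of inference mixed with other computation.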
|
Or is it the case that if you virtualised a GPU up into tiny pieces, the memory-to-flops ratio would be way off what's needed for inference? Or the virtualisation overhead would be too big?
Those are all genuine questions, just to be clear - this is not my area of expertise.