by 37ef_ced3
1807 days ago
Or, do your inference using an AVX-512 CPU: https://NN-512.com (open source, free software, no dependencies).

With batch size 1, NN-512 is easily 2x faster than TensorFlow and does 27 ResNet50 inferences per second on a c5.xlarge instance. For more unusual networks, like DenseNet or ResNeXt, the performance gap is wider. Even if you allow TensorFlow to use a larger ResNet50 batch size, NN-512 is easily 1.3x faster.

If you need a few dozen inferences per second per server, this is the cheapest way. And you're not depending on a proprietary solution whose parent company could go out of business in a year.

If you need Transformers instead of convolutions, Fabrice Bellard's LibNC is a good solution: https://bellard.org/libnc/
> If you need a few dozen inferences per second per server, this is the cheapest way. And you're not depending on a proprietary solution whose parent company could go out of business in a year.
Definitely the cheapest way.
We've been in business for more than a year already actually :)