Hacker News
by 37ef_ced3 1807 days ago
Or, do your inference using an AVX-512 CPU:

https://NN-512.com (open source, free software, no dependencies)
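
If you're not sure whether a given box has AVX-512 at all, here is a minimal Python sketch for Linux that looks for the `avx512f` (foundation) flag in `/proc/cpuinfo`. The helper name `has_avx512f` is made up for illustration, and I'm assuming the foundation flag is a reasonable first check; a generated network may need further AVX-512 subsets, so treat a "yes" as necessary rather than sufficient.

```python
def has_avx512f(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists avx512f."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Each logical CPU has its own "flags" line; one match is enough.
        if key.strip() == "flags" and "avx512f" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AVX-512F reported:", has_avx512f(f.read()))
    except OSError:
        # /proc/cpuinfo is Linux-specific; other systems need another method.
        print("could not read /proc/cpuinfo")
```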

With batch size 1, NN-512 is easily 2x faster than TensorFlow and does 27 ResNet50 inferences per second on a c5.xlarge instance. For more unusual networks, like DenseNet or ResNeXt, the performance gap is wider.

Even if you allow TensorFlow to use a larger ResNet50 batch size, NN-512 is easily 1.3x faster.

If you need a few dozen inferences per second per server, this is the cheapest way. And you're not depending on a proprietary solution whose parent company could go out of business in a year.
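
To put the batch-size-1 number in perspective, a rough back-of-the-envelope sketch: 27 inferences per second works out to about 37 ms per ResNet50 inference. The function names and the target load of 100 inferences per second are hypothetical, chosen only to show the arithmetic; instance pricing is deliberately left out because it isn't quoted here.

```python
import math


def latency_ms(inferences_per_second: float) -> float:
    """Average time per inference, in milliseconds, at a given throughput."""
    return 1000.0 / inferences_per_second


def servers_needed(target_rate: float, per_server_rate: float) -> int:
    """Smallest whole number of servers that covers the target rate."""
    return math.ceil(target_rate / per_server_rate)


if __name__ == "__main__":
    per_server = 27.0  # ResNet50 inferences/s on a c5.xlarge, from the figure above
    target = 100.0     # hypothetical load, for illustration only
    print(f"{latency_ms(per_server):.1f} ms per inference")
    print(servers_needed(target, per_server), "servers for", target, "inferences/s")
```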

If you need Transformers instead of convolutions, Fabrice Bellard's LibNC is a good solution: https://bellard.org/libnc/

1 comment

Oh, that's very interesting. How ready for production is it? It only works for TF, right?

> If you need a few dozen inferences per second per server, this is the cheapest way. And you're not depending on a proprietary solution whose parent company could go out of business in a year.

Definitely the cheapest way.

We've been in business for more than a year already, actually :)

NN-512 has no connection to TensorFlow. It is an open source Go program (with no dependencies) that generates C code (with no dependencies). And it's fully ready for production. Similarly, LibNC is stand-alone, and Fabrice Bellard (author of FFmpeg, QEMU, etc.) will release the source to anyone who asks for it.

I'm giving performance comparisons versus TensorFlow, which I consider to be a standard tool.

People who use your proprietary, closed, black-box service are dependent on the well-being of your business. You could vanish tomorrow.