
Thanks for the clarifications!

I was referring to TensorRT from Nvidia and TPUs from Google.

One of the strengths of the TFLite API is that the same exported tflite model can run on both mobile devices and servers. Running lite models on servers may make less sense because of the loss of precision, but it could still have its own use case, such as serving very big models on cheap servers.
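To give a rough idea of the precision loss I mean, here is a minimal sketch, assuming the loss comes from 8-bit quantization of the weights. It is plain Python and does not use the TFLite API; the weights and the helper names (`quantize`, `dequantize`) are made up for illustration.

```python
# Illustrative only: affine 8-bit quantization of a few float weights.
# The weights and helper names are hypothetical, not taken from TFLite.

def quantize(values, num_bits=8):
    """Map floats to integers in [0, 2**num_bits - 1]."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    # Guard against a zero range so we never divide by zero.
    scale = (hi - lo) / levels if hi != lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Map the integers back to approximate floats."""
    return [lo + n * scale for n in q]

weights = [-1.0, -0.4137, 0.0021, 0.25, 0.9999]
q, scale, lo = quantize(weights)
restored = dequantize(q, scale, lo)

# The round trip is lossy: each weight can move by up to half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("quantized:", q)
print("step size:", scale)
print("max round-trip error:", max_error)
```

Each weight ends up at most half a quantization step away from its original value, which is the kind of error you would be trading for a smaller and cheaper model.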

Nvidia sells Android devices and embedded boards for robotics, which will surely have some sort of TensorRT-derived cores, if they don't already. Google could one day integrate its specialized cores (security and TPUs) into its phones too, or into AI-oriented IoT devices.