What about quantization? Does TensorFlow Lite perform the quantization, or is TensorFlow supposed to do it? Is it an iterative process or a straightforward one? Or are you training quantized models, as the NN API docs say?
Quantization is done during training, using a special quantization-aware training script. We will soon open source a quantized MobileNet training script to show how to do this.
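Until that script is available, here is a rough sketch of the idea usually meant by "quantization aware": the forward pass rounds values to the levels an 8-bit tensor can represent and converts them back to floats, so the model trains against the rounding error it will see on device. This is not the script mentioned above. The function name `fake_quantize`, the fixed range of [-1, 1], and the sample weights are all assumptions made for illustration.

```python
def fake_quantize(values, range_min, range_max, num_bits=8):
    """Simulate quantization: snap floats to num_bits levels, return floats.

    Illustrative helper only (hypothetical name, not a TensorFlow API).
    """
    levels = 2 ** num_bits - 1                 # 255 steps for 8 bits
    scale = (range_max - range_min) / levels   # float width of one step
    zero_point = round(-range_min / scale)     # integer that represents 0.0
    simulated = []
    for v in values:
        q = round(v / scale) + zero_point      # quantize to an integer level
        q = max(0, min(levels, q))             # clamp to the representable range
        simulated.append((q - zero_point) * scale)  # dequantize back to float
    return simulated


# Assumed example weights and range, chosen only for the demonstration.
weights = [-1.0, -0.4, 0.0, 0.3, 1.0]
simulated = fake_quantize(weights, -1.0, 1.0)
```

In a real training script this rounding would be applied to weights and activations inside the graph, with the ranges learned or tracked during training rather than hard-coded as they are here.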