a) that's just for inference; you don't train with that.
b) a model trained fully in float and then "quantized" to int16 typically loses some precision, but often works well enough. It's also usually faster (if implemented properly).
c) there's a version where you go all the way down to 1-bit values and bitwise ops (XNOR and popcount) instead of multiply-adds on floats and ints. It can solve some problems. And properly compiled, it's wicked fast.
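
To make b) and c) concrete, here's a rough sketch in plain Python, not anyone's production kernel. It assumes symmetric per-tensor scaling for the int16 case and the usual sign encoding (bit 1 means +1, bit 0 means -1) for the 1-bit case. All function names (`quantize_int16`, `int16_dot`, `pack_signs`, `binary_dot`) and the sample numbers are made up for illustration.

```python
def quantize_int16(values):
    """Symmetric per-tensor quantization: floats -> int16 codes plus one float scale."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale for all-zero input
    scale = max_abs / 32767.0
    codes = [max(-32767, min(32767, round(v / scale))) for v in values]
    return codes, scale


def int16_dot(codes_a, scale_a, codes_b, scale_b):
    """Integer multiply-adds, then a single float rescale at the end.

    Python ints are unbounded; a real kernel would accumulate in a wider
    integer type, since a sum of int16 products does not fit in int16.
    """
    acc = sum(a * b for a, b in zip(codes_a, codes_b))
    return acc * scale_a * scale_b


def pack_signs(values):
    """Pack signs into one int: bit i is 1 if values[i] >= 0 (read as +1), else 0 (-1)."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def binary_dot(bits_a, bits_b, n):
    """Dot product of two {-1, +1} vectors of length n using XNOR and popcount."""
    mask = (1 << n) - 1
    agree = bin(~(bits_a ^ bits_b) & mask).count("1")  # positions where signs match
    return 2 * agree - n  # matches count +1, mismatches count -1


# Hypothetical sample data.
weights = [0.5, -1.25, 0.75, -0.1]
inputs = [1.0, 2.0, -0.5, 0.3]

# b) float dot product vs. its int16 approximation
exact = sum(w * x for w, x in zip(weights, inputs))
wq, ws = quantize_int16(weights)
xq, xs = quantize_int16(inputs)
approx = int16_dot(wq, ws, xq, xs)

# c) sign-only dot product, the slow way and the bitwise way
signs_exact = sum((1 if w >= 0 else -1) * (1 if x >= 0 else -1)
                  for w, x in zip(weights, inputs))
signs_fast = binary_dot(pack_signs(weights), pack_signs(inputs), len(weights))

print(exact, approx)
print(signs_exact, signs_fast)
```

The int16 result should land very close to the float one, while the 1-bit version keeps only the signs, which is why it only works for some problems. The speed comes from the fact that on real hardware one XNOR plus one popcount instruction handles a whole machine word of weights at once.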