HN Mirror

	by dna_polymerase 2891 days ago
	Int8 and Int16? I've never worked with quantized models. Would anyone mind sharing their experience? Do such models achieve state-of-the-art performance?

2 comments

ur-whale 2891 days ago

a) that's just for inference; you typically don't train with that.

b) a fully float-trained model "quantized" to int16 typically loses some precision, but often works well enough. It's also usually faster (if implemented properly).

c) there's a version where you go all the way down to int1 (bits) and use binary ops instead of multiply-adds on floats and ints. It can solve some problems. And properly compiled, it's wicked fast.
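
To make (c) a bit more concrete, here is a rough sketch of how a dot product can be done with binary ops when every value is restricted to +1 or -1. This is only an illustration: the helper names `pack_signs` and `binary_dot` are made up for this example, and a real implementation would use hardware popcount instructions on packed machine words rather than Python integers.

```python
def pack_signs(values):
    """Pack a list of +1/-1 values into an int: bit i is 1 when values[i] is +1."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors stored as bitmasks.

    Positions where the signs agree contribute +1, the rest contribute -1,
    so the result is matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask  # XNOR, limited to the low n bits
    matches = bin(agree).count("1")    # popcount
    return 2 * matches - n


# Hypothetical example vectors, chosen only for illustration.
a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

reference = sum(x * y for x, y in zip(a, b))               # ordinary multiply-adds
fast = binary_dot(pack_signs(a), pack_signs(b), len(a))    # XNOR + popcount
print(reference, fast)
```

The two results agree because each multiply of two signs is replaced by a single bit comparison, which is where the speed comes from once the bits are packed into wide registers.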

DoofusOfDeath 2891 days ago

> there's a version where you go all the way down to int1 (bits)

There's also a Zen version that uses just 0.5 bits. </joke>

dekhn 2891 days ago

We lose a tiny bit of accuracy when quantizing for TensorFlow Lite on Android; that's about it. I was pretty impressed.
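
For anyone wondering where that small loss comes from, here is a minimal sketch of symmetric linear quantization in plain Python. It is not how TensorFlow Lite does it internally; the functions `quantize`, `dequantize`, and `max_error` and the weight values are assumptions made up for this example. It just rounds floats to int8 or int16 with a single scale factor and measures the round-trip error.

```python
def quantize(values, bits):
    """Symmetric linear quantization to signed integers of the given width."""
    qmax = 2 ** (bits - 1) - 1  # 127 for int8, 32767 for int16
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map the integers back to approximate float values."""
    return [x * scale for x in q]


def max_error(values, bits):
    """Largest absolute round-trip error after quantizing and dequantizing."""
    q, scale = quantize(values, bits)
    return max(abs(v - r) for v, r in zip(values, dequantize(q, scale)))


# Made-up "weights", used only to illustrate the effect.
weights = [0.731, -0.052, 0.0049, -1.284, 0.318, 0.9]

err8 = max_error(weights, 8)
err16 = max_error(weights, 16)
print("int8 max error: ", err8)
print("int16 max error:", err16)
```

The round-trip error is bounded by roughly half a quantization step, so int16 ends up much closer to the float values than int8. Whether that error matters for model accuracy depends on the model, which matches the "often works well enough" experience reported in this thread.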