Hacker News
by ur-whale 2890 days ago
a) that's just for inference; you don't train with that.

b) a fully float-trained model "quantized" to int16 typically loses some precision, but often works well enough. It's also usually faster (if implemented properly).
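
To make (b) concrete, here's a rough sketch of one common way to do it: symmetric quantization with a single scale for the whole tensor. The function names and the sample weights are made up for illustration, and real toolchains do more than this (per-channel scales, calibration, and so on).

```python
def quantize_int16(weights):
    """Symmetric quantization: map floats to int16 using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Largest magnitude maps to 32767; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 32767 if max_abs > 0 else 1.0
    q = [max(-32767, min(32767, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the int16 codes."""
    return [v * scale for v in q]


# Hypothetical float-trained weights, for illustration only.
weights = [0.8131, -0.2507, 0.0004, -1.25, 0.5]
q, scale = quantize_int16(weights)
restored = dequantize(q, scale)

# The precision loss is the rounding error, at most about half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The error per weight is bounded by roughly scale / 2, which is why int16 "often works well enough": with 65535 levels the steps are tiny relative to typical weight ranges.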

c) there's a version where you go all the way down to int1 (bits) and use bitwise ops instead of multiply-adds on floats and ints. It can solve some problems, and properly compiled, it's wicked fast.
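
For (c), the usual trick (as far as I know) is to treat each bit as +1 or -1, so a dot product turns into an XNOR followed by a popcount. A minimal sketch, assuming that encoding; `pack_bits` and `binary_dot` are names I made up, and a real implementation would do this on machine words with hardware popcount rather than Python ints.

```python
def pack_bits(signs):
    """Pack a list of +1/-1 values into an int; bit i is set when signs[i] is +1."""
    word = 0
    for i, s in enumerate(signs):
        if s > 0:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two n-element +1/-1 vectors stored as bit patterns."""
    mask = (1 << n) - 1
    # XNOR is 1 wherever the two signs agree (product +1), 0 where they differ.
    xnor = ~(a_bits ^ b_bits) & mask
    matches = bin(xnor).count("1")  # popcount
    # matches contribute +1 each, mismatches -1 each.
    return 2 * matches - n


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

reference = sum(x * y for x, y in zip(a, b))  # ordinary multiply-add version
fast = binary_dot(pack_bits(a), pack_bits(b), len(a))
print(reference, fast)
```

Both versions give the same answer; the bitwise one just replaces n multiplies and adds with one XOR, one NOT, and one popcount per word.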

1 comment

> there's a version where you go all the way down to int1 (bits)

There's also a Zen version that uses just 0.5 bits. </joke>