by ur-whale
2890 days ago
a) That's just for inference; you don't train with that.

b) A model trained fully in float and then "quantized" to int16 typically loses some precision, but often works well enough. It's also usually faster (if implemented properly).

c) There's a version where you go all the way down to int1 (single bits) and use binary ops instead of multiply-adds on floats and ints. It can solve some problems. And properly compiled, it's wicked fast.
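
For the curious, here is a rough sketch of what b) and c) can look like, in plain Python. Everything in it is my own illustration rather than any particular framework's API: the function names (`quantize_int16`, `dequantize`, `pack_signs`, `binary_dot`) are hypothetical, the int16 scheme assumes simple symmetric per-tensor scaling, and the 1-bit version assumes weights and activations are restricted to -1/+1 so a dot product reduces to XNOR plus a popcount.

```python
def quantize_int16(weights):
    """Symmetric per-tensor quantization of floats into the int16 range.

    Assumption: one shared scale for the whole list (real schemes vary).
    """
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 32767
    q = [max(-32767, min(32767, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [v * scale for v in q]


def pack_signs(values):
    """Pack signs into one integer: bit i is 1 if values[i] >= 0, else 0."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two {-1, +1} vectors of length n using XNOR + popcount.

    Positions where the bits agree contribute +1, the rest contribute -1.
    """
    mask = (1 << n) - 1
    agree = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * agree - n


if __name__ == "__main__":
    w = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int16(w)
    print(q, dequantize(q, scale))

    a = [1, -1, 1, 1]
    b = [1, 1, -1, 1]
    print(binary_dot(pack_signs(a), pack_signs(b), len(a)))
```

The precision loss in b) shows up as the small gap between `w` and the dequantized values. The speed in c) comes from real implementations doing the XOR and popcount on whole machine words at once, which this pure-Python version only imitates.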
There's also a Zen version that uses just 0.5 bits. </joke>