Hacker News
by elbigbad 1196 days ago
The quantization to four bits doesn’t have that much effect on the output. 1 bit might not either, but someone would need to test that before claiming “1 bit … runs on my RPI3”, because “runs” is overloaded here to mean “runs and produces sensible output.” I think that overloading is what you’re missing.

It should also be mentioned that each weight isn’t really a 4 bit float. Rather, they’re basically clustering the floats into 2^4 = 16 clusters, storing a 4 bit cluster index per weight, and then grabbing the float associated with that index from a lookup table as needed. So as long as the weights roughly fall into 16 clusters, you’ll get nearly identical results.
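To make the lookup-table idea concrete, here is a rough sketch in plain Python. It is only an illustration, not how any particular library does it: the choice of 1-D k-means as the clustering step, and the names `fit_codebook`, `quantize` and `dequantize`, are my own assumptions.

```python
import random


def nearest(centroids, w):
    """Index of the centroid closest to weight w."""
    return min(range(len(centroids)), key=lambda i: abs(centroids[i] - w))


def fit_codebook(weights, bits=4, iters=20):
    """Assumed clustering step: 1-D k-means giving 2**bits centroid floats."""
    k = 2 ** bits
    lo, hi = min(weights), max(weights)
    # Start with centroids evenly spaced over the weight range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        counts = [0] * k
        for w in weights:
            idx = nearest(centroids, w)
            sums[idx] += w
            counts[idx] += 1
        # Move each centroid to the mean of its members; empty ones stay put.
        centroids = [
            sums[i] / counts[i] if counts[i] else centroids[i] for i in range(k)
        ]
    return centroids


def quantize(weights, centroids):
    """Replace each float by the small integer index of its cluster."""
    return [nearest(centroids, w) for w in weights]


def dequantize(codes, centroids):
    """Lookup table: map each stored index back to its centroid float."""
    return [centroids[c] for c in codes]


random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]
codebook = fit_codebook(weights)        # 16 floats
codes = quantize(weights, codebook)     # 1000 values, each in 0..15
restored = dequantize(codes, codebook)  # approximate weights
mse = sum((a - b) ** 2 for a, b in zip(weights, restored)) / len(weights)
print(len(codebook), max(codes), mse)
```

The stored model is then the 4 bit codes plus the 16 floats in the codebook. How close `restored` is to `weights` depends on how tightly the real weights cluster, which is the caveat above.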