by lynaghk
404 days ago
Awesome, thanks! This is exactly the kind of experienced take I was hoping my blog post would summon =D

Re: computing M and s, does torch.quantization.quantize_qat do this, or do you do it yourself from the (presumably f32) activation scaling that torch finds?

I don't have much experience with this kind of numerical computing, so I have no intuition about how much the "quantization" of selecting M and s might impact the overall performance of the network. I.e., whether:

- M and s should be trained as part of QAT (e.g., the "Learned Step Size Quantization" paper), or
- it's fine to just deterministically compute M and s from the f32 activation scaling.

Also: thanks for the tips re: CMSIS-NN, glad to know it's possible to use in a non-framework way. Any chance your example is open source somewhere?
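For concreteness, the "deterministic" option could look something like the sketch below. It assumes M is an integer multiplier and s a right shift chosen so that scale ≈ M * 2**-s, which is the usual fixed-point convention but is my assumption here, not something confirmed in this thread. The function names and the 31-bit default are hypothetical, and none of this is torch's or CMSIS-NN's actual API.

```python
import math


def quantize_scale(scale: float, bits: int = 31) -> tuple[int, int]:
    """Approximate a positive f32 scale as M * 2**-s (hypothetical helper).

    M is normalized to lie in [2**(bits-1), 2**bits), so it uses the full
    integer precision available. s may be negative for very large scales,
    which would mean a left shift.
    """
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    # frexp gives scale == mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(scale)
    M = round(mantissa * (1 << bits))
    if M == (1 << bits):
        # Rounding pushed the mantissa up to 1.0; renormalize.
        M //= 2
        exponent += 1
    s = bits - exponent
    return M, s


def apply_scale(acc: int, M: int, s: int) -> int:
    """Integer-only version of round(acc * scale); assumes s >= 1."""
    return (acc * M + (1 << (s - 1))) >> s


M, s = quantize_scale(0.0123)
print(M, s, M * 2.0 ** -s)  # the last value should be very close to 0.0123
print(apply_scale(1000, M, s))
```

With M normalized this way, rounding changes it by at most 0.5 out of at least 2**30, so the relative error in the represented scale is at most about 2**-31. That only bounds the error of this particular representation; whether training M and s during QAT helps beyond that is a separate question this sketch does not answer.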