Hacker News
by lynaghk 404 days ago
Awesome, thanks! This is exactly the kind of experienced take I was hoping my blog post would summon =D

Re: computing M and s, does torch.quantization.quantize_qat do this or do you do it yourself from the (presumably f32) activation scaling that torch finds?

I don't have much experience with this kind of numerical computing, so I have no intuition about how much the "quantization" of selecting M and s might impact the overall performance of the network. I.e., whether

- M and s should be trained as part of QAT (e.g., the "Learned Step Size Quantization" paper)

- it's fine to just deterministically compute M and s from the f32 activation scaling.
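
For concreteness, here is a rough sketch of what the second (deterministic) option could look like. It assumes M and s mean an integer multiplier and a right shift chosen so that scale ≈ M * 2^-s, as in typical fixed-point requantization. The function names and the 31-bit multiplier width are hypothetical choices for illustration, not something torch or CMSIS-NN is confirmed to use here.

```python
import math
from typing import Tuple


def quantize_multiplier(scale: float, bits: int = 31) -> Tuple[int, int]:
    """Approximate a positive float scale as M * 2**(-s), with integer M <= 2**bits - 1.

    Hypothetical helper: the name and the default width are assumptions.
    """
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    # frexp gives scale = mantissa * 2**exponent with mantissa in [0.5, 1).
    mantissa, exponent = math.frexp(scale)
    m = round(mantissa * (1 << bits))
    if m == (1 << bits):
        # Rounding pushed the mantissa up to 1.0; renormalize.
        m //= 2
        exponent += 1
    s = bits - exponent  # scale ~= m * 2**(-s)
    return m, s


def requantize(acc: int, m: int, s: int) -> int:
    """Integer-only acc * scale, rounding half up. Assumes s >= 1."""
    return (acc * m + (1 << (s - 1))) >> s


if __name__ == "__main__":
    scale = 0.0123  # made-up f32 scale, for illustration only
    m, s = quantize_multiplier(scale)
    approx = m * 2.0 ** (-s)
    print(m, s, abs(approx - scale) / scale)
```

With a 31-bit multiplier, the relative error from rounding M is at most 2^-31, so this sketch only measures how well M and s reproduce a given f32 scale. It says nothing about whether learning them during QAT would give a better network.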

Also: Thanks for the tips re: CMSIS-NN, glad to know it's possible to use in a non-framework way. Any chance your example is open source somewhere?