
weight quantization basically means storing a short code in place of each full value.. each code is an index into a lookup table that holds the desired full values

if you have a 24bit value.. say, a 24bit color, that means you have ~16million.. 2^24==16777216.. possible colors

but if you only want to use 200 colors you can, instead of storing each one as the full 24bit value, store an 8bit value.. 2^8==256>200.. and have those 8bits be an index into a table that holds the desired full 24bit values
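a minimal sketch of that color case in python.. the palette and pixel values here are made up for illustration, and the bit counts assume the table is stored once alongside the indices

```python
# hypothetical palette of full 24bit colors (0xRRGGBB)
# anything up to 256 entries can be addressed by an 8bit index
palette = [0x000000, 0xFF0000, 0x00FF00, 0x0000FF, 0xFFFFFF]

# the "image": 1000 full 24bit values, all drawn from the palette
pixels = [palette[i % len(palette)] for i in range(1000)]

# encode: replace each full value with its 8bit index into the palette
lookup = {color: i for i, color in enumerate(palette)}
indices = bytes(lookup[p] for p in pixels)

# decode: each index points back at the full 24bit value
decoded = [palette[i] for i in indices]

full_bits = 24 * len(pixels)                         # every pixel stored in full
indexed_bits = 8 * len(indices) + 24 * len(palette)  # indices plus the table itself

print(decoded == pixels, full_bits, indexed_bits)
```

the table is a fixed cost, so the saving only shows up once you have many more values than table entries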

so you have to ask yourself.. what parameters of my neural net can be represented as an index? or, which parameters take fewer distinct values than their full bit width can represent?
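for the weights specifically, here is a rough sketch of the same idea.. the random weights, the 4bit code size and the evenly spaced table are all my assumptions for illustration, not something taken from the paper or from any particular library. unlike the color case, the weights are not already limited to the table entries, so decoding only gives you the nearest entry and you pick up some error

```python
import random

random.seed(0)
# stand-in for trained weights (assumption: values roughly in [-1, 1])
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]

bits = 4
levels = 2 ** bits                    # 16 table entries, addressed by a 4bit code
lo, hi = min(weights), max(weights)
step = (hi - lo) / (levels - 1)

# lookup table: 16 evenly spaced full-precision values covering [lo, hi]
codebook = [lo + k * step for k in range(levels)]

# encode: each weight becomes the index of its nearest table entry
codes = [round((w - lo) / step) for w in weights]

# decode: look the full value back up
restored = [codebook[c] for c in codes]

# quantization error: at most half a step for an evenly spaced table
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(levels, step, max_err)
```

fewer bits means fewer table entries and a bigger step, so the error grows as the codes shrink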

wikipedia defines ann parameters as (o):

An ANN is typically defined by three types of parameters:

    The connection pattern between the different layers of neurons
    The weights of the connections, which are updated in the learning process.
    The activation function that converts a neuron's weighted input to its output activation.

here is a great paper that tries to answer this question for you in a way that highlights the error resulting from quantization decisions (i)

(o) https://en.wikipedia.org/wiki/Artificial_neural_network#Netw...

(i) https://www.cmpe.boun.edu.tr/~ethem/files/papers/fatih_icann...