Hacker News
by arrowsmith 1018 days ago
Sorry, what does quantization mean here?
3 comments

Reducing the precision of the weights from high-precision floating-point values to either lower-precision floats or even integers. You'd think it would greatly reduce the quality of a model's output, but in most cases the decline in quality is very tolerable compared to the reduction in memory/processing requirements.
It means using fewer bits to store float values. This reduces the memory/compute requirement at the cost of making the model less precise.
Reducing the precision of the parameters, with the result that the model is less memory-intensive.
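
To make the replies above concrete, here's a minimal sketch of one common flavor, symmetric 8-bit quantization, in plain Python. The function names (`quantize_int8`, `dequantize`) and the sample weights are made up for illustration; real frameworks use more elaborate schemes (per-channel scales, zero points, 4-bit formats), so treat this as a toy, not how any particular library does it.

```python
# Illustrative sketch only: symmetric 8-bit quantization of a small weight list.
# quantize_int8 / dequantize are hypothetical helper names, not a library API.
from array import array


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = array("b", (round(w / scale) for w in weights))  # "b" = signed 8-bit
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


weights = [0.12, -0.5, 0.33, 0.9, -0.07]  # made-up example values
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)

# Storage cost: 32-bit floats versus 8-bit integers (plus one scale value).
fp32_bytes = array("f", weights).itemsize * len(weights)
int8_bytes = q.itemsize * len(q)

# Worst-case rounding error introduced by quantization.
max_error = max(abs(a - b) for a, b in zip(weights, approx))

print("quantized:", list(q))
print("bytes fp32 vs int8:", fp32_bytes, int8_bytes)
print("max error:", max_error)
```

The weights take a quarter of the space, and each recovered value is off by at most half of `scale`. That small rounding error is the "less precise" part mentioned above.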