Modern quantization schemes are almost like lossy compression algorithms, and llms in particular are very "sparse" and amenable to compression.