by Der_Einzige
55 days ago
Claims of "we preserve 99.9999%" of accuracy are made in practically every quantization paper. The whole subfield acts like it's totally fine to test on datasets that these models have fully trained on. In any other subfield, doing this would be considered cheating and would get your paper rejected, but the quantization community really loves to spread FUD claiming that quantization doesn't harm models.

Also, there's a similar dynamic with dense vs. sparse MoE models. There's a reason we keep getting dense model releases alongside the MoEs out of China.

Quantization is not free, causes significant brain damage (especially on very long contexts), and has enough academic misconduct within it that it's actively screwing up the market. Don't believe me? Go ask your local financial analyst about the market's reaction to TurboQuant and then try to square that circle with this: https://openreview.net/forum?id=tO3ASKZlok (extreme and credible allegations of academic misconduct/fraud)
P.S. on dense vs. MoE: both are being released because they offer different trade-offs. At the same level of quality, an MoE will use less compute per token, but more memory.
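
To put rough numbers on the compute vs. memory trade-off, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the two model shapes are hypothetical (not real releases, and their "same quality" pairing is assumed, not measured), weights are taken as 2 bytes each (16-bit), and forward-pass cost uses the common rule of thumb of about 2 FLOPs per active parameter per token. KV cache and activations are ignored.

```python
from dataclasses import dataclass


@dataclass
class ModelShape:
    """Hypothetical model description, for illustration only."""
    total_params: float   # all parameters; every one must be held in memory
    active_params: float  # parameters actually used to process one token


def weight_memory_gb(model: ModelShape, bytes_per_param: float = 2.0) -> float:
    # Memory for the weights alone (assumes 16-bit weights by default).
    return model.total_params * bytes_per_param / 1e9


def forward_flops_per_token(model: ModelShape) -> float:
    # Rule of thumb: roughly 2 FLOPs per active parameter per token.
    return 2.0 * model.active_params


# Assumed, made-up shapes standing in for "similar quality" models.
dense = ModelShape(total_params=30e9, active_params=30e9)
moe = ModelShape(total_params=100e9, active_params=10e9)

for name, model in (("dense", dense), ("moe", moe)):
    print(
        f"{name}: {weight_memory_gb(model):.0f} GB of weights, "
        f"{forward_flops_per_token(model):.1e} FLOPs per token"
    )
```

With these assumed shapes, the MoE needs more memory to hold its weights but does less arithmetic per token, which is the trade-off described above. The real ratio depends on the specific models being compared.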