Hacker News
by kkielhofner 996 days ago
The “original” and by far most common format for quantization is GPTQ.

AWQ support is spreading more, which is nice.

1 comment

Again, for a subset of the local LLM community. Quantization was not invented on GitHub, by llama.cpp, for LLMs in 2023.
If a tree falls in a forest and no one is around, does it make a sound?

Of course quantization was invented well before LLMs. However, LLMs have dramatically accelerated development on quantization and resulted in an explosion in use.