by newfocogi 607 days ago
TLDR: Quantized versions of the Llama 3.2 1B and 3B models with "competitive accuracy" relative to the originals (meaning somewhat degraded performance; plots are included in the release notes).

Quantization schemes include post-training quantization (PTQ), SpinQuant, and QLoRA.
Thx, I prefer not to visit Meta properties :X

They were already pretty small, but I guess the smaller the better as long as accuracy doesn't suffer too much.