by halJordan 69 days ago
Quantization is an extraordinarily trivial process, especially if you're doing it with llama.cpp (which Unsloth obviously does).
Qwen did release an FP8 version, which is itself a quantized version.
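
To give a sense of how small the core idea is, here is a minimal sketch of absmax integer quantization in plain Python. It is an illustration only: it is not the block-wise formats llama.cpp uses, and it is not FP8. The function names (`quantize_absmax`, `dequantize`) and the sample weights are made up for the example.

```python
def quantize_absmax(weights, bits=8):
    """Map floats to signed integers using one scale for the whole list.

    Hypothetical helper for illustration; real tools typically quantize
    in small blocks, each with its own scale.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid dividing by zero when every weight is 0.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp into [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integers and the scale."""
    return [v * scale for v in q]


# Made-up sample weights.
weights = [0.3, -1.0, 0.25, 0.0]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)

# Each restored value is within about half a quantization step of the original.
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

The stored model is just the integers plus the scale; the rounding error per weight is bounded by roughly half of `scale`, which is the accuracy you trade for the smaller size.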