by MallocVoidstar 872 days ago
That isn't unquantized, it's de-quantized. They went from Q5 to fp16 for use in PyTorch instead of the GGUF ecosystem.
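
To make the distinction concrete, here is a rough sketch of why de-quantizing doesn't give back the original weights. It assumes a simplified symmetric 5-bit scheme with one shared scale, which is not the actual GGUF Q5 block layout, and the names `quantize` and `dequantize` are made up for illustration.

```python
# Simplified symmetric 5-bit quantization with a single shared scale.
# Assumption: this is an illustration only, NOT the real GGUF Q5 format.

def quantize(weights, bits=5):
    """Map floats to signed integer codes plus one shared scale factor."""
    qmax = 2 ** (bits - 1) - 1  # 15 for 5 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Expand integer codes back to floats; the rounding loss stays."""
    return [c * scale for c in codes]

original = [0.013, -0.27, 0.5, 0.1234, -0.4999]
codes, scale = quantize(original)
restored = dequantize(codes, scale)

# Largest per-weight error introduced by the round trip.
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(codes)
print(restored)
print(max_error)
```

The restored values are stored at higher precision, but each one still sits on one of the 31 levels the 5-bit codes allow, so the information lost in quantization is not recovered.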
2 comments

I never thought people would be upscaling models by increasing quantization precision. The rationale makes sense, but it's also a goofy outcome.
You should be able to upscale and fine tune to recover performance, I suppose!

Clearly we should train a diffusion model to denoise the weights of LLM transformer models. Yo dawg.

Yes, that’s correct. Good correction.