by arnaudsm 607 days ago
How do these compare to the original quants on Ollama, such as q4_K_S?

These undergo additional fine-tuning (QLoRA) on some or all of the original dataset, so the weights end up better aligned to the NF4 dtype, which improves accuracy.
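
To make "aligned to the dtype" a bit more concrete, here is a toy sketch of a 4-bit NormalFloat-style round trip. It does no fine-tuning at all; it only shows that weights already sitting on the quantization grid come back unchanged, while ordinary weights pick up rounding error. The level table is an approximation (rounded values), and the block size of 64, the helper names, and the Gaussian test weights are assumptions made for illustration, not anything taken from the models discussed here.

```python
import random

# Approximate NF4 code points, rounded to 4 decimals (assumption: illustrative
# values, not read from any particular library at runtime).
NF4_LEVELS = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]


def quantize_block(weights):
    """Scale a block by its absolute maximum, then snap to the nearest level."""
    scale = max(abs(w) for w in weights) or 1.0  # avoid dividing by zero
    indices = [
        min(range(len(NF4_LEVELS)), key=lambda i: abs(w / scale - NF4_LEVELS[i]))
        for w in weights
    ]
    return indices, scale


def dequantize_block(indices, scale):
    """Map 4-bit indices back to floating-point weights."""
    return [NF4_LEVELS[i] * scale for i in indices]


def roundtrip_error(weights):
    """Mean squared error introduced by quantizing and dequantizing a block."""
    indices, scale = quantize_block(weights)
    restored = dequantize_block(indices, scale)
    return sum((w - r) ** 2 for w, r in zip(weights, restored)) / len(weights)


# Hypothetical block of 64 weights drawn from a small Gaussian.
rng = random.Random(0)
raw = [rng.gauss(0.0, 0.02) for _ in range(64)]

# "Aligned" weights: the same block after it has been snapped onto the grid.
indices, scale = quantize_block(raw)
aligned = dequantize_block(indices, scale)

print("error for raw weights:    ", roundtrip_error(raw))
print("error for aligned weights:", roundtrip_error(aligned))
```

The second error is zero because the aligned block is already representable in the 4-bit format. Fine-tuning with the quantized weights in the loop is, roughly speaking, a way for the model to adapt to that grid instead of being rounded onto it after the fact.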