by DrPhish
327 days ago
I generally download the safetensors and make my own GGUFs, usually at Q8_0.
Is there any measurable benefit to your dynamic quants at that quant level?
I looked at your dynamic quant 2.0 page, but all the charts and graphs appear to cut off at Q4.
Oh, the blog at https://docs.unsloth.ai/basics/unsloth-dynamic-2.0-ggufs does talk about 1-, 2-, 3-, 4-, 5-, 6- and 8-bit dynamic GGUFs as well!
There definitely is a benefit to dynamically selecting which layers get which bit width. I wrote about the difference between naively quantizing and selectively quantizing: https://unsloth.ai/blog/deepseekr1-dynamic
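
To make the naive vs. selective distinction concrete, here is a rough toy sketch in Python. It is not the Unsloth method or the llama.cpp quantizer: the layer names, the random weights, the per-layer "sensitivity" numbers, and the simple round-to-nearest scheme are all made-up assumptions for illustration. It only shows the general idea of spending the same average bit budget unevenly across layers.

```python
import random

def quantize(weights, bits):
    """Toy symmetric round-to-nearest quantization, then dequantization.
    Not a real GGUF quant type such as Q8_0 or Q4_K."""
    if bits < 2:
        raise ValueError("this toy scheme needs at least 2 bits")
    levels = 2 ** (bits - 1) - 1              # largest positive integer code
    peak = max(abs(w) for w in weights)
    if peak == 0.0:
        return list(weights)
    scale = peak / levels
    return [round(w / scale) * scale for w in weights]

def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def weighted_error(layers, plan):
    """Sum of per-layer quantization error, weighted by assumed sensitivity."""
    total = 0.0
    for name, (weights, sensitivity) in layers.items():
        total += sensitivity * mse(weights, quantize(weights, plan[name]))
    return total

rng = random.Random(0)

def make_layer(n=512):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical layers: (weights, assumed sensitivity of model output to error).
layers = {
    "layer_a": (make_layer(), 100.0),   # assumed to be very sensitive
    "layer_b": (make_layer(), 1.0),
    "layer_c": (make_layer(), 1.0),
}

# Naive plan: every layer gets 4 bits.
uniform_plan = {name: 4 for name in layers}
# Selective plan: same average of 4 bits, but skewed toward the sensitive layer.
dynamic_plan = {"layer_a": 6, "layer_b": 3, "layer_c": 3}

uniform_err = weighted_error(layers, uniform_plan)
dynamic_err = weighted_error(layers, dynamic_plan)
print(f"uniform 4-bit weighted error: {uniform_err:.4f}")
print(f"selective (6/3/3) weighted error: {dynamic_err:.4f}")
```

In this toy setup the selective plan comes out with a lower weighted error at the same average bit width, but that result depends entirely on the assumed sensitivities, which a real pipeline would have to measure rather than hard-code.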