Hacker News
by CamperBob2 490 days ago
Have you compared it to the 1.58-bit dynamic quant model based on the original R1 (i.e., not a distillation)? Whatever Unsloth did, it doesn't seem to be giving up much reasoning performance over the full Q8 version.

It's simply because the model is small (1.5B parameters), which makes it sensitive to weight perturbations.
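
To give a rough sense of how large the weight perturbation is at around 1.58 bits compared with 8 bits, here is a minimal sketch. It assumes Gaussian toy weights, a plain absmax int8 scheme, and a plain absmean ternary scheme; the function names are made up for illustration, and this is not Unsloth's dynamic quant method. It only measures the size of the perturbation, not how a model of a given size reacts to it.

```python
import math
import random


def quantize_int8(weights):
    """Symmetric absmax quantization to 8-bit levels, then dequantize."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) * scale for w in weights]


def quantize_ternary(weights):
    """Round each weight to {-1, 0, +1} times a shared absmean scale.

    Three levels carry log2(3) ~= 1.58 bits per weight.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    return [max(-1, min(1, round(w / scale))) * scale for w in weights]


def relative_error(original, quantized):
    """L2 norm of the perturbation divided by the L2 norm of the weights."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, quantized)))
    den = math.sqrt(sum(a * a for a in original))
    return num / den


# Toy weights (assumption): zero-mean Gaussian, fixed seed for repeatability.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(10_000)]

err_int8 = relative_error(weights, quantize_int8(weights))
err_ternary = relative_error(weights, quantize_ternary(weights))

print(f"int8 relative error:    {err_int8:.4f}")
print(f"ternary relative error: {err_ternary:.4f}")
```

On toy weights like these the ternary perturbation comes out far larger than the int8 one. Whether a model tolerates that is the part that plausibly depends on its size, and this sketch does not test it.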