by ltbarcly3 473 days ago
The Q8 model is totally different?

My experience with quantizations is that anything below Q6 is noticeably worse: coherence suffers. I've rarely gotten anything really useful out of a Q4 model, code-wise. They are great for transformations, though, e.g. converting JSON to Markdown and vice versa.
No, I mean the quantized versions of this model in particular have fewer parameters as well. They are almost different models.
I like Q5

The sweet spot for me