Hacker News
by M4v3R 929 days ago
Try different quantization variants. I got vastly different speeds depending on which one I chose. I believe q4_0 worked very well for me, although for a 7B model q8_0 runs just fine too, with better quality.
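For a rough sense of why the choice matters, here is a back-of-the-envelope sketch of weight size per format. It assumes q4_0 and q8_0 refer to the ggml-style block formats (32 weights per block plus one fp16 scale) and that "7B" means exactly 7 billion parameters; the helper names are made up for illustration. Real model files will differ somewhat, since some tensors are usually kept at higher precision.

```python
# Assumed ggml-style block layouts: 32 weights per block plus one fp16 scale.
BLOCK_WEIGHTS = 32
BLOCK_BYTES = {
    "q4_0": 2 + 16,  # 2-byte fp16 scale + 32 four-bit values
    "q8_0": 2 + 32,  # 2-byte fp16 scale + 32 eight-bit values
}


def bits_per_weight(quant: str) -> float:
    """Effective bits per weight, including the per-block scale."""
    return BLOCK_BYTES[quant] * 8 / BLOCK_WEIGHTS


def approx_weight_bytes(n_params: int, quant: str) -> float:
    """Approximate size of the quantized weights in bytes (hypothetical helper)."""
    return n_params * BLOCK_BYTES[quant] / BLOCK_WEIGHTS


if __name__ == "__main__":
    n_params = 7_000_000_000  # assumption: a nominal "7B" model
    for quant in BLOCK_BYTES:
        gb = approx_weight_bytes(n_params, quant) / 1e9
        print(f"{quant}: {bits_per_weight(quant):.1f} bits/weight, ~{gb:.1f} GB")
```

Under these assumptions q8_0 needs roughly twice the memory of q4_0. Token generation is often limited by memory bandwidth, so that gap is one plausible reason for the speed difference, but it is worth measuring on your own hardware.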