Hacker News
by omneity, 317 days ago
It takes ~17-20GB of VRAM at Q4, depending on context length and settings (I'm running it as we speak).
Sure, it's ~30GB at Q8, but that's a minimal quality gain for double the VRAM usage.
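
For anyone who wants to sanity-check numbers like these, here is a rough back-of-the-envelope sketch of weight memory at different quantization levels. The parameter count (`N_PARAMS`) is a hypothetical placeholder, since the comment does not name the model, and the function `weight_memory_gb` is made up for illustration. It counts weights only at nominal bit widths, so treat the result as a lower bound rather than a measurement.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# ASSUMPTION: hypothetical model size; the comment does not state it.
N_PARAMS = 30e9

q4 = weight_memory_gb(N_PARAMS, 4)  # nominal 4 bits per weight
q8 = weight_memory_gb(N_PARAMS, 8)  # nominal 8 bits per weight

print(f"Q4 weights: ~{q4:.0f} GB")
print(f"Q8 weights: ~{q8:.0f} GB")
print(f"Q8/Q4 ratio: {q8 / q4:.1f}x")
```

Actual usage will likely sit above the Q4 estimate, because real quantization formats typically store some extra metadata per block of weights and the KV cache grows with context length, which would be consistent with the "depending on context length and settings" range above.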