Hacker News
by joefourier 32 days ago
What quant? You should have no problem running it at Q4 with 256K context, or even Q5 or Q6, although maybe not at full context. I can run Q4 on a 4090 with just 24GB VRAM.