Hacker News
by throwdbaaway, 113 days ago
For 27B, just get a used 3090 and hop on to r/LocalLLaMA. You can run a 4bpw quant at full context with Q8 KV cache.
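A rough back-of-the-envelope check of the claim above, as a sketch rather than a measurement. The weight figure follows directly from 27B parameters at 4 bits per weight, and 24 GB is the 3090's VRAM. Everything about the KV cache is an assumption: the comment does not name the model, so the layer count, KV head count, head dimension, and context length below are hypothetical placeholders, and the 8-bit cache is treated as exactly 8 bits per element (real block-quantized formats add a little overhead for scales). The 1.5 GB reserve for activations and runtime buffers is also a guess.

```python
GB = 1e9  # decimal gigabytes, matching how GPU VRAM is usually advertised


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bits_per_element: float) -> float:
    """Approximate KV cache size in bytes.

    Each layer stores one K and one V tensor, each holding
    n_kv_heads * head_dim values per token.
    """
    elements = 2 * n_layers * n_kv_heads * head_dim * context_len
    return elements * bits_per_element / 8


def fits(total_bytes: float, vram_bytes: float, reserve_bytes: float) -> bool:
    """True if the estimate leaves at least reserve_bytes of VRAM free."""
    return total_bytes + reserve_bytes <= vram_bytes


# From the comment: 27B parameters at 4 bits per weight.
weights = weight_bytes(27e9, 4.0)

# HYPOTHETICAL architecture, for illustration only (not any specific model).
kv = kv_cache_bytes(
    n_layers=48,
    n_kv_heads=8,
    head_dim=128,
    context_len=32768,
    bits_per_element=8.0,  # idealized 8-bit cache, ignoring scale overhead
)

print(f"weights:  {weights / GB:.2f} GB")
print(f"KV cache: {kv / GB:.2f} GB")
print(f"fits in 24 GB: {fits(weights + kv, 24 * GB, reserve_bytes=1.5 * GB)}")
```

Plug in the real layer count, KV head count, head dimension, and maximum context from the model's config to get a number that means something for your case; models with sliding-window or hybrid attention will need less KV memory than this simple formula suggests.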