Hacker News
by throwdbaaway 113 days ago
For a 27B model, just get a used 3090 (24 GB of VRAM) and hop onto r/LocalLLaMA. You can run a 4bpw quant at full context with a Q8 KV cache.
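
If you want to sanity-check that before buying, here is a rough back-of-the-envelope sketch. The weight math follows directly from 27B parameters at 4 bits each. The KV cache math depends on the model's architecture, which the comment doesn't give, so the layer count, KV head count, head dimension, and context length below are hypothetical placeholders: swap in the values from your model's config. It also treats Q8 as 1 byte per element (ignoring the small per-block scale overhead) and assumes a guessed 1.5 GiB of runtime overhead.

```python
GIB = 2**30


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: float = 1.0) -> float:
    """Approximate KV cache size in bytes for a standard attention stack.

    The factor of 2 accounts for one K and one V tensor per layer.
    bytes_per_elem=1.0 approximates a Q8 cache (scale overhead ignored).
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def fits(total_bytes: float, vram_gib: float = 24.0,
         overhead_gib: float = 1.5) -> bool:
    """True if the total plus an assumed runtime overhead fits in VRAM."""
    return total_bytes + overhead_gib * GIB <= vram_gib * GIB


# 27B parameters at 4 bits per weight (from the comment above).
weights = weight_bytes(27e9, 4.0)

# HYPOTHETICAL architecture values, for illustration only.
cache = kv_cache_bytes(n_layers=48, n_kv_heads=8, head_dim=128,
                       context_len=32768, bytes_per_elem=1.0)

print(f"weights:  {weights / GIB:.2f} GiB")
print(f"KV cache: {cache / GIB:.2f} GiB")
print(f"fits in 24 GiB: {fits(weights + cache)}")
```

With these placeholder numbers the weights dominate and the Q8 cache is a few GiB on top, which leaves headroom on a 24 GB card. A model with more KV heads or a much longer context will eat into that headroom quickly, so plug in the real config before trusting the result.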