Hacker News
by rao-v, 24 days ago
I'd have to try the KV cache trick, but folks get pretty competitive speeds with the current 31B/27B dense models, e.g.
https://www.reddit.com/r/LocalLLaMA/comments/1tc9j6u/mi50s_q...