by regularfry, 458 days ago
You always want a bit of headroom for context. It's a problem I keep bumping into with 32B models on a 24GB card: the decent quants fit, but they leave less room for context on the card than I'd like.
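
For a rough sense of the arithmetic, here is a back-of-the-envelope sketch. None of the numbers come from the comment itself; they are assumptions for illustration: a 32B-parameter model with 64 layers, 8 KV heads and a head dimension of 128, a quant averaging about 4.5 bits per weight, an fp16 KV cache, and roughly 1 GiB set aside for activations and runtime overhead. Real runtimes will differ, so treat the result as an order-of-magnitude estimate only.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_elem=2):
    """Approximate KV cache size: one key and one value vector per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem


def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights."""
    return n_params * bits_per_weight / 8


def max_context_tokens(vram_bytes, n_params, bits_per_weight,
                       n_layers, n_kv_heads, head_dim,
                       overhead_bytes=0, bytes_per_elem=2):
    """Tokens of KV cache that fit in whatever VRAM is left after the weights."""
    free = vram_bytes - weight_bytes(n_params, bits_per_weight) - overhead_bytes
    per_token = kv_cache_bytes(n_layers, n_kv_heads, head_dim, 1, bytes_per_elem)
    return max(0, int(free // per_token))


if __name__ == "__main__":
    # All values below are illustrative assumptions, not measurements.
    tokens = max_context_tokens(
        vram_bytes=24 * GIB,
        n_params=32e9,
        bits_per_weight=4.5,
        n_layers=64,
        n_kv_heads=8,
        head_dim=128,
        overhead_bytes=1 * GIB,
    )
    print(f"approx. context that fits: {tokens} tokens")
```

Under these assumptions each token costs 256 KiB of cache, so a few GiB of headroom goes quickly. Raising bits_per_weight to model a higher-quality quant shrinks the available context further, which is the trade-off described above.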