by oktoberpaard, 431 days ago
With a 128K context length and an 8-bit KV cache, the 27B model occupies 22 GiB on my system. With a smaller context length, you should be able to fit it on a 16 GiB GPU.
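
To see why a shorter context frees up so much memory, here is a rough sketch of the usual KV cache estimate for plain full attention: one K and one V tensor per layer, growing linearly with context length. The layer count, KV head count, and head dimension below are hypothetical placeholders, not the actual configuration of the 27B model, and the estimate ignores quantization scale overhead and any sliding-window layers, so treat the numbers as illustrative only.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem):
    """Estimate KV cache size for full attention.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Hypothetical architecture values (assumptions, NOT the real 27B config).
N_LAYERS = 48
N_KV_HEADS = 8
HEAD_DIM = 128
BYTES_PER_ELEM = 1  # 8-bit cache, ignoring per-block scale overhead

for ctx in (8 * 1024, 32 * 1024, 128 * 1024):
    size = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, ctx, BYTES_PER_ELEM)
    print(f"{ctx:>7} tokens: {size / GIB:.2f} GiB")
```

Under these assumed values the cache alone goes from 12 GiB at 128K tokens down to under 1 GiB at 8K, which is the kind of saving that would let the weights plus cache fit on a smaller GPU.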