Hacker News
by irusensei 876 days ago
Why not both? llama.cpp lets you split a GGUF model's layers between GPU and CPU memory, offloading as many layers to the GPU as fit and keeping the rest on the CPU.
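For a rough idea of how that split might be chosen, here is a minimal sketch. It assumes every layer is the same size and that a fixed VRAM budget is reserved for weights; the helper `split_layers` and all the numbers are hypothetical and not part of llama.cpp. The resulting count is the kind of value you would pass to llama.cpp's `-ngl` / `--n-gpu-layers` option.

```python
def split_layers(n_layers: int, layer_bytes: int, vram_budget_bytes: int) -> tuple[int, int]:
    """Return (gpu_layers, cpu_layers), assuming all layers are equally sized.

    Hypothetical helper for illustration only; real layer sizes vary and the
    KV cache and scratch buffers also need GPU memory.
    """
    if layer_bytes <= 0:
        raise ValueError("layer_bytes must be positive")
    # Offload as many whole layers as fit in the budget, capped at the model's layer count.
    gpu_layers = min(n_layers, max(0, vram_budget_bytes // layer_bytes))
    return gpu_layers, n_layers - gpu_layers


# Hypothetical numbers: 32 layers of ~400 MiB each, 8 GiB of VRAM set aside for weights.
MIB = 1024 ** 2
gpu, cpu = split_layers(32, 400 * MIB, 8 * 1024 * MIB)
print(f"-ngl {gpu}  (the other {cpu} layers stay in CPU memory)")
```

In practice people usually just raise the layer count until the GPU runs out of memory and then back off a little.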