Hacker News
by espadrine 930 days ago
Once it is supported in llama.cpp, it will likely run on CPU given enough RAM, especially since the GGUF mmap code only seems to use RAM for the parts of the weights that actually get used.
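
To illustrate the mmap point, here is a rough sketch in Python of the general mechanism, not llama.cpp's actual loader. The file layout and the helper names (`make_fake_weights`, `read_slice`) are made up for the example. The whole file is mapped, but the OS only has to page in the regions that are actually read:

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def make_fake_weights(num_blocks, block_size):
    """Write a hypothetical weights file: block i is filled with byte value i."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        for i in range(num_blocks):
            f.write(bytes([i % 256]) * block_size)
    return path


def read_slice(path, offset, length):
    """Map the whole file read-only, but touch only [offset, offset + length)."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing copies just these bytes; pages outside the slice
            # are never touched, so the OS need not load them.
            return bytes(mm[offset:offset + length])


block_size = 4 * PAGE
path = make_fake_weights(num_blocks=8, block_size=block_size)
try:
    # Read 16 bytes from the start of block 3 only.
    chunk = read_slice(path, offset=3 * block_size, length=16)
finally:
    os.remove(path)
```

How much resident memory this really saves depends on the OS page cache and readahead behavior, so treat it as a picture of the idea rather than a measurement.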