Hacker News
by tannhaeuser, 43 days ago
I tested the gemma4 26 MoE 4-bit quantized GGUF on llama.cpp, following these guides with mmap'd I/O, on a 16GB MBP, and it was unbearably slow (0.0 t/s).