Hacker News
by oynqr 507 days ago
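Just do it? llama.cpp doesn't load the whole model file into RAM. By default it mmaps the file, and the kernel takes care of the rest: pages get read in when they're first touched and can be dropped again under memory pressure.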
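To make the mmap point concrete, here's a rough Python sketch. It is not llama.cpp's code, and `read_slice_via_mmap` and `make_demo_file` are names made up for this example. It maps a file read-only and reads a few bytes from the end; the OS only needs to page in the parts that actually get touched, not the whole file.

```python
import mmap
import os
import tempfile


def make_demo_file(size_bytes):
    """Create a temp file of `size_bytes` whose last 4 bytes are b"tail".

    Hypothetical helper standing in for a large model file.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.seek(size_bytes - 4)  # leave a hole of zero bytes before the tail
        f.write(b"tail")
    return path


def read_slice_via_mmap(path, offset, length):
    """Map the file read-only and return `length` bytes starting at `offset`.

    Mapping does not read the file up front; the kernel faults in the
    pages backing the requested slice when they are accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


if __name__ == "__main__":
    size = 16 * 1024 * 1024
    path = make_demo_file(size)
    try:
        print(read_slice_via_mmap(path, size - 4, 4))
    finally:
        os.remove(path)
```

Same idea at model scale: a file bigger than free RAM can still be mapped, and how well it runs depends on how often the kernel has to re-read evicted pages from disk.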