Hacker News
by Dkuku 873 days ago
In llama.cpp you can offload some of the layers to the GPU with -ngl X, where X is the number of layers to offload. The remaining layers stay on the CPU.
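
For anyone scripting this, here is a minimal sketch of how the flag might be passed when building the command from Python. Only -ngl comes from the comment above. The binary name (./main), the model path, the prompt, and the helper name build_llama_command are hypothetical placeholders, and the sketch assumes the usual -m (model) and -p (prompt) options, so check your build's --help before relying on it.

```python
import shlex


def build_llama_command(binary, model_path, n_gpu_layers, prompt):
    """Build an argv list for a llama.cpp run with partial GPU offload.

    All arguments are placeholders for illustration; only the -ngl flag
    itself is taken from the comment above.
    """
    if n_gpu_layers < 0:
        raise ValueError("n_gpu_layers must be non-negative")
    return [
        binary,
        "-m", model_path,           # assumed: path to the model file
        "-ngl", str(n_gpu_layers),  # number of layers to offload to the GPU
        "-p", prompt,               # assumed: prompt text
    ]


# Hypothetical values: adjust the binary, model path, and layer count.
cmd = build_llama_command("./main", "models/model.gguf", 20, "Hello")
print(shlex.join(cmd))
```

The list form can be handed straight to subprocess.run(cmd). A reasonable way to pick X is to start low and raise it until the GPU runs out of memory, then back off.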