by rahimnathwani
508 days ago
IIRC it makes things a little easier, e.g. you don't need to specify a CLI flag to set how many layers to offload to the GPU, and it provides an API that other programs on your system can use (e.g. openwebui). It's been a while since I used llama.cpp directly, so I don't know whether I'm correct about its current scope.