|
|
|
|
|
by arendtio
565 days ago
|
|
I tried building and using llama.cpp multiple times, and after a while, I got so frustrated with the frequently broken build process that I switched to ollama with the following script: #!/bin/sh
export OLLAMA_MODELS="/mnt/ai-models/ollama/"
printf 'Starting the server now.\n'
ollama serve >/dev/null 2>&1 &
serverPid="$!"
printf 'Starting the client (might take a moment (~3min) after a fresh boot).\n'
ollama run llama3.2 2>/dev/null
printf 'Stopping the server now.\n'
kill "$serverPid"
And it just works :-) |
|
i had already burned an evening trying to debug and fix issues getting nowhere fast, until i pulled ollama and had it working with just two commands. it was a shock. (granted, there is/was a crippling performance problem with sky/kabylake chips but mitigated if you had any kind of mid-tier GPU and tweaked a couple settings)
anyone who tries to contribute to the general knowledge base of deploying llamacpp (like TFA) is doing heaven's work.