
    #!/bin/sh
    export OLLAMA_MODELS="/mnt/ai-models/ollama/"
    printf 'Starting the server now.\n'
    ollama serve >/dev/null 2>&1 &
    serverPid="$!"
    printf 'Starting the client (might take a moment (~3min) after a fresh boot).\n'
    ollama run llama3.2 2>/dev/null
    printf 'Stopping the server now.\n'
    kill "$serverPid"

this was pretty much spot-on with my own experience and path. the ridicule of people who choose ollama over llamacpp is so tired.

i had already burned an evening trying to debug and fix issues, getting nowhere fast, until i pulled ollama and had it working with just two commands. it was a shock. (granted, there is/was a crippling performance problem with skylake/kaby lake chips, but it was mitigated if you had any kind of mid-tier GPU and tweaked a couple of settings.)

anyone who tries to contribute to the general knowledge base of deploying llamacpp (like TFA) is doing heaven's work.