Running Ollama locally is very slow (or low quality, if you drop to a smaller model to make it usable). I think a good middle ground is renting a GPU or TPU by the minute and running the same kind of open model there.
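
For what it's worth, here is a rough sketch of what talking to such a rented box could look like, assuming it runs an Ollama server reachable on the default port 11434 and that you use Ollama's `/api/generate` endpoint. The host name `gpu-box.example` and the model name `llama3` are placeholders I made up, not anything specific to a provider.

```python
import json
import urllib.request


def build_generate_request(host: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"http://{host}:11434/api/generate",  # 11434 is Ollama's default port
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(host: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the remote server and return the generated text."""
    request = build_generate_request(host, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Hypothetical usage (placeholder host and model, needs a running server):
# print(generate("gpu-box.example", "llama3", "Why is the sky blue?"))
```

By default Ollama only listens on localhost, so on the rented machine you would either set `OLLAMA_HOST=0.0.0.0` or, probably safer, forward the port over SSH and pass `localhost` as the host.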