| HN Mirror

	by yencabulator 502 days ago
	And now you need a server per model? Ollama loads models on-demand, and terminates them after idle, all accessible over the same HTTP API.
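
	To make the "same HTTP API" point concrete, here is a minimal sketch, assuming an Ollama server on its default local address (http://localhost:11434) and using `llama3` and `mistral` purely as placeholder model names. The helper names `build_generate_request` and `generate` are made up for illustration. The only thing that changes between models is the `model` field in the request body; the `keep_alive` field asks the server how long to keep the model loaded after the request.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, keep_alive="5m"):
    """Build (but do not send) a POST request for Ollama's /api/generate.

    The endpoint is identical for every model; only the "model" field
    in the JSON body differs.
    """
    payload = {
        "model": model,            # placeholder name, e.g. "llama3"
        "prompt": prompt,
        "stream": False,           # ask for one JSON object instead of a stream
        "keep_alive": keep_alive,  # how long the model stays loaded when idle
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, keep_alive="5m"):
    """Send the request and return the generated text (needs a live server)."""
    req = build_generate_request(model, prompt, keep_alive)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage against a running server (model names are assumptions):
# for name in ("llama3", "mistral"):
#     print(name, "->", generate(name, "Say hi in five words."))
```

	Both calls in the commented usage go to the same URL, so no per-model server is involved on the client side; loading and unloading is left to the one Ollama process.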