| HN Mirror


	by refulgentis 881 days ago
	> Ollama is definitely the easiest way to run LLMs locally

	Nitro outstripped them: a 3 MB executable with an OpenAI HTTP server and persistent model loading

2 comments

jmorgan 881 days ago

Persistent model loading will be possible with https://github.com/ollama/ollama/pull/2146. Sorry it isn't available yet! More to come on file size and API improvements.

akulbe 881 days ago

I just wanted to say thank you for being communicative and approachable and nice.

evantbyrne 881 days ago

Who cares about executable size when the models are measured in gigabytes lol. I would prefer a Go/Node/Python/etc server for an HTTP service even at 10x the size over some guy's bespoke C++ any day of the week. Also, measuring the size of an executable after zipping is a nonsense benchmark in and of itself.

refulgentis 881 days ago

Not some guy, agree on zip, disagree entirely with tone of the comment (what exactly separates ollama from those same exact hyperbolic descriptions?)
