| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by redrove 84 days ago
	There is virtually no reason to use Ollama over LM Studio or the myriad of other alternatives. Ollama is slower and they started out as a shameless llama.cpp ripoff without giving credit and now they “ported” it to Go which means they’re just vibe code translating llama.cpp, bugs included.

8 comments

logicallee 83 days ago

>Ollama is slower

I've benchmarked this on an actual Mac Mini M4 with 24 GB of RAM, and averaged 24.4 t/s on Ollama and 19.45 t/s on LM Studio for the same ~10 GB model (gemma4:e4b), a difference which was repeated across three runs and with both models warmed up beforehand. Unless there is an error in my methodology, which is easy to repeat[1], it means Ollama is a full 25% faster. That's an enormous difference. Try it for yourself before making such claims.

[1] script at: https://pastebin.com/EwcRqLUm but it warms up both and keeps them in memory, so you'll want to close almost all other applications first. Install both ollama and LM Studio and download the models, change the path to where you installed the model. Interestingly I had to go through 3 different AI's to write this script: ChatGPT (on which I'm a Pro subscriber) thought about doing so then returned nothing (shenanigans since I was benchmarking a competitor?), I had run out of my weekly session limit on Pro Max 20x credits on Claude (wonder why I need a local coding agent!) and then Google rose to the challenge and wrote the benchmark for me. I didn't try writing a benchmark like this locally, I'll try that next and report back.

dminik 83 days ago

It depends on the hardware, backend and options. I've recently tried running some local AIs (Qwen3.5 9B for the numbers here) on an older AMD 8GB VRAM GPU (so vulkan) and found that:

llama.cpp is about 10% faster than LM studio with the same options.

LM studio is 3x faster than ollama with the same options (~13t/s vs ~38t/s), but messes up tool calls.

Ollama ended up slowest on the 9B, Queen3.5 35B and some random other 8B model.

Note that this isn't some rigorous study or performance benchmarking. I just found ollama unnaceptably slow and wanted to try out the other options.

alifeinbinary 84 days ago

I really like LM Studio when I can use it under Windows but for people like me with Intel Macs + AMD gpu ollama is the only option because it can leverage the gpu using MoltenVK aka Vulkan, unofficially. We're still testing it, hoping to get the Vulkan support in the main branch soon. It works perfectly for single GPUs but some edge cases when using multiple GPUs are unsupported until upstream support from MoltenVK comes through. But yeah, I agree, it wasn't cool to repackage Georgi's work like that.

gen6acd60af 84 days ago

LM Studio is closed source.

And didn't Ollama independently ship a vision pipeline for some multimodal models months before llama.cpp supported it?

zozbot234 83 days ago

Yes, they introduced that Golang rewrite precisely to support the visual pipeline and other things that weren't in llama.cpp at the time. But then llama.cpp usually catches up and Ollama is just left stranded with something that's not fully competitive. Right now it seems to have messed up mmap support which stops it from properly streaming model weights from storage when doing inference on CPU with limited RAM, even as faster PCIe 5.0 SSDs are finally making this more practical.

The project is just a bit underwhelming overall, it would be way better if they just focused on polishing good UX and fine-tuning, starting from a reasonably up-to-date version of what llama.cpp provides already.

iLoveOncall 84 days ago

> There is virtually no reason to use Ollama over LM Studio or the myriad of other alternatives.

Hmm, the fact that Ollama is open-source, can run in Docker, etc.?

DiabloD3 84 days ago

Ollama is quasi-open source.

In some places in the source code they claim sole ownership of the code, when it is highly derivative of that in llama.cpp (having started its life as a llama.cpp frontend). They keep it the same license, however, MIT.

There is no reason to use Ollama as an alternative to llama.cpp, just use the real thing instead.

simondotau 83 days ago

If it’s MIT code derived from MIT code, in what way is its openness ”quasi”? Issues of attribution and crediting diminish the karma of the derived project, but I don’t see how it diminishes the level of openness.

DiabloD3 83 days ago

FOSS licensing can only exist in terms of Copyright. Without Copyright, you cannot license FOSS. If something has an incorrect Copyright attribution, then the license can be viewed as invalid until this deficiency has been corrected (obv. depending on local laws, etc).

On top of this, it would not be unreasonable for the numerous authors of llama.cpp to issue DMCA takedown requests if Ollama is unwilling to correct it.

jrm4 83 days ago

Do y'all mean backend or the Ollama frontend or both? I find it trivially easy to sub in my local Ollama api thing in virtually all of the interesting frontend things. I'm quite curious about the "why not Ollama" here.

faitswulff 84 days ago

Does LM Studio have an equivalent to the ollama launch command? i.e. `ollama launch claude --model qwen3.5:35b-a3b-coding-nvfp4`

DiabloD3 84 days ago

I don't think it does, but llama.cpp does, and can load models off HuggingFace directly (so, not limited to ollama's unofficial model mirror like ollama is).

There is no reason to ever use ollama.

ffsm8 84 days ago

> I don't think it does, but llama.cpp does

I just checked their docs and can't see anything like it.

Did you mistake the command to just download and load the model?

DiabloD3 83 days ago

As a sibling comment answered you, it is `-hf`.

And yes, it downloads the model, caches it, and then serves future loads of that model out of the cache if the file hasn't changed in the hf repo.

ffsm8 83 days ago

So I'm summary: no, it does not have an equivalent command either.

u8080 83 days ago

-hf ModelName:Q4_K_M

ffsm8 83 days ago

Did you mistake the command to just download and load the model too?

Actually that shouldn't be a question, you clearly did.

Hint: it also opens Claude code configured to use that model

beanjuiceII 84 days ago

sure there's a reason...it works fine thats the reason

meltyness 84 days ago

I feel like the READMEs for these 3 large popular packages already illustrate tradeoffs better than hacker news argument

lousken 84 days ago

lm studio is not opensource and you can't use it on the server and connect clients to it?

jedisct1 84 days ago

LM Studio can absolutely run as as server.

walthamstow 84 days ago

IIRC it does so as default too. I have loads of stuff pointing at LM Studio on localhost