by KronisLV 102 days ago

I had an annoying issue in a setup with two Nvidia L4 cards: trying to run the MoE versions to get decent performance just didn't work with Ollama. It seems to be the same problem as these:

https://github.com/ollama/ollama/issues/14419

https://github.com/ollama/ollama/issues/14503

So for now I'm back to Qwen 3 30B A3B, which is kind of a bummer: the previous model is pretty fast but kinda dumb, even for simple tasks like on-prem code review!