Hacker News new | ask | show | jobs
by Tepix 265 days ago
Looks like Ollama is focusing more and more on non-local offerings. Also their performance is worse than say vLLM.

What's a good Ollama alternative (for keeping 1-5x RTX 3090 busy) if you want to run things like open-webui (via an OpenAI compatible API) where your users can choose between a few LLMs?

2 comments

At work I've set up LibreChat + LlamaSwap + llama.cpp

200 weekly users :)

How do you deal with different users wanting to use different LLMs at the same time?
i heard about Llamaswap and vllm