by Y_Y 508 days ago
So it's a replacement for Ollama?

The killer features of Ollama for me right now are the nice library of quantized models and the ability to automatically start and stop serving models in response to incoming requests and timeouts. The first seems to be solved by reusing the Ollama models, but from my cursory look I can't tell whether the second is possible.

1 comment

> ramalama can just pull (almost) any arbitrary model off huggingface and run it ... you're not limited to just what ollama has repackaged into their non-standard format
Ollama has the ability to pull models off Hugging Face as well:

https://huggingface.co/docs/hub/en/ollama
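For anyone who doesn't want to click through: as far as I can tell from that page, you point Ollama at a GGUF repo with a reference of the form `hf.co/{username}/{repository}`, optionally followed by `:{quantization}`. Here is a rough Python sketch that just builds that reference and the matching `ollama run` command. The helper names and the `some-user/some-model-GGUF` repo are made up for illustration, and nothing runs unless you execute the script yourself with Ollama installed.

```python
import shlex
from typing import List, Optional


def hf_model_ref(username: str, repository: str,
                 quantization: Optional[str] = None) -> str:
    """Build a model reference of the form hf.co/{username}/{repository}[:{quantization}].

    Hypothetical helper: the naming scheme follows the linked Hugging Face
    docs page, but this function is not part of Ollama or any library.
    """
    ref = f"hf.co/{username}/{repository}"
    if quantization:
        # An optional tag selects a specific quantization from the repo.
        ref += f":{quantization}"
    return ref


def ollama_run_command(username: str, repository: str,
                       quantization: Optional[str] = None) -> List[str]:
    """Return the argv list for `ollama run <ref>` without executing it."""
    return ["ollama", "run", hf_model_ref(username, repository, quantization)]


if __name__ == "__main__":
    # Placeholder repo name; substitute a real GGUF repository on the Hub.
    cmd = ollama_run_command("some-user", "some-model-GGUF", "Q4_K_M")
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

If you actually want to launch it, you could pass the returned list to `subprocess.run`. I left that out so the snippet stays safe to paste anywhere.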