by bitdeep 912 days ago
You missed the Ollama option.

I was hoping I could run my LLM CLI tool against Ollama via their localhost API, but it looks like they don't offer an OpenAI-compatible endpoint yet.

If they add that it will work out of the box: https://llm.datasette.io/en/stable/other-models.html#openai-...

Otherwise someone would need to write a plugin for it, which would probably be pretty simple - I imagine it would look a bit like the llm-mistral plugin but adapted for the Ollama API design: https://github.com/simonw/llm-mistral/blob/main/llm_mistral....
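For anyone curious what talking to the localhost API involves, here is a minimal sketch using only the Python standard library. It is not the plugin itself and not how llm-mistral does it. It assumes Ollama's default address (`http://localhost:11434`) and its `/api/generate` route with a non-streaming JSON body; `build_request` and `generate` are hypothetical helper names made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local port and exposes
# the /api/generate route. Adjust if your setup differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "mixtral") -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "mixtral") -> str:
    """Send the prompt to a running Ollama server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the reply is expected as a single JSON object
    # whose "response" field holds the generated text.
    return body["response"]
```

With the server running and the model pulled, `generate("Say hello")` should return the model's text. A real plugin would presumably wrap something like this in the LLM tool's model class and handle streaming as well.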

Which honestly is the easiest option of them all if you own an Apple Silicon-based Mac. You just download Ollama and then run `ollama run mixtral` (or choose a quantization from their models page if you don't have enough RAM to run the default q4 model) and that's it.
I tried an hour ago and got a "can't load model" error. Everything is up to date. Is there any special step?
Tried `ollama pull mixtral` just now and it seems to be working, albeit pretty slowly.
How much RAM do you have? Mixtral is a beast and the non-quantized model wants 40GB+ of memory.
Ah, that might be it! I only have 32GB.
The q2 should fit.
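A rough way to sanity-check whether a quantization fits is parameters times bits per weight. The sketch below is back-of-the-envelope only: the parameter count (about 46.7 billion for Mixtral 8x7B) and the bits-per-weight figures for each quantization are my assumptions, not numbers from the Ollama models page, and the fixed overhead for context and the OS is a guess.

```python
# Assumption: Mixtral 8x7B has roughly 46.7 billion parameters in total.
MIXTRAL_PARAMS = 46.7e9

# Assumption: approximate effective bits per weight for each format.
BITS_PER_WEIGHT = {"fp16": 16.0, "q4": 4.5, "q2": 2.6}


def approx_weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Estimate the size of the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_ram(weights_gb: float, ram_gb: float, overhead_gb: float = 6.0) -> bool:
    """Crude check: weights plus a guessed overhead must fit in total RAM."""
    return weights_gb + overhead_gb <= ram_gb


for name, bits in BITS_PER_WEIGHT.items():
    size = approx_weights_gb(MIXTRAL_PARAMS, bits)
    print(f"{name}: ~{size:.0f} GB of weights, fits in 32 GB: {fits_in_ram(size, 32)}")
```

Treat the result as a first filter only. Actual usage depends on context length, and how much memory the OS lets the GPU use also matters.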