Y
Hacker News
new
|
ask
|
show
|
jobs
by
ru552
791 days ago
easiest is probably with ollama [0]. I think the ollama API is OpenAI compatible.
[0]
https://ollama.com/
2 comments
talldayo
791 days ago
Most inference servers are OpenAI-compatibile. Even the "official" llama-cpp server should work fine:
https://github.com/ggerganov/llama.cpp/blob/master/examples/...
link
pants2
791 days ago
Ollama runs locally. What's the best option for calling the new Mixtral model on someone else's server programmatically?
link
Arcuru
791 days ago
Openrouter lists several options:
https://openrouter.ai/models/mistralai/mixtral-8x22b
link