Hacker News new | ask | show | jobs
by firtoz 878 days ago
What are good API providers that serve mixtral? I know only octo ai which seems decent but will be good to know alternatives too
11 comments

The creators of the model actually have their own platform where you can access this model and others via API: https://console.mistral.ai/
I just discovered Groq, which does 485.08 T/s on mixtral 8x7B-32k

No idea on pricing but supposedly one can email to api@groq.com

I think you can try it online at chat.groq.com
(Groqster here). Yes, you can select Mixtral from the dropdown menu. If anyone has any questions about Groq let me know and I'll do my best to answer!
OpenRouter is generally a good option (already mentioned), the best part is that you have a unified API for all LLMs, and the pricing is the same as with the providers themselves. Although for OpenAI/Anthropic models they were forced (by the respective companies) to enable filtering for inputs/outputs.
Both already mentioned, but I am using Anyscale Endpoints with great success, very fast and will work on ten jobs at a go out of the box. Together.ai also seems to work fine in my initial tests, but haven't tried it at scale yet.
I have used both Mistral’s commercial APIs and also AnyScale’s commercial APIs for mixtral-8-7b- both providers are easy to use.

I also run a 3 bit quantization of mixtral-8-7b on my M2 Pro 32G memory system and it is fairly quick.

It is great having multiple options.

openrouter, fireworks, together.

we use openrouter but have had some inconsistency with speed. i hear fireworks is faster, swapping it out soon.

I work for Groq and we serve the fastest available version of Mixtral (by far) and we also have a web chat app. I'll refrain from linking it because it has already been linked and I don't want to spam, but I'm available to answer any questions people have about Groq's hardware and service.
Together.ai seems to be the best, incredibly fast.
Not so sure about that. Check out https://github.com/ray-project/llmperf-leaderboard

And try mixtral on chat.groq.com

These guys are much faster than openrouter, and their llama2 runs faster than 3.5-turbo. Amazing work.
I personally like Anyscale Endpoints
I've had good experiences with Together, and they have very competitive pricing.