Hacker News new | ask | show | jobs
by dkarras 844 days ago
no it's mistral. mistral 7b and mixtral 8x7b MoE which is almost on par (or better than) chatgpt 3.5. Mistral 7b itself packs a punch as well.
1 comments

Mixtral 8x7b continues to amaze me, even though I have to run it with 3 bit quantization on my Mac (I just have 32G memory). When I run this model on commercial services with 4 or more bits of quantization I definitely notice, subjectively, better results.

I like to play around with smaller models and regular app code in Common Lisp or Racket, and Mistral 7b is very good for that. Mixing and matching old fashioned coding with the NLP, limited world knowledge, and data manipulation capabilities of LLMs.

There is also MiQu (stands for mi(s|x)tral quantized I think?) which is a leaked and older mistral medium model. I have not been able to try it as it needs some RAM / VRAM I don't have but people say it is very good.
This is neat to know. On Ollama, I see mistral and mixtral. Is the latter one the MoE model?
yes, mixtral is the MoE model.