

	by dkarras 891 days ago
	No, it's Mistral: Mistral 7B and the Mixtral 8x7B MoE, which is almost on par with (or better than) ChatGPT 3.5. Mistral 7B itself packs a punch as well.

mark_l_watson 891 days ago

Mixtral 8x7B continues to amaze me, even though I have to run it with 3-bit quantization on my Mac (I only have 32 GB of memory). When I run this model on commercial services with 4 or more bits of quantization, I definitely notice, subjectively, better results.
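
As a rough back-of-the-envelope check on the memory point above, here is a minimal sketch that estimates weight storage at different quantization levels. It assumes roughly 46.7 billion total parameters for Mixtral 8x7B (the figure published by Mistral AI) and counts weights only; the function name `weight_memory_gb` is made up for illustration, and real quantized files add per-block overhead plus context cache on top of this.

```python
# Rough weight-only memory estimate for a quantized model.
# Assumption: ~46.7e9 total parameters for Mixtral 8x7B.
# Ignores KV cache, activations, and quantization metadata overhead.

MIXTRAL_8X7B_PARAMS = 46.7e9  # assumed total parameter count


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB, taking 1 GB = 1e9 bytes."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (3, 4, 8, 16):
        gb = weight_memory_gb(MIXTRAL_8X7B_PARAMS, bits)
        print(f"{bits:>2}-bit: ~{gb:.1f} GB")
```

By this estimate the weights alone are on the order of 17 to 18 GB at 3 bits and about 23 GB at 4 bits, so on a 32 GB machine the 4-bit version probably leaves little headroom once the OS and context cache are counted.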

I like to play around with smaller models and regular app code in Common Lisp or Racket, and Mistral 7B is very good for that: mixing and matching old-fashioned coding with the NLP, limited world knowledge, and data manipulation capabilities of LLMs.

dkarras 890 days ago

There is also MiQu (which I think stands for mi(s|x)tral quantized?), a leaked, older Mistral Medium model. I haven't been able to try it because it needs more RAM / VRAM than I have, but people say it is very good.

throwawaybbq1 890 days ago

This is neat to know. On Ollama, I see mistral and mixtral. Is the latter one the MoE model?

dkarras 890 days ago

Yes, mixtral is the MoE model.
