HN Mirror

	by mark_l_watson 850 days ago
	Mixtral 8x7b continues to amaze me, even though I have to run it with 3-bit quantization on my Mac (I only have 32 GB of memory). When I run this model on commercial services with 4 or more bits of quantization, I definitely notice, subjectively, better results. I like to play around with smaller models and regular app code in Common Lisp or Racket, and Mistral 7b is very good for that. I enjoy mixing and matching old-fashioned coding with the NLP, limited world knowledge, and data manipulation capabilities of LLMs.
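
For a rough sense of why the bit width matters on a 32 GB machine, here is a back-of-the-envelope sketch. It assumes Mixtral 8x7b has roughly 46.7 billion total parameters (a figure not given in the post above) and counts only the weights, ignoring the KV cache, activations, and any per-block quantization metadata, so real usage will be somewhat higher. The name `weight_gib` is just a hypothetical helper for illustration.

```python
# Rough estimate of weight memory for a quantized model.
# ASSUMPTION: ~46.7e9 total parameters for Mixtral 8x7b (not stated in the thread).
TOTAL_PARAMS = 46.7e9


def weight_gib(params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB (bits -> bytes -> GiB)."""
    return params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    # Compare a few common bit widths; only the weights are counted.
    for bits in (3, 4, 8, 16):
        print(f"{bits:>2}-bit: ~{weight_gib(TOTAL_PARAMS, bits):.1f} GiB")
```

Under these assumptions, each extra bit per weight adds several GiB, which is consistent with dropping to 3 bits to leave headroom for the OS and context on a 32 GB machine.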

2 comments

dkarras 849 days ago

There is also MiQu (I think it stands for mi(s|x)tral quantized?), which is a leaked, older Mistral Medium model. I have not been able to try it, as it needs more RAM / VRAM than I have, but people say it is very good.

throwawaybbq1 849 days ago

This is neat to know. On Ollama, I see mistral and mixtral. Is the latter one the MoE model?

dkarras 849 days ago

Yes, mixtral is the MoE model.
