| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by moffkalast 840 days ago
	I've tested with the Mixtral on LMSYS direct chat, gen params may vary a bit of course. In my experience running it locally it's been a lot more finicky to get it to work consistently compared to non-MoE models so I don't really keep it around anymore. 3.5-turbo's coding abilities are not that great, specialist 7B models like codeninja and deepseek coder match and sometimes outperform it.

1 comments

emporas 840 days ago

There is also Mistral-next, which they claim that it has advanced reasoning abilities, better than ChatGPT-turbo. I want to use it at some point to test it. Have you tried Mistral-next? Is it no good?

You were talking about reasoning and i replied about coding, but coding requires some minimal level of reasoning. In my experience using both models to code, ChatGPT-turbo and Mixtral are both great.

>3.5-turbo's coding abilities are not that great, specialist 7B models like codeninja and deepseek coder match and sometimes outperform it.

Nice, i will keep these two in mind to use them.

moffkalast 840 days ago

I've tried Next on Lmsys and Le Chat, honestly I don't think it's much different than Small, and overall kinda meh I guess? Haven't really thrown any code at it though.

They say it's more "concise" whatever that's supposed to mean, I haven't noticed it being any more succinct than the others.