Hacker News
by jerpint 814 days ago
Any news on how this model will compare to Mixtral? Interesting that they aren’t releasing an MoE model this time, given the success Mixtral had.
1 comment

Not yet, but I'm sure they will release some benchmarks soon. As for it not being an MoE model, there's still a ton of value in having a small non-MoE model for many use cases, and training improvements discovered on the small model can potentially carry over to the next version of the MoE model down the line.