by CamperBob2 113 days ago
Try the 27B dense model. It will likely do much better than the 35B MoE, which has only 3B active parameters.

Also, performance on research-y questions isn't always a good indicator of how the model will do for code generation or agent orchestration.


Currently sat waiting for the unsloth fixed quants to drop, but I'm on the edge of my seat for this.
Wait, didn't they drop like two days ago?
The 35B did, but not the 27B. Looks like the latter has been updated in the last half hour.
Neat! Thanks for correcting me there. I'll go and take a look.