Hacker News
by dexterlagan 53 days ago
Many of us tested 27B and 35B side by side, and the dense 27B model is significantly smarter. It is indeed slower, but 35B makes a lot of mistakes that 27B doesn't.