by verdverm 29 days ago
Since I started using Qwen-3.6 35B A3B, I've come to believe these smaller models will have more than enough frontier-like capability within a year or two, at least for coding. They don't need to memorize facts in their weights, which likely has very interesting implications that I'm not going to speculatively decode.