Hacker News
by CobaltFire 4 days ago
Given the hedges in your statement here, and the extremely questionable choice to trade a Q4 model with a less quantized cache for a Q6 model with a Q3 cache, I think it is safe to say this does not fit the title.

The Qwen3.6-35B model has, in my limited testing, been decent but not nearly as good as the Qwen3.6-27B. Running the 27B with a less quantized cache is going to be "better" for anyone using it for software dev.

1 comment

You are absolutely right. My title is very bad. I'll update it to a much less absolute statement. Sorry for that.

I am now trying to find the sweet spot with the 27B model + turbo8, so I guess I should have plenty of quality context left.

The error I made in my tests was to stop at a working configuration that maximised my hardware use, and to skip real, deep software tests. The one-shot 3D app I generated with the previous setup shows exactly this: I did not try my setup on real software development cases.

So thank you for the guidance. I am not new to agentic coding, but when it comes to a proper setup and the real trade-offs in inference engines, I need a deeper understanding to make better decisions.

The 27B Q6_K with turbo 8 for ~150K context should give me a real improvement on this stack. It's test party time :D
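
For anyone weighing the same trade-off between weight quantization and cache quantization, here is a rough back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: it treats the model as a plain transformer where every layer keeps a full K/V cache (hybrid-attention models will need less), it takes "turbo 8" to mean roughly 8 bits per cache element (not confirmed in this thread), and the layer count, KV head count, and head dimension are hypothetical placeholders, not the real Qwen3.6-27B values.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the quantized weights, in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bits_per_element: float) -> float:
    """Approximate KV cache memory, in GiB.

    Assumes every layer stores one K and one V vector per KV head per token.
    """
    elements = 2 * n_layers * n_kv_heads * head_dim * context_len  # 2 = K and V
    return elements * bits_per_element / 8 / GIB


if __name__ == "__main__":
    # Hypothetical architecture numbers, for illustration only.
    n_layers, n_kv_heads, head_dim = 64, 8, 128
    context_len = 150_000

    # ~6.56 bits per weight is an approximate figure for Q6_K.
    w = weights_gib(27e9, 6.56)
    for cache_bits in (16, 8, 3):
        kv = kv_cache_gib(n_layers, n_kv_heads, head_dim, context_len, cache_bits)
        print(f"cache {cache_bits:>2} bit: weights {w:.1f} GiB + KV {kv:.1f} GiB = {w + kv:.1f} GiB")
```

Plugging in the real numbers from the model's config should show how much memory a less quantized cache costs at ~150K context. It says nothing about output quality, which still needs real software development tests.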

Edit: oops, I also found I can no longer edit my boldly wrong title :/