by CobaltFire
4 days ago
Given the hedges in your statement here, and the extremely questionable choice to trade a Q4 model with a less quantized cache for a Q6 model with a Q3 cache, I think it is safe to say this does not fit the title. The Qwen3.6-35B model has, in my testing, been decent but not nearly as good as the Qwen3.6-27B. In my limited testing, running the 27B with a less quantized cache is going to be "better" for anyone using it for software dev.

I am now trying to find the sweet spot with the 27B model + turbo8, so I guess I should have plenty of quality context left.
The error I made in my tests was to stop at a working configuration that maximised my hardware use, and to skip real, deep software tests. The one-shot 3D app I generated with the previous setup shows exactly this: I never tried my setup on real software development cases.
So thank you for the guidance. I am not new to agentic coding, but when it comes to a proper setup, I need a deeper understanding of the real trade-offs in inference engines to make better decisions.
The 27B Q6_K with turbo8 at ~150K context should give me a real improvement on this stack. It's test party time :D
Edit: oops, I also found that I can no longer edit my bold, wrong title :/