I quite like the GPT models when chatting with them (in fact, they're probably my favorites), but for agentic work I've only had bad experiences with them. They're incredibly slow (via the official API or OpenRouter), but most of all they don't seem to understand the instructions I give them. I'm sure I'm _holding them wrong_, in the sense that I'm not tailoring my prompt for them, but most other models don't have a problem with the exact same prompt. Does anybody else have a similar experience?
That's really the story of my life. Trying to find a smart model with low latency.
Qwen 3.5 9B is almost smart enough, and I assume I can run it on a 5090 with very low latency. Almost. So I'm thinking of fine-tuning it a little for my application.