by sjanes
3 days ago
I've kind of given up on the routers for "free" inference. As you would expect, they tend to give you sub-par thinking because they are obviously trying to conserve as much inference as possible. I've had some success turning my MacBook M1 Pro into a heating pad with Qwen 3.6 35B A3B MTP. Trying to use Gemini models "locally" resulted in a similar "short shrift" of effort, leading to mistakes and lots of turns. The reports of Fable being relentlessly "proactive" show you can go the other direction as well, if you have strong enough branding and effective invoicing.

Xiaomi MiMo ($6/mo: https://platform.xiaomimimo.com/token-plan) & Alibaba Qwen ($50/mo: https://www.alibabacloud.com/en/campaign/ai-scene-coding) both offer generous limits on fixed-price subscriptions.