by lelandbatey
21 days ago
A gaming PC can already host models that serve casual users perfectly well, the ones who just want recipes, todo tracking, picture identification, etc. E.g. Qwen 3.6 35b, which will run at 75 t/s on a $650 GPU (Nvidia 1660 ti 16GB). That model also works excellently as a tool-calling coding model (it's no Opus, but for something that costs only energy once it's set up, it's incredible). It can type faster than you can, probably 10x faster, so with guidance it'll make you faster. And it's free. It's here. If folks want ChatGPT without a subscription, they can have it today on their own computer. The only money to be made is in the high-end models doing "serious business" work spanning 1M+ token contexts and massive uncertainty. Everything else is already set to be eaten by today's local models.
|
Here's a prompt I just ran against Claude Opus 4.7:
> Use python3 to experiment with whether the SQLite3 authorizer mechanism can be used to detect an INSERT OR REPLACE based just on running an explain query without examining the SQL string itself
Opus nailed it: https://claude.ai/share/c4212606-3fee-4b7c-bc97-505e0348ccac
I tried the same thing against qwen/qwen3.5-35b-a3b running locally in lmstudio, with the Pi coding agent. At first it looked like it was going to do great! And then it fell apart over the course of several tool calls: https://gisthost.github.io/?8ae2f842df619fb7fd8f1ccd82fe41c7
I'm used to GPT-5.5 and Opus 4.7 handling that kind of prompt without any problems at all.
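For anyone who wants to poke at the experiment in that prompt themselves, here is a minimal sketch of the setup, not the code either model produced. It registers a Python `sqlite3` authorizer callback, runs `EXPLAIN` on a plain `INSERT` and on an `INSERT OR REPLACE`, and records what the authorizer saw for each. The table `t`, the two sample statements, and the helper name `explain_events` are made up for illustration, and the sketch makes no claim about whether the two event lists actually differ; that is the thing to inspect.

```python
import sqlite3

# Hypothetical sample statements; any table and values would do.
PLAIN_INSERT = "INSERT INTO t (id, name) VALUES (1, 'a')"
INSERT_OR_REPLACE = "INSERT OR REPLACE INTO t (id, name) VALUES (1, 'b')"


def explain_events(statements):
    """Run EXPLAIN on each statement and return the authorizer
    callbacks that were recorded while it was being prepared."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")

    current = []

    def record(action, arg1, arg2, db_name, source):
        # Record the action code and its two arguments, then allow it.
        current.append((action, arg1, arg2))
        return sqlite3.SQLITE_OK

    # Install the authorizer only after the table exists, so the
    # CREATE TABLE callbacks do not end up in the log.
    conn.set_authorizer(record)

    results = {}
    for sql in statements:
        current.clear()
        # EXPLAIN compiles the statement without executing it; the
        # SQL string itself is never parsed by this script.
        conn.execute("EXPLAIN " + sql).fetchall()
        results[sql] = list(current)

    conn.close()
    return results


if __name__ == "__main__":
    for sql, events in explain_events([PLAIN_INSERT, INSERT_OR_REPLACE]).items():
        print(sql)
        for event in events:
            print("   ", event)
```

Comparing the two printed lists by hand (or diffing them) shows whether the authorizer alone gives any signal for the `OR REPLACE` case. Action codes print as integers; they can be matched against constants such as `sqlite3.SQLITE_INSERT`.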