by simonw
21 days ago
The problem with models like Qwen 3.6 35B (which really is an excellent model) is that my expectations of what a model can do have gone SO high now.

Here's a prompt I just ran against Claude Opus 4.7:

> Use python3 to experiment with whether the SQLite3 authorizer mechanism can be used to detect an INSERT OR REPLACE based just on running an explain query without examining the SQL string itself

Opus nailed it: https://claude.ai/share/c4212606-3fee-4b7c-bc97-505e0348ccac

I tried the same thing against qwen/qwen3.5-35b-a3b running locally in lmstudio, with the Pi coding agent. At first it looked like it was going to do great! And then it fell apart over the course of several tool calls: https://gisthost.github.io/?8ae2f842df619fb7fd8f1ccd82fe41c7

I'm used to GPT-5.5 and Opus 4.7 handling that kind of prompt without any problems at all.