by jzapletal
110 days ago
I did this while trying to figure out what to use in our own tool. The task was to analyze around 12,000 screenshots and find recurring manual workflows worth automating.

Results:

- Claude Sonnet 4.6: 8/10, $0.53/run (wins on quality)
- Kimi K2.5: 7/10, $0.09/run (roughly 6x cheaper than Sonnet, now my production pick)
- GPT-5.2: 6/10, $0.41/run (oddly, missed the most obvious patterns)
- DeepSeek V3.2: 0/10 (gave me a garbled XML...)

Models that flagged a one-time DKIM setup as a "recurring automation candidate" got penalized.

Happy to share more if folks find this interesting.
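For anyone who wants to check the cost comparison, here is a rough sketch of the arithmetic in Python. The scores and per-run prices are the ones reported above; the `results` dict and the helper names `cost_per_point` and `cost_ratio` are just illustrative, and "cost per quality point" is my own way of slicing the numbers, not part of the original scoring. DeepSeek V3.2 is left out because no per-run cost was reported for it.

```python
# Scores (out of 10) and cost per run in USD, as reported in the comment.
results = {
    "Claude Sonnet 4.6": (8, 0.53),
    "Kimi K2.5": (7, 0.09),
    "GPT-5.2": (6, 0.41),
}


def cost_per_point(score, cost):
    """Dollars spent per quality point (illustrative metric, lower is better)."""
    return cost / score


def cost_ratio(expensive, cheap):
    """How many times more one model costs per run than another."""
    return results[expensive][1] / results[cheap][1]


for name, (score, cost) in results.items():
    print(f"{name}: ${cost_per_point(score, cost):.3f} per quality point")

ratio = cost_ratio("Claude Sonnet 4.6", "Kimi K2.5")
print(f"Sonnet costs {ratio:.1f}x as much per run as Kimi")
```

The per-run ratio between Sonnet and Kimi works out to a bit under 6, which is where the "6x cheaper" figure comes from.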