|
|
|
|
|
by breakpointalpha
262 days ago
|
|
Your millage may vary, but I just got Cursor (using Claude 4 Sonnet) to one shot a sequence of bash scripts that cleanup stale AWS resources. I pasted the Jira ticket description that I wrote, with a few examples and the script works perfectly. Saved me a few hours of bash writing and debugging because I can read bash, but not write it well. It seems that the smaller the task and the more tightly defined the input and output, the better the LLMs are at one-shotting. |
|
I’ve also had experiences where I started out well but the AI got confused, hallucinated, or otherwise got stuck. At least for me those cases have turned pathological because it always _feels_ like just one or two more tweaks to the prompt, a little cleanup, and you’ll be done, but you can end up far down that path before you realize that you need to step back and either write the thing yourself or, at the very least, be methodical enough with the AI that you can get it to help you debug the issue.
The latter case happens maybe 20% of the time for me, but the cost is high enough that it erases most of the time savings I’ve seen in the happy path scenario.
It’s theoretically easy to avoid by just being more thoughtful and active as a reviewer, but that reduces the efficiency gain in the happy path. More importantly, I think it’s hard to do for the same reason partially self driving cars are dangerous: humans are bad at paying attention well in “mostly safe and boring, occasionally disastrous” type settings.
My guess is that in the end we’ll see less of the problematic cases. In part because AI improves, and in part because we’ll develop better intuition for when we’ve stepped onto the unproductive path. I think a lot of it too will also be that we adopt ways of working that minimize the pathological “lost all day to weird LLM issues” problems by trying to keep humans in the loop more deeply engaged. That will necessarily also reduce the maximum size of the wins we get, but we’ll come away with a net positive gain in productivity.