I did something similar in my hobby project - the agent was prompted, among other things, to copy a signature from a build artifact into a JSON file. It worked fine until it didn’t - one day Claude 4 Sonnet randomly flipped one letter in the signature to something else. It wasn’t the end of the world, I caught the error because I always manually test whether the release worked, but it shows that AI tools should not be used as execution engines for CI/CD workflows. They're slow, inefficient and error prone. Just ask the AI to help you write a proper workflow with code.
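For illustration, the deterministic version of that signature step is only a few lines. This is a minimal sketch, assuming the signature sits in a text file next to the artifact and the JSON file is a flat object. The `copy_signature` helper, the `signature` key and the file names are hypothetical, not taken from any real project:

```python
import json
from pathlib import Path


def copy_signature(sig_path: Path, manifest_path: Path, key: str = "signature") -> str:
    """Copy the signature text from sig_path into the JSON file under `key`.

    Hypothetical helper: the file layout and key name are assumptions.
    """
    signature = sig_path.read_text(encoding="utf-8").strip()
    if not signature:
        raise ValueError(f"{sig_path} is empty")

    # Update only the one key and leave the rest of the JSON object untouched.
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    manifest[key] = signature
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")

    # Read the file back and compare, so a mismatch fails the release loudly.
    written = json.loads(manifest_path.read_text(encoding="utf-8"))[key]
    if written != signature:
        raise RuntimeError("signature in JSON file does not match the artifact")
    return signature
```

Called from a release script as something like `copy_signature(Path("app.sig"), Path("latest.json"))`, it copies the bytes verbatim every time, and the read-back check turns a silent one-letter flip into a hard failure.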
Thanks for pointing out something that some consider unpopular. Use AI tools all you want, but wherever you want a deterministic outcome, the current generation isn't up to that level.
We must acknowledge, understand and work around a technology's limitations.
What is the deterministic alternative you suggest?
I’m not endorsing this release practice in particular; it scares me. But I have been involved in a lot of automation projects where perfection was the initial goal, and was then abandoned because it became obvious that the non-automated work was just as imperfect. Human error is a fact of life.
If you really have to use an AI, at least use it to generate code once and use that. This way it's deterministic and you get a chance to understand what happens and to debug issues.
Not sure why an AI could create something you couldn't, however. And at least understanding what happens is part of the bundle.
Did it? I didn’t see a claim that doing this work manually had a zero error rate.
Again, I would probably not do this. But let’s not pretend that non-AI release processes prevent all issues. We’re really talking about different kinds of errors, and the AI-driven ones tend to be obviously wrong. At least right now.