Hacker News
by ziml77 485 days ago
I've found they will also introduce subtle changes. I recently used o1 to pull the code out of a Python notebook and remove all the intermediate output. It basically got it right, except for one string that was used to look up info from an external source: it dropped two characters from the end of that string. That issue took a while to track down because I thought it was a problem with the test environment!

Eventually I ended up looking at the notebook and the extracted code side by side and carefully checking every line. Even though the code was split across dozens of cells, it would have been faster to just manually copy the code out of each meaningful cell and paste it all together from the start.
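
For what it's worth, the copy-and-paste step can be done mechanically, so no string ever gets retyped. This is only a rough sketch, not what was actually used above: it assumes the notebook is a standard nbformat v4 `.ipynb` file (JSON with a top-level `cells` list), and `extract_code_cells` and the `demo` notebook are made-up names for illustration.

```python
import json


def extract_code_cells(notebook_json: str) -> str:
    """Concatenate the source of every code cell; outputs and markdown are skipped."""
    nb = json.loads(notebook_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat allows source to be a single string or a list of lines
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():  # drop empty cells
            chunks.append(source)
    return "\n\n".join(chunks) + "\n"


# Hypothetical in-memory notebook standing in for a real .ipynb file
demo = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# Lookup demo"]},
        {
            "cell_type": "code",
            "source": ['key = "ABC-12345"\n', "print(key)"],
            "outputs": [{"output_type": "stream", "text": ["ABC-12345\n"]}],
        },
        {"cell_type": "code", "source": "x = 1", "outputs": []},
    ],
})

extracted = extract_code_cells(demo)
print(extracted)
```

Since the cell text is copied byte for byte, a lookup string can't lose characters on the way out. Picking which cells are "meaningful" is still a manual decision.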

1 comment

I have seen LLMs do this. This is why I try to create a git repo, even if it is just local, and then diff my changes. This lets me catch the LLM changing something that was completely unrelated to the task it was working on.
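
With git, that is just committing before the LLM touches anything and running `git diff` afterwards. If there is no repo handy, the same check can be approximated with Python's standard `difflib`. This is a sketch of the idea rather than a replacement for git: `changed_lines` is a made-up helper, and the two strings are invented to mimic the dropped-characters bug described above.

```python
import difflib


def changed_lines(original: str, rewritten: str) -> list[str]:
    """Return only the removed (-) and added (+) lines between two texts."""
    diff = list(difflib.unified_diff(
        original.splitlines(),
        rewritten.splitlines(),
        fromfile="original",
        tofile="rewritten",
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    ))
    # The first two entries are the ---/+++ file headers; hunk headers start with @@
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Invented example: the rewrite silently trims two characters from a lookup key
before = 'key = "ABC-12345"\nprint(key)'
after = 'key = "ABC-123"\nprint(key)'

for line in changed_lines(before, after):
    print(line)
```

Anything in that output that is unrelated to the task you asked for is worth a second look.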