by aspenmartin 111 days ago
post mortems / bug hunting -- pinpointing what part of the logic was to blame for a certain problem.
2 comments

This is what granular commits are for. The kilobytes-long log of Claude running in circles over bullshit isn't going to help anyone.
I think the parent comment is asking “why did the agent produce this bug, and why wasn't it caught”, which is a separate problem from the one granular commits solve: finding the bug in the first place.
There is no "why." It will give reasons but they are bullshit too. Even with the prompt you may not get it to produce the bug more than once.

If you sell a coding agent, it makes sense to capture all that stuff because you have (hopefully) test harnesses where you can statistically tease out which prompt changes caused bugs. Most projects won't have those, and anyway you don't control the whole context if you are using one of the popular CLIs.
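A rough sketch of what "statistically tease out" could look like, assuming a harness that reruns the same task many times under two prompt variants and counts the failing runs. The function name and the counts in the usage line are made up for illustration, and the one-sided Fisher's exact test is just one reasonable choice, not something the comment specifies.

```python
from math import comb


def fisher_one_sided(fail_a, n_a, fail_b, n_b):
    """One-sided Fisher's exact test on two sets of harness runs.

    Returns the probability that variant B would collect at least
    `fail_b` of the observed failures if both prompt variants really
    failed at the same rate (hypergeometric null).
    """
    total_fail = fail_a + fail_b
    total_runs = n_a + n_b
    denom = comb(total_runs, total_fail)
    p = 0.0
    # Sum the hypergeometric tail: k failures in B, the rest in A.
    for k in range(fail_b, min(total_fail, n_b) + 1):
        p += comb(n_b, k) * comb(n_a, total_fail - k) / denom
    return p
```

Hypothetical usage: `fisher_one_sided(fail_a=2, n_a=200, fail_b=11, n_b=200)` gives a p-value for "prompt B is no worse than prompt A". A small value suggests the prompt change mattered, but only if the runs are otherwise comparable, which is exactly what you can't guarantee when a third-party CLI controls part of the context.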

If I have a session history or histories, I can mine them (and have!) to pinpoint where an agent did not implement what it was supposed to, or to understand who asked for a certain feature and why, etc. It complements commits: a session is more like a court transcript of what was said or claimed, and you can then compare that to what was actually done (the commits).
Some of my sessions are over 1GB at this point. I just don't think this scales usefully or meaningfully. Those things should live as summarized artifacts within issue tracking, IMHO.
Then look at the code; the session will only confuse. To read an LLM's explanation is to anthropomorphize what will just be a probabilistic incident.
No, you look at the session to understand the context for the code change: what did you _ask_ the LLM to do? Did it do it? Where did a certain piece of logic go wrong? Session history has been immensely useful to me, and it serves as important documentation of the entire flow of the project. I don't think people should look at session histories at all unless they need to.