Hacker News new | ask | show | jobs
by diogolsq 355 days ago
Code review has become the new bottleneck, since it’s the layer that prevents sloppy AI-generated code from entering the codebase.

One thing I do that helps clean things up before I send a PR is writing a summary. You might consider encouraging your peers to do the same.

## What Changed?

Functional Changes:

      - New service for importing data

      - New async job for dealing with z.
Non-functional Changes:

       - Refactoring of Class X

        - Removal of outdated code
It might not seem like much, but writing this summary forces you to read through all the changes and reflect. You often catch outdated comments, dead functions left after extractions, or other things that can be improved—before asking a colleague to review it.

It also makes the reviewer’s life easier, because even before they look at the code, they already know what to expect.

2 comments

Ha. Almost always when I see PRs with such summaries I can assume that both the summary and the code has been AI-generated.

PRs in general shouldn't require elaborate summaries. That's what commit messages are for. If the PR includes many commits where a summary might help, then that might be a sign that there should be multiple PRs.

1. about looking artificial.

Granted, it is not only summaries that go into the description—how to test, if there is any pre-deploy or post-deploy setup, any concerns, external documentation, etc.

Less is more. A summary serves to clarify, not to endlessly add useless information.

2. about the usefulness of summaries.

Summaries always provide better information—straight to the point—than commits (which are historical records). This applies to any type of information.

When you’re reporting a problem by going through historical facts, it can lead to multiple narratives, added complexity, and convoluted information.

Summaries that quickly deliver the key points clearly and focus only on what’s important offer a better way to communicate.

If the listener asks for details, they already have a clear idea of what to expect. A good summary is a good introduction to what you are going to see in the commits messages and in the code changes.

______________________

3.About multiple Prs.

Summary helps to clarify what is scope creep (be it a refactor or unrelated code to the ticket);

it make it easier for the reviewer demand a split in multiple PRs.

examples: A non-summary PR/MR might lead to the question—“WHY is this code here?"

"he touched a class here, was he fixing something that the test missed out ? or is just a refactor?"

_______________

As a reviewer you can get those information by yourself, although summary helps you to get it much quicker.

> A non-summary PR/MR might lead to the question—“WHY is this code here?"

This is precisely what a (good) commit message should answer.

Commits are historical records, sure, but they can include metadata about the change, which should primarily explain why the change was made, what tradeoffs were made and why, and any other pertinent information.

This is useful not just during the code review process, but for posterity whenever someone needs to understand the codebase, while bisecting, etc. If this information is only in the PR, it won't be easy to reference later.

FWIW I'm not against short summaries in PRs that are exceptionally tricky to understand. The PR description is also useful as a living document for keeping track of pending tasks. But in the majority of cases, commit messages should suffice. This is why GitHub, and I'm sure other forges as well, automatically fill out the PR title and description with the commit information, as long as there's only one commit, which is the ideal scenario. For larger PRs, if it doesn't make sense to create multiple, I usually just say "See the commits for details".

Depends on the business logic, sometimes summaries (or a short demo explanation) help a lot to understand the made tradeoffs, so the reviewer can contribute more without spending too much time. It is especially helpful if the part is somewhat isolated.

But most of the time it is not very necessary.

In theory this makes sense, but in practice now that LLMs are writing the PR summaries we just have even more slop to wade through to figure out exactly what the change is trying to achieve. I think the slide in this direction already started with exhaustive PR templates that required developers to fill in a whole bunch of fluff just to open their PR. The templates didn't make bad developers good, it just caused them to produce more bad content for review.

My experience with LLM-generated summaries is the same as it was with the templates: many complete them in a way that is entirely self-referential and lacking in context. I don't need a comment or a summary to describe to me exactly the same thing I could have already understood by reading the code. The reason for adding English-language annotations to source code is to explain how a particular change solves a complex business problem, or how it fits into a long-term architectural plan, that sort of thing. But the kinds of people who already did not care about that high level stuff don't have the context to write useful summaries, and LLMs don't either.

The worst thing I've seen recently is when you push for more clarity and context on the reasons behind a change, and then that request gets piped into an LLM. The AI subsequently invents a business problem or architectural goal that in reality doesn't exist and then you get a summary that looks plausible in the abstract, and may even support the code changes it is describing, but it still doesn't link back to anything the team or company is actually trying to achieve, and that costs the reviewer even more time to check. AI proponents might say "well they should have fed the team OKRs and company mission/vision/values into the LLM for context" but then that defeats the point of having the code review in the first place. If the output is performative and not instructive, then the whole process is a waste of time.

I am not sure what the solution is, although I do think that this is not a problem that started with LLMs, it's just an evolution of a challenge we have always faced - how to deal with colleagues who are not really engaged.