Hacker News new | ask | show | jobs
by dylukes 966 days ago
> In my book a general code review is simple sanity check by a second pair of eyes, which can result in suggestions to use different API or use an API slightly differently.

This is an impoverished view of code review. Code review is a principal mechanism for reducing individual code ownership, for propagating conventions, and for skill transfer.

A good code review starts with a good PR: one that outlines what its goals were and how it achieved them.

First item on a good review then: does the code achieve what it set out to do per the outline?

Second: does it contain tests that validate the claimed functionality?

Third: does it respect the architectural conventions of the system so far?

Fourth: is the style in line with expectations?

Fifth, and finally: do you have any suggestions as to better API usage?

A code review that is nothing more than a sanity check is useless and could have been done by CI infrastructure. Code review is a human process and should maximally take advantage of the things only humans could do. Leave the rest to machines.

An implicit question in several of the above is "will this set a good example for future contributions?"

8 comments

I go back and forth on this.

On the one hand, I really like constant deep feedback. I really like the consistency benefits of having another person say “that’s too much, I find that unreadable.”

On the other, I have now been at a lot of places where it was very hard to get my code reviewed, latencies of days and sometimes weeks if folks are in a particularly heinous crunchtime... And then when it does get reviewed, the stuff that I worked really hard on to get the right speed or to properly centralize cache invalidation... Suddenly someone is like “I would have done it from this other approach” and you have no idea whether it's tractable or not.

While I have never been at a place that did this, I have in my head the idea that the code should be an unfolding collective conversation, kind of like when folks are all collaborating on a shared Google Doc, I see that you are editing this section and I throw in a quick comment “don't forget to add XYZ” and then jump to a different part that I won't be stepping on their toes with. So the basic idea would be to get everybody to merge to `main` like 2 or 3 times a day if possible. In that case code review really is just “make sure this doesn't break the build or break prod, everything is behind a feature toggle, if I don't like the design I will comment on the code near the design or sketch a proof of concept showing that my approach is superior in clarity or speed or whatever”... Nobody ever takes me up on this culture shift it seems.

In this kind of context, I ask people what log level they'd like their review at. If you just want to get the code out the door, by all means, "error" or "warn" might be the right review depth, when you're confident in your code and don't want to be derailed with philosophy.

If you're exploring a new concept and want all the ideas and brainstorming you can get in your feedback, "debug" log level is appropriate.

Once that idea has moved down the pipe, you may be down to "info" or "warn" depending on how much conversation has happened around the PR.

This is a really useful analogy, thank you for sharing it.
> In this kind of context, I ask people what log level they'd like their review at.

It's a code review, not a QA session.

Isn’t QA the whole point of code review?
Code review is part of the QA but code review is code review. Tell me everything what you see, don't selectively skip things.
This is called Continuous Integration, and its a shame that the term got nicked to now mean "that thing that builds our PRs somewhere". The idea was that everyone pushes to main all the time, which basically reduces integration time to 0, as everyone is doing it every couple of minutes on big teams. After a while you learn how to not step on people's toes (Introduce new classes incrementally, use docblocs documenting class' intent, rather than just current functionality)

I too find that its basically impossible to suggest switching to this workflow, given the weight of all our existing tools. They are so easy to setup, and most come for free (Azure Devops/Pipelines thing) that going off the track is just unthinkable.

Everyone committing to the main branch all the time sounds like a nightmare to me, to be honest. I would probably seriously consider quitting if this was forced on me.

(Or rather, I would probably just run on my private git copy, and only pull every once in a while, and ignore that the main branch always changes.)

When / how do you do code review in your suggested workflow?

To be fair. I have never really heard of pushing to master as a CI technique for git. You would push to your own publicly visible branch and some script would automatically attempt to merge into a CI branch unless there's merge conflicts in which case you get notified. Then as you add things you get notified if the whole thing is breaking and get made aware of upcoming issues.
Yeah when I set up a repo, I set up two or three builds per PR:

- the code exactly as it is in that branch

- that branch merged into current main

- that branch merged into what is currently running in production (in the case that it's not what is in main e.g. tagged or branched)

This really helps to see whether it's an issue in "your" code, or whether it's a merge issue conflict causing failures.

Making those provisionally merge commits (and testing them) is a good idea, if you can spare the computational resources.

See https://bors.tech/ for a similar idea.

What I would object to is changing the 'official' master every few minutes automatically.

If there's no merge conflict (and the tests pass), would the tool also change master to the new merge commit?

Or are all the merge commits just made provisionally to inform you, but don't change master?

>would the tool also change master to the new merge commit

Not in my experience, no.

The goal is just to keep you aware of what problems you're likely to run into if you tried to push for your change to be merged in at the present moment. If these problems are due to work another team is doing the idea is then that you might begin conversations with that other team to discuss how best to sort these things out.

One of the easiest tools in this regard is just a “test” that blocks merging if the request is n commits behind master—just enforcing a rebase whenever you update a pull request. Combine with forbidding fast forward merges for pull requests (just let the code review tool generate a merge commit) and your `git log --graph` becomes much mor bounded in terms of how bad it can get.
You can have branches, but the ""rule"" is that no branch should live for more than, say, a day.

This trades the integration complexity and problems for some new.. challenges :)

Do people start working on something new past 3 PM or do they start discussing what to do the next day?
My team uses this pattern and generally speaking it looks something like this

* everybody is good at breaking down code into changes small enough review quickly but not small enough to become tedious and trivial

* people work on CRs whenever they have time. They do generally not post them after people have started going offline, and wait til next morning as a courtesy, since there is no difference between publishing for review at 5pm vs 9am the next day

One side effect of this workflow is that because the pieces are more manageable, less uninterrupted “flow” time is required to complete a bunch of small things than to make one really big change. And others digest the smaller changes easier and knowledge spreads more effectively. And with the time its easy to say “don’t make people review outside working hours.”

They are not new challenges, they are the same old challenge that led to version control procedures and systems in the first place: your incomplete or broken code is interfering with my attempts to complete or fix mine!
Whether the broken merge happens in a branch or as part of a rebase is just moving things around. The trick is to enforce good code cleanliness when pushing, whenever that may be.

My team does stuff on mainline and requires a new review on rebase.

Let alone pushing every couple of minutes.
So since I was the one who brought it up there's like three parts to this in my mind.

Main caveat is that “prod” is a handful of central places.

So how does code review fit in?

- The first thing is that your main branch needs to be able to tolerate merging incomplete code without breakage. This would probably be resolved with feature toggles, ideally named to match issue/ticket numbers, so that you have to clean it up before closing the ticket. Automatic tests can catch syntax issues but code review would check that your code was indeed either removing a feature toggle (in which case the team should have agreed that the code is mature and turned it on in production) or properly isolated behind a toggle.

- The second thing is that the team needs to be working as a team. This means that the sprint goals are co-owned by multiple people, ideally the whole team takes responsibility to deliver. It never made much sense to me that we segmented out features to individual developers, as if “oh she is out sick this week, well she wasn't working on anything urgent, the work can wait for her,” I understand that this makes performance review slightly easier but I am not convinced by that argument... So what this means for development is that the “war room” that you see for a typical prod outage with a bunch of people chasing down different leads, I want to bring that mentality (without the concomitant stress!) to feature development. So this means that ideally the whole team knows what everyone is working on and code reviews are just part of the organic process of working on a feature together. «Hey you are calling this “accept_states” and I called it “success_states,” I really don't like the fact that you're using “accept” as an imperative verb here, but I could go with “states_to_accept” or “acceptable_states” or we could use “success_states,” which would you rather?» ... You read through each other's code because it legitimately impacts what you are doing elsewhere in the codebase.

- The final ingredient would be release staging. The way a lot of people do branching makes it really easy to forget commits or to roll something back and not rollback the rollback, et cetera. Because we merge everything into main, we get monotonic flow: a commit that is still relevant is merged into main first because everything is, then it gets cherry-picked into the appropriate release branches. (The only exceptions are bugfixing code that is no longer part of the latest version.) The exact tooling to make this doable seems tractable, but perhaps not easy. Per point 1, before the feature branch is permanently flipped, all the code related to it is marked as such and can be merged into a patch release without actually triggering a semver violation (just leave the toggle turned off!)... So in theory the actual feature release is just a single commit. I’d have to see how the rest of this works before I would know exactly how this plays out.

Note that just about every Git-based CI/CD platform does CI/CD incorrectly, things like ACLs and CI/CD config are central statements about who has access to what privileges and should not be branched. A Jenkins which pulls its config from the `ci` or `main` branches will usually save you a lot of headache.

From my experience it only work in CM repos where you have a lot of very small changes and usually unrelated.
"I would have done it from this other approach". I've seen that, and it's not good when you get the feeling of "code review is when someone who hasn't thought about your problem tells you how you should have solved it". People sometimes feel they have to add value as a reviewer, and casually discarding other people's work is the way to do it. Fortunately it's not something I have to deal with at my current job.
Another way to frame it is "code review is not the right place to validate design decisions".

This type of comment is the hardest to make for me as I know it means redoing a large part of the work from scratch. It would generally be avoided by having quick design reviews before starting to code something difficult or involving sweeping changes. Just highlight how you are going to do it in a few sentences.

On the other hand, letting these kind of things go through usually means you will have to deal with the outcome later, and it will be more painful.

Maybe one way to decide about making this call or not is if you find the submitted version bad enough you are ready to implement the alternative yourself, immediately.

And it happens. With junior devs, or because people are new to the codebase.

> Another way to frame it is "code review is not the right place to validate design decisions".

That's a good way to put it. I wonder if draft PRs/MRs could be used for that? Make a draft PR/MR with the design, do a design review, then implement the code and finally the code review.

I’ve made comments that start like that and it’s usually down to

1) obvious code smell, here’s an example using your existing code refactored and the reasons why it is a better fit here

2) you’ve done something I didn’t think of and it’s clearly better than the way I was thinking of it. Here’s why it’s better. Kudos!

Helps that I’m lead/principal in a small biz with like 4 people max writing code. So I kind of know what everyone is working on / what they’re touching with their changes.

Mileage will definitely vary in bigger teams / businesses. 100% helps that my performance isn’t tied to number of PRs reviewed etc.

> While I have never been at a place that did this, I have in my head the idea that the code should be an unfolding collective conversation, kind of like when folks are all collaborating on a shared Google Doc, I see that you are editing this section and I throw in a quick comment “don't forget to add XYZ” and then jump to a different part that I won't be stepping on their toes with.

You just discovered pair programming.

It works astonishingly well. The second pair of eyes not only catches errors and envisions expanded use cases, it also prevents you from shirking off to HN. The biggest problem is you need two people, preferably sitting right next to each other with just one person “driving”.

With the right pair it is unbelievable how much productivity can be unleashed. I concur, it's much easier to stay focused on the task at hand with someone actively working on the same thing.

The reason why this didn't catch on is that it's almost like a Mick and Keith sort of relationship. You can't just take any two musicians and throw them together and get the Rolling Stones and the same thing applies to pair programming.

Take a look at Bell Labs during the birth of UNIX for an entire SWARM of interdependent engineers. There's more than just a small element of good fortune! It seems those people were almost meant for one another.

> You can't just take any two musicians and throw them together and get the Rolling Stones and the same thing applies to pair programming.

I don't see that effective teamwork should necessarily depend on compatible personalities. I think the better comparison would be to surgery teams, or pilot crews, rather than art.

It works well if the tasks are short and well-described. Long, complicated tasks descend into odysseys where one person zones out or gets completely lost as the other person just ends up treating them as a clumsy proxy for the IDE. Furthermore, it rarely works online.
That is my experience as well. Another case where the "clumsy proxy" issue happens is when I help a frontend dev debug something with the backend, especially online. Do collaborative IDEs exist? How usable are they? A collaborative terminal might be useful too (for when I want to run a few commands quickly to check something on my colleague's computer). Thinking about what I just wrote, maybe I could ssh into their machine?
People doubtlessly do it. I sometimes do it when I tutor somebody on a server we both access via SSH. If you use such a server as a jump server, you can setup SSH tunnels to get all the way to another desktop.

I think it would be wonderful to install a remote IDE with workspace on a remote server. Or to VNC/RDP into one.

I'm a principal so I rarely write code now. I probably do 6 PRs a year, basically when its directly related to my expertise and it makes more sense for me to do it. But when I do, I really try to guide the reviewer into my thought process. In your "centralize cache invalidation", I would have in the PR (and probably as comments in the code too) "I moved the cache invalidation to this function to centralizes the logic, it avoids problems X, Y, and Z which we've seen before".

I find that giving a good narrative gets the reviewer thinking like you or at least tells them why you chose one path and not another so that they don't waste time with a "why didn't you do it this way" question.

I keep trying to train my team do to that, but they're so focused on completing tickets they don't want to take the extra time to explain their "whys" and thought processes.

> Suddenly someone is like “I would have done it from this other approach” and you have no idea whether it's tractable

I find myself on the other side of this, all the time.

Suddenly someone comes up with some work that they have done without involving anyone else. I know this codebase, and I know moving that cache invalidation will make the codebase harder to understand, and also runs counter to the multi month effort you had to move all cache handling to the shared library which every other codebase in the product uses.

I try to be very careful about it, "have you considered using this other approach instead?" usually means unless you do this there will be the need for an even bigger rewrite in the future, but nobody can really take comfort in that.

The answer is that post facto code review is really unsuited for collaborative development. You start some work, you better involve other people straight away. Bounce ideas with a partner or two that really knows the codebase so you both understand what should be done and why. Unless you agree what it is your suggested change should achieve, there can be no useful code review!

When you are very a tiny team this mostly follows naturally, but always gets lost when the team grows and are split in several, and when individual developers starts to lose track of the codebase as a whole.

Those are good things to consider in review, but I maintain that the answer might be "no" to one or more of those questions and still be acceptable.

I'm old enough to have worked in the pre-code-review era. Things were fine. People still learned from each other, software could still be great or terrible, etc. It wasn't appreciably worse or better than things are today.

> An implicit question in several of the above is "will this set a good example for future contributions?"

Which in my experience can be an almost circular requirement. What do you consider a good example? As perfect as perfect can be? Rapid development? Extreme pragmatism?

The more experienced I get, the less I complain about in code review, especially when reviewing for a more junior dev, and especially for frequent comitters. People can only get so much out of any single code review, and any single commit can only do so much damage.

Put another way, code review is also about a level of trust. Will the committer be around next week? Are they on the same team as me? If yes, give them some leeway to commit incremental work and make improvements later. Not all incremental work need occur pre-commit. Mention areas for improvement, sure, but don't go overboard as a gatekeeper.

Things are obviously going to be different when reviewing code from what amounts to a stranger on a mission critical piece of code, etc.

> Put another way, code review is also about a level of trust. Will the committer be around next week? Are they on the same team as me? If yes, give them some leeway to commit incremental work and make improvements later. Not all incremental work need occur pre-commit. Mention areas for improvement, sure, but don't go overboard as a gatekeeper.

I think this is very important, especially the part about incremental improvement. too many see development as laying concrete where it has to be perfect rather than as an ongoing process.

and personally the only thing I find PR's good for is ensuring jackasses aren't doing stupid shit. And by stupid shit here I mean things like using floats for currency (I caught that w/i the last year), things of that nature.

But my preference is to work with people I can trust and at that point I don't give a crap about a PR or a code review.

As a counterpoint, individual code ownership can be a fantastic model too. It lets engineers specialize and takes advantage of social systems that form naturally anyways. I’ve not personally seen group ownership work well, and in practice, it’s still an individual who knows a given area the best.
> reducing individual code ownership

I am now working in an organization that is set up to reduce code ownership, and they struggle to attract talent, although pay is good and work is fulfilling.

How do they do reduce individual code ownership? Horizontal integration. Developers code, analysts design DB structures (at least nominally), project managers set up meetings. Different silos exist for CICD, cloud roles, core teams.

There are vetting committees everywhere that have the last say on the libraries used and the nitty-gritty details of REST APIs and naming.

It exhilarating to start projects, then see them degrade inevitably into corporate monstrosities.

> an organization that is set up to reduce code ownership, and they struggle to attract talent

These might not be related though?

> work is fulfilling

> It exhilarating to start projects, then see them degrade inevitably into corporate monstrosities.

What you describe does not sound pleasant. So it's not fulfilling after all?

Projects are fulfilling because they have public utility.

The no individual code ownership policy is hard to bear for inquisitive minds, though.

Thus the talent shortage.

You assume projects are fulfilling because they have "public utility" (whatever the fuck that means). That's your assumption, not objective fact.

I would not work on boring ass project with cripping management problems and think that's fulfilling. I assume you have recruiting problems because most people share that sentiment.

People like to work on interesting stuff without much office politics, that is "fulfilling" to them, even if outcome is app that would be "boring" to the outsider.

> That's your assumption, not objective fact.

I meant working for emeregency services, social security, global medical dossiers, etc This is objectively public utility.

I feel the post you are replying to is really advocating for shared knowledge, which is entirely compatible with the positive aspects of ownership. Personally, I have seen nothing but trouble from people whose distorted concept of ownership opposes sharing knowledge of "their" stuff.
I am ok with code "ownership" but if someone other than me can just outright reject pull merge requests without explaining why then I should not have to read this code or make changes. If you own it, you fix it. What needs to fix? Figure it out. Don't ask me. You are the owner.
Well put. I'd like to add that individual code ownership must not necessary be lived as a loner sitting on that code. Being an owner means hatching, keeping clean and also knowing when getting help is beneficial. Accepting your bias and asking for a look from a different perspective is not getting rid of or dispersing ownership in my view, it is a central part of it.

Often this includes getting feedback if that change can make problems outside the feature implemented or in the future, like introducing dependencies that cost more in total than they help locally.

This is an ideal (if not utopian) vision of software review (though please let's handle style checking automatically already!) It implicitly requires, however, such a thorough examination of the code that diffs would seem to be an irrelevant distraction.

In practice, however, starting from the changes in a system that was previously working well enough is a very effective way of focusing limited human attention on where the problems are likely to be, optimizing for error detection with limited resources over the broader knowledge-dissemination goals espoused in this approach. If we are going to use diffs, then this brings us back around to the topic if the article.

Personally, I find having any form of diff embedded in what I am trying to understand just makes it harder to follow, so I move the diffs onto a secondary screen and use it as a guide and reference to what has been changed. The author of the article seems to want the same, but the mock-up is somewhat misleading, as it only has substitutions and additions. Where something is removed or refactored to another place, the two views will no longer line up as depicted here.

Code style check should be automated. I do not accept discussions on style in PR in my team besides pointing out obvious deviations which should be automated. Discussion on that should be way before making PR.

Architectural changes/discussion should be discussed by developers way before PR on slack or in a call. Most features should not change architecture and team should make effort to align architecture all the time at least before someone makes PR. Unless of course PR is PoC to showcase approach.

As I have said - every team is different :)

In response to your first point I guess things really depend on where you store PR metadata and how ephemeral/permanent it is. Some teams store that information in the ticket, some fill implementation notes with change request, some add that to the PR, some discuss in their standup (or similar) meeting.

Regarding 2-3, you are right, I just lumped them all under umbrella term.

Maybe we could use terms like "PR review" and "code review". The former is a shallow LGTM check, while the latter may involve code checkout, poking around, architectural discussions, pair/team programming and so on. In my book they are entirely different beasts and web tools are geared (not without justification) towards the former, where both types of diff should serve the purpose.

To me this happens naturally, Looking at the diff first, if I know the codebase well might be enough to get a sense of the changes, or at least understand their isolation. However, as soon as I start to feel that I don't fully understand the implications, I quickly check the branch out and poke in an editor.

OP is right though, if the "check out in editor" workflow was much smoother (than quick web view) I would prefer to always do that

This is why you should insist on a changes being reviewed by at least two pairs of eyes. Someone responsible for that area of the code might raise those kinds of objections. Someone who is more of an outsider can focus on other problems.

E.g. I'm kind of a human compiler; I can say things like, you have undefined behavior here, in a completely unfamiliar code base where the maintainer of that might miss the issue, too busy pointing out that it doesn't respect architectural conventions, and is not tested.