Hacker News new | ask | show | jobs
by golergka 1121 days ago
That's literally 300 something lines. I'm buffled, 5k PRs aren't that rare at work.
7 comments

You genuinely review 5k LOC PRs?

If I was doing a proper review of a PR that big and making sure you understand how everything works that would easily take multiple days and generate 10s-100s of comments.

In reality I would flat out refuse to review it. Even 1k is very large without being broken down.

The only time I see PRs that big at work are cleanups (E.g. deleting whole directories), automated linting changes across the whole codebase and large structural refactors (E.g. changing directory structure).

Almost every time I’ve reviewed a 1k+ LOC PR, even if it’s from a really experienced and good engineer, it has introduced a bug. Obviously I’m not gonna say it at work, but changes that big are too hard to properly review and consider all side effects and gotchas
It's also more difficult to bisect to find the actual bug. Small commits have a lot of advantages.
What is being changed matters.

To use an analogy: if you wanted to reupholster my car seats, or spice up the radio panel, sure knock yourself out - I'll come check when it's done. But if you were to as much as think about tightening or loosening a single screw anywhere near the engine block, believe me I will be paying very close attention to what you're changing.

That would be a reason to never accept the PR; not to auto-accept it.
I’d hate to work there. I’d rather review small chunks and merge often than review 5k lines that could be built on a shaky foundation/idea
Best team I've ever worked with. 5k was on the larger side, sure, but technically difficult tasks often just can't be split into a series of smaller pull requests.
I am always amazed that many people with significant experience are so resistant to this idea. For what it's worth, my experience matches yours entirely: many significant changes can't be meaningfully committed in small chunks, they only make sense as an all in one.

And even more so, I've often seen the opposite problem: people committing small chunks of a big feature that individually look good, but end up being a huge mess when the whole feature is available. I hate seeing PRs that add a field or method here and there (backwards compatible!) without actually using them for now, only to later find out that they've dispersed state for what should have been one operation over 5 different objects or something.

What changes can't be committed in <5k LOC? That's a shit ton of code. If you can't break that down into smaller shippable chunks there's probably something wrong, or you're building something extraordinarily complex.

It's definitely overall quicker to ship like this, but there are tradeoffs. You are effectively working independently from the rest of your team, there is no context sharing and everything is delivered at once after a longer period of time.

Here's a personal example from work:

We're performing atomistic simulations. The first edition of the code stored each atom on the heap and had a vector full of pointers to the individual atoms. Obviously this would obliterate the cache, so I crafted a PR to simply store all the atoms in a single vector. On its own, that was a one line change, but it was also a very fundamental change to the type system. Everything as simple as

    Atom* linker = atoms[index];
    linker->x += 1.57;
Suddenly had to be

    Atom& linker = atoms[index];
    linker.x += 1.57;
If I didn't make those corresponding changes, the code wouldn't type check and the build would fail. I think the final PR came out to about 17 kLOC.
The OP is talking about feature work, as am I.

Obviously if you make a change to something like your type system it's going to generate a very large diff, but you also aren't going to review the full diff.

You're just going to make the change with find+replace or some other automation then write in the PR description "I made this change to the type system". No one is actually reviewing 17k LOC.

This isn't what the op is talking about. They're talking about a net new feature that would span 5k lines. Your change is trivial compared to it, and frankly would earn approvals immediately without much thought (assuming the changes were already planned and talked about)
I've experienced this changing the API for a primary memory allocator in a frequently updated code-base. Each location updated was perilous and it needed to be changed in bulk to avoid an endless war of attrition.
We have a small feature that was put together in a rush. It "works", but only works correctly for some simple cases, otherwise there are bugs everywhere. There were very few tests. We must redo the while thing, update the UX and add tests. Tell me how we can achieve that without replacing all existing code and add new tests that cover every use case in the same change.
Why do you have to do it all at once? You can't improve the codebase piece by piece?

Shipping all the changes in a monster PR is usually not a good option. One big reason why is that you do not create any value until the whole thing ships. If you ship piece by piece you can create small amounts of value every time you ship.

Also, if it's a "small feature", why is it 5k+ LOC?

Upgrading versions of a framework with breaking changes is one. I did a 12k PR a few weeks ago that touched almost every file of the app.

It was an all or nothing thing, as almost every third-party dependency we used had to be swapped by something else.

At least we have amazing test coverage, so it was easy to find bugs.

But there wasn't much I could say other than "trust me".

I think if a review is very large, the owner of the code should do a code walkthrough for the whole team.
>many significant changes can't be meaningfully committed in small chunks

They almost always can. The exceptions are stuff like autogenerated code or updating a dependency.

We as a team sometimes decided to PR into a PR branch, not as meticulous as a PR to develop but there'd still be eyes on the code entering the branch. Especially useful when there are dependencies and/or different disciplines contributing to the feature.
As if you are reviewing 5k lines, though. It's either 90% whitespace changes that you completely skim over (probably missing the one bit that actually changed) or you just skim over it looking for a few patterns you don't like such as using a loop instead of Array.map or something.
When they say 5k lines, I'm assuming they mean 5k lines of meaningful code. 5k lines with whitespace changes, linting etc. would be impossible to meaningfully review
This is so true.
> at work

Level of trust with colleagues will hopefully be higher! And individual ownership of and responsibility for the code probably lower.

Which is the philosophy behind Gerrit as opposed to Github.
Well, I feel like "at work" is a keyword here.
What language/stack?
Typescript, React, Monaco, Treesitter, etc.