Hacker News new | ask | show | jobs
by NotBoolean 546 days ago
I don’t have your experience but I personally think some of this feedback can be warranted.

> Can't refactor code because it changes too many files and too many lines.

This really depends on the change. If you are just doing a mass rename like updating a function signature, fair enough but if you changing a lot of code it’s very hard to review it. Lots of cognitive load on the reviewer who might not have the same understanding of codebase as you.

> Can't commit large chunks of well tested code that 'Does feature X', because... too many files and too many lines.

Same as the above, reviewing is hard and more code means people get lazy and bored. Just because the code is tested doesn’t mean it’s correct, just means it passes tests.

> Have to split everything down into a long sequence of consecutive pull requests that become a process nightmare in its own right

This is planning issue, if you correctly size tickets you aren’t going to end up in messy situations as often.

> The documentation comments gets nitpicked to death with mostly useless comments about not having periods at the ends of lines

Having correctly written documentation is important. It can live a long time and if you don’t keep an eye on it can becomes a mess. Ideally you should review it before you submitting it to avoid these issues.

> End up having to explain every little detail throughout the function as if I'm trying to produce a lecture, things like `/* loop until not valid */ while (!valid) {...` seemed to be what they wanted, but to me it made no sense what so ever to even have that comment

I definitely agree with this one. Superfluous comments are a waste of time.

Obviously this is just my option and you can take things too far but I do think that making code reviewable (by making it small) goes a long way. No one wants to review 1000s lines of code at once. It’s too much to process and people will do a worse job.

Happy to hear your thoughts.

4 comments

> This is planning issue, if you correctly size tickets you aren’t going to end up in messy situations as often.

No, it’s “this refactor looks very different to the original code because the original code thought it was doing two different things and it’s only by stepping through it with real customer data that you realized with the right inputs (not documented) it could do a third thing (not documented) that had very important “side effects” and was a no-op in the original code flow. Yea, it touches a lot of files. Ok, yea, I can break it up step by step, and wait a few days between approval for each of them so that you never have to actually understand what just happened”.

so, it's not just a refactoring then; it's also bug fixes + refactoring. In my experience, those are the worst PRs to review. Either just fix the bugs, or just refactor it. Don't do both because now I have to spend more time checking the bugs you claim to fix AND your refactoring for new bugs.
There are certainly classes of bugs for which refactoring is the path of lowest resistance
The most common IME are bugs that come from some wrong conceptual understanding underpinning the code. Rewriting the code with a correct conceptual understanding automatically fixes the bugs.
The classic example of this is concurrency errors or data corruption related to multiple non-atomic writes.
And there are multi-PR processes that can be followed to most successfully convert those changes in a comprehensible way.

It'll often include extra scaffolding and / or extra classes and then renaming those classes to match the old classes' name after you're done, to reduce future cognitive load.

I'm unconvinced that adding extra code churn in order to split up a refactor that fixes bugs into a bugfix and a refactor is worthwhile
One metric I like to give my team is to have any new PR start a review in less than 15 minutes and be completed within 15 minutes. So, the longest you should wait is about 30 minutes for a review. That means teams either go "fuck it" and rubber stamp massive PRs -- which is a whole different issue -- or they take it seriously and keep PRs small to get their PRs reviewed in less than 30 minutes.

In most cases where I see responses like this, they're not surprised to wait hours or days for a PR review. In that case, it makes sense to go big, otherwise you'll never get anything done. If you only have to wait half an hour, max, for a PR review; the extra code churn is 1000% worth it.

No, I very deliberately did not describe any bug fixes.
> only by stepping through it with real customer data that you realized with the right inputs (not documented) it could do a third thing (not documented) that had very important “side effects” and was a no-op in the original code flow

sounds like the 'nightmare' was already there, not in the refactor. First step should be some tests to confirm the undocumented behaviour.

Some of your complaints seem to be about peer review ('approval'). I found my work life improved a lot once I embraced async review as a feature, not a bug.

As for 'break it up step by step' - I know how much I appreciate reviewing a feature that is well presented in this way, and so I've got good at rearranging my work (when necessary) to facilitate smooth reviews.

> sounds like the 'nightmare' was already there, not in the refactor

I admit that I am pretty allergic to people who avoid working with imperfect code.

The way I normally approach this is one big pr for context and then break it into lots of small ones for review.
I've found processes like this to work better, too. Basically, the one big pr is like building a prototype to throw away. And the benefit is it has to get thrown away because the PR will never pass review.
A PR with self-contained smaller commits would be possible as well.
Yes, though it does depend on how good the commenting system is; and, for something like that, you're still probably going to want a meeting to walk people through such a huge change.

And you'd better hope you're not squashing that monstrous thing when you're done.

I do object to the notion of something being a planning issue when you're talking about a days worth of work.

Implement X, needs Y and Z, ok that was straightforward, also discovered U and V on the way and sorted that out, here's a pull request that neatly wraps it up.

Which subsequently gets turned into a multi-week process, going back & forth almost every day, meaning I can't move on to the next thing, meanwhile I'm looking at the cumulative hourly wages of everybody involved and the cost is... shocking.

Death by process IHMO.

> Implement X, needs Y and Z, ok that was straightforward, also discovered U and V on the way and sorted that out, here's a pull request that neatly wraps it up

This sounds very difficult to review to be honest. At a minimum unrelated changes should be in their own pull request (U and V in your example).

I work as a tech lead, so I get a lot of leeway in setting process. For small PRs, we use the normal “leave comments, resolve comments” approach. For large PRs, we schedule 30m meetings, where the submitter can explain the changes and answer questions, and record any feedback. This ensures everyone is on the same page with the changes, gives folks a chance to rapidly gather feedback, and helps familiarize devs who do not work in that area with what is going on. If the meeting is insufficient to feel like everyone is on the same page and approves the changes, we schedule another one.

These are some of the best meetings we have. They are targeted, educational, and ensure we don’t have long delays waiting for code to go in. Instead of requiring every PR to be small, which has a high cost, I recommend doing this for large/complex projects.

One additional thing to note on small PRs: often, they require significant context, which could take hours or even days, to be built up repeatedly. Contrast that with being able to establish context, and then solve several large problems all at once. The latter is more efficient, so if it can be enabled without negative side effects, it is really valuable.

I want my team to be productive, and I want to empower them to improve the codebase whenever they see an opportunity, even if it is not related to their immediate task.

One minor piece of insight from me is about release management vs pull-requests.

As you say it's much easier to schedule a 30 minute meeting, then we can - with context - resolve any immediate nitpicks you have, but we can also structure bigger things.

'Would this block a release?'

'Can we just get this done in the PR and merge it'

'Ok, so when it's done... what is the most important thing that we need to document?'

Where the fact that even after it's merged, it's going to sit in the repo for a while until we decide to hit the 'release' button', this lets people defer stuff to work on next and defines a clear line of 'good enough'

How do you rework a core process, then? If you rework a major unit that touches just about everything... Sharding something like that can break the actual improvement it is trying to deliver.

Like... Increase the performance of a central VM. You'll touch every part of the code, but probably also build a new compiler analysis system. The system is seperate to existing code, but useless without the core changes. Seperating the two can ruin the optimisation meant to be delivered, because the context is no longer front and center. Allowing more quibling to degrade the changes.

Agree. Another item here that is contextual: what is the cost of a bug? Does it cost millions, do we find that out immediately, or does it take months? Or does it not really matter, and when we’ll find the big it will be cheap? The OP joining a new company might not have the context that existing employees have about why we’re being cautious/clear about what we’re changing as opposed to smuggling in refactors in the same PR as a feature change.

I’m going to be the guy that is asking for a refactor to be in a separate commit/PR from the feature and clearly marked.

It doesn’t justify everything else he mentioned (especially the comments piece) but once you get used to this it doesn’t need to extend timelines.

Yes, wrapping other discoveries into your feature work is a planning issue that might impact on the review burden.
> This is planning issue, if you correctly size tickets you aren’t going to end up in messy situations as often.

I think the underlying issue is what is an appropriate “unit of work”. Parent commenter may want to ship a complete/entire feature in one MR. Ticketing obsessed people will have some other metric. Merge process may be broken in this aspect. I would rather explain to reviewer to bring them up to speed on the changes to make their cognitive load easier

This. The solution to long and multiple reviews to MR is single pair review session where most of the big picture aspects can be addressed immediately and verbally discussed and challenged.

IMHO it is the same as chat. If talking about an issue over mail or chat takes more than 3-5 messages, trigger a call to solve it face to face.

code reviews that are too small, i think are worse than ones that are too big, and let through more bugs.

10 different reviewers can each look at a 100 lin change out of the 1000 line total change, but each miss how the changes work together.

theyre all lying by approving, since they dont have the right context to approve