Hacker News new | ask | show | jobs
by NickM 859 days ago
Okay, let's try for a more concrete example. Suppose you're writing a new feature, and the bulk of your PR is a one-off 1,000-line function that does ten different things in sequence.

If you can factor out that big long function into ten smaller functions that are each useful in isolation, then your code is probably going to be a lot easier to understand, regardless of how your structure your commit history. Future code reuse is also easier. Testing each function in isolation may be simpler as well.

Once you do all of that, putting each of the standalone functions in its own commit is trivial, and each commit can have a more focused commit message. Reviews are easier too, since you can easily view each function and its associated docs/tests/etc separately.

Local optimization on these sorts of metrics are pretty much always going to lead to perverse outcomes.

Again, this is a straw man. Nobody is saying to blindly optimize to the metrics, or that this is a black-and-white thing where every commit needs to fit within a hard line limit. But, all else being equal, I've personally found that my code quality improves when I try to break things into smaller independent changes. I also find it much easier to review PRs where other people do that.

2 comments

Much more common in my experience is a 10 line function that relies on 30 lines of new SQL schema, both of which are used in an 80 line report.

Context matters. Reviewing any one of the 3 without simultaneously looking at the other 2 is folly.

I'm not arguing that big PRs are better. I'm arguing that being smaller also doesn't make it better. Neither has any real meaning without context.

Take your same 1000 line change example, but make that change span multiple areas of the code that are related, but not immediately apparent. If you blindly split that up, you can very easily let bugs slip in even if in isolation each change looked fine.

My whole point is: there is no "ideal" size. It depends. It always depends. We should never substitute thinking with mere rules of thumb and reductionist claims.

On another point, notice you said: commits, not pull requests.

That's an important distinction. Pull requests have a whole feedback loop on top of them, and serve exactly as a way to tie multiple smaller changes required to achieve one meaningful change. That means most of the time we can get all the benefits of small code changes and the benefits of the full context by having a slightly larger PR and separating changes by commits.

> Again, this is a straw man. Nobody is saying to blindly optimize to the metrics

Maybe. But this is an advertisement blog post, made for a company that targets managers that will say "let's buy this and it'll make our developers faster. It gives me a pretty graph to say who's good".

That's why some of it might look like a straw man at first sight, but if you take in the context in which this blog post was written, it's a valid argument in my view.

It's way too easy to blur real code concerns (splitting functions, naming, separation of concerns, etc) with silly meaningless numbers (PR size, line counts, PR numbers) that managers use as a proxy to actually knowing the subject matter.