Hacker News new | ask | show | jobs
by tigershark 3647 days ago
I completely disagree, every method can be split in private methods. In that way you don't need awful and unhelpful comments in the middle because you can understand what it does simply from the method name.
5 comments

I've gone back and forth on this one over the years. My current advice would be that if you can find something that is naturally a sub-function, factor it out as one. Keep it private initially, but do not do this if that private function makes absolutely no sense on its own and your public code isn't calling it from more than one site.

If you factor things out into sub-functions that have no semantic meaning on their own then all you are doing is making the code harder to understand. You're also making it harder to maintain because of all the extra state that will need to be passed around between the sub-functions, which may change later.

For large complex functions that cannot be broken up sensibly, the paragraph splitting method is exactly what I use. Such large functions are a code smell and you should think carefully about whether it really does need to be so large, but there are indeed cases where it is the best option. Nobody should get too attached to dogmatic rules like "no function may exceed 25 lines".

The style that I prefer is slightly different to the GP though. I usually put about a paragraph of comments at the top explaining why the function is so long, giving an overview of the algorithm and other information that is inappropriate for a JavaDoc-style comment (since those are for the function consumer rather than a maintainer). I then give each "paragraph" of code a section heading and sometimes number these (especially if I have written out the algorithm in line-numbered pseudocode in the explanatory comment).

Totally agree. While I respect and try to adhere to the organization and naming philosophies of the "Clean Code" approach, I've also been bitten later when going over something that I hadn't worked on in months. I concluded that my brain has a state as well as the program, and comments are for reloading my brain state. Only when my brain is in the correct state do my names make complete sense. And still, unless I had ample time to make sure everything is perfectly coherent (how many times does that happen?) I'm still sometimes left wondering.

A comment that clearly explains "why," and sometimes even "how," is extremely helpful no matter how Hemmingway I think I am in the moment. And I also find chasing little factored-out bits of functions all over the place is tedious- like reading a paragraph of prose where a lot of the true meaning is delegated to footnotes. You're either skipping it, or constantly stopping and looking for the damn footnote. In code, sometimes that means grepping the project to find the damn thing. If it's small and used once, consider leaving it in.

Every method can be split sensibly. Using the functional paradigm it becomes natural to understand it because you usually work in the opposite way using composition of functions rather than dictating what happens in an imperative way. And large, complex functions do increase the cognitive load. I personally abhor regions or sections because most of the time they can go in a separate method and they just break the code-flow with something completely unrelated.
Methods can be split into private methods. But what that sometimes means is, when you're trying to understand how a piece of information flows through a method to fix a bug, instead of it being all self-contained in the method, instead you have to jump around to 10 places in a file to see where the information goes. That severely increases the cognitive overhead.
There is also a cognitive overhead when trying to find local variables defined in huge functions, at least when they are split into smaller private (static) methods (with variables passed as arguments) it's easier to track.

Both in the parent method and in the encapsulated child.

> you can understand what it does simply from the method name

That can be tough sometimes. How do you handle the case where you've created a function just to package some block of code that would otherwise be repeated 40 times? You end up with function names like

    add_to_list_when_cromulent() 
or even worse

    rebuild_stats_helper()
Or you have the situation where every time you do action A, it usually needs to be followed with action B. Because you want functions to do one thing only you have two functions action_A() and action_B(). But since you are always going to do them in pairs, you end up with a group action_A_and_B() that just calls the two functions sequentially.

I think I've settled on helper functions that are static or in anonymous namespaces (I work in C++) whenever possible.

The first case is a self-explicative one liner, it doesn't need an external function.

    var evenNumbers = Enumerable.Range(0, 40).Where(i => i%2 == 0).Select(i => i);
Private methods must group non-elementary steps. The second case is a perfect example that explains why it is so much better to use a private method because now you have a method that groups both sub-methods and you don't need to call always two different methods in a bunch of places. In this way you decreased code duplication increasing clarity.

    private void ProcessLogs()
    {
        var events = _logProvider.Read();
        events.Where(e => e.IsError).ForEach(SendErrorAlert);
        events.Where(e => e.IsWarning).ForEach(SendWarningAlert);
    }
And the method name seems quite self-explanatory to me.

If you need to separate one method in paragraphs, and add comments that explain what each paragraph does then that code is SCREAMING for a refactoring, deleting all the useless comments and extracting that mess in properly structured Classes/Methods.

I see this all the time in our PRs:

   UtilHelper.processItems(List<object> items) : List<object>
Here's John Carmack's take on the issue:

http://number-none.com/blow/john_carmack_on_inlined_code.htm...

TL;DR: He's in favor of inlining functions.

Actually, if you are inlining, he says: "you should be made constantly aware of the full horror of what you are doing."

I wouldn't say he's in favor of it, he actually appears to be advocating for a pure FP approach. But if you have to, inlining is OK, with the quoted caveat.

That's literally the first paragraph in the article, keep reading. Also, I think you misunderstood it.

> if you are going to make a lot of state changes, having them all happen inline does have advantages; you should be made constantly aware of the full horror of what you are doing.

The horror he refers to is not inlining, it's dealing with stateful logic. If you are doing state it's better to be aware of the horror making it explicit by inlining rather than hiding the state changes via passing and receiving in functions.

I'm pretty sure he's advocating for inlining one-off functions.

I agree and disagree with you :). Yes, I agree, he's advocating in-lining one-off functions, and I can see both points of view from that.

Having said that, how often do you have a large method where it isn't modifying a lot of state? I guess it depends on the context like a lot of things, but it seems like that's what is happening in the code I deal with (large methods changing state as well as having a lot of dependencies).

But the "article" isn't an article exactly, it's a commentary of an email that's 7 years old. The original email seems to completely advocate in-lining; the "article" part backtracks somewhat, and that's what my comment was around. I'm not always the clearest in my writing though.

sorry, but given this pre-condition (from gp):

> ... a large function cannot be broken up usefully, because a lot of state needs to be shared between the different parts ...

there is no way to break that up with multiple smaller functions without stowing away the state somewhere.

sometimes reading a large function is not half as bad as reading 10 different ones with each altering the shared state.

state can be passed through as arguments...
When coding in C, I often package the state that's shared between a high-level function and its local subroutines in a local "struct context" that's instantiated in the high-level function and then passed by address as the first argument to the subroutines. Makes it easy to see what the shared state is, and adding/changing the shared state doesn't require changing all the formal and actual argument lists.
Me too. OOP is actually having several such states. My go to pattern is to root states in an application tree.
Sure, but then there are more questions :) e.g. how many parameters ? 3, 4 ... ? what if they are of the same type ? would you change your numbers then ? users can get the order wrong etc. another thing : if you pass too many parameters, isn't that a hint to the fact that something is amiss ?

edit-1 : fixed typo

Remember this is in the case of spiting out smaller functions to improve readability.

Variable naming will only add to that, as you can rename variables in the extracted methods for their local purpose.

If you have too many arguments you can package them up in a object/map/tuple of your choosing (depending on language).

(Reverting to local state would in my opinion increase confusion and decrease readability.)

State can be stored in instance variables as well, and should be if many small functions share them, it's what objects are for.
> State can be stored in instance variables as well...

but then we are back to the beginning of the thread...

No, we aren't, since objects are the proper way in this case to reduce cognitive load. This notion that all the code has to be in one method to be understood is symptomatic of programmers who don't know how to factor correctly thus leading to the necessity of having to read a methods implementation to understand what it does, i.e. they're bad programmers blaming their shortcomings on well factored code rather than learning how to read and write well factored code correctly. Breaking down large methods into numerous small ones makes the code easier to read and understand, not harder, unless you do it entirely wrong.