by scotty79 1242 days ago
When there are a few they can be really great. But if you need to accurately name every single intermediate thing they can become visual noise that hides what happens.

I struggle to think of real-world examples where I've just needed to chain and chain and chain values of different types more than a handful of times. The claimed need for the pipe operator is this construction:

    function bakeCake() {
      return separateFromPan(coolOff(bake(pour(mix(gatherIngredients(), bowl), pan), 350, 45), 30));
    }
The piped code looks like:

    function bakeCake() {
      return gatherIngredients()
        |> mix(%, bowl)
        |> pour(%, pan)
        |> bake(%, 350, 45)
        |> coolOff(%)
        |> separateFromPan(%);
    }
Which is... fine? It certainly looks better than the mess we started with, but adding names here only helps clarify each step.

    function bakeCake() {
      const ingredients = gatherIngredients();
      const batter = mix(ingredients, bowl);
      const batterInPan = pour(batter, pan);
      const bakedCake = bake(batterInPan, 350, 45);
      const cooledCake = coolOff(bakedCake);
      return separateFromPan(cooledCake);
    }
Even if you consider the `const` to be visual noise, the names are useful. At any point you can understand the goal of the code on the right-hand side by looking at the name of the variable on the left-hand side. You can also visually scan the right-hand side and see the processing steps. You can also introduce new steps to the control flow at any point and understand what the data should look like both before and after your new step.
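
To make the point about inserting steps concrete, here is a minimal runnable sketch. The stub bodies, the `bowl` and `pan` values, and the `checkDoneness` step are all made up for illustration; only the function names come from the example above.

```javascript
// Hypothetical stubs: each step just tags the value so the flow is visible.
const bowl = "bowl";
const pan = "pan";
const gatherIngredients = () => ["flour", "eggs", "sugar"];
const mix = (ingredients, container) => ({ stage: "batter", ingredients, container });
const pour = (batter, container) => ({ ...batter, stage: "batterInPan", container });
const bake = (batterInPan, temperature, minutes) =>
  ({ ...batterInPan, stage: "baked", temperature, minutes });
const coolOff = (bakedCake) => ({ ...bakedCake, stage: "cooled" });
const separateFromPan = (cooledCake) =>
  ({ ...cooledCake, stage: "finished", container: null });

// Hypothetical new step: the names on either side say what it receives and returns.
function checkDoneness(bakedCake) {
  if (bakedCake.minutes < 30) throw new Error("underbaked");
  return bakedCake;
}

function bakeCake() {
  const ingredients = gatherIngredients();
  const batter = mix(ingredients, bowl);
  const batterInPan = pour(batter, pan);
  const bakedCake = bake(batterInPan, 350, 45);
  const checkedCake = checkDoneness(bakedCake); // the inserted step
  const cooledCake = coolOff(checkedCake);
  return separateFromPan(cooledCake);
}
```

Adding `checkDoneness` touched exactly two lines, and a breakpoint or `console.log` on any intermediate name works the same way.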

I agree that the control flow is more clearly elucidated in the pipe operator example, but it tosses away useful information about the state that the named variables contain. It also introduces two new syntactical concepts for your brain to interpret (the pipe operator and the value placeholder). I contend the cognitive load is no greater in the example with names, and the maintainability is greatly improved.

If you have an example where there are dozens of steps to the control flow with no break, I'd be really curious to see it.

Imagine that you asked someone the question "How do you make a cake?" Which response would be clearer?

1. Gather the ingredients, mix them in a bowl, pour into a pan, bake at 350 degrees for 45 minutes, let it cool off and then separate it from the pan.

2. Get ingredients by gathering the ingredients. Make batter by mixing the ingredients. Make batter in a pan by pouring the batter in a pan. Make a baked cake by baking the batter in the pan at 350 degrees for 45 minutes. Make a cooled cake by cooling the baked cake. Separate it from the pan.

For me personally #1 is more readable because #2 is unnecessarily bloated with redundantly described subjects.

Right, it works for your analogy.

Going back to the concrete scenario GP presented, naming things makes it much clearer to me.

In fact, I'm so fanatical about naming things, I'd probably give the two magic numbers and the return value names as well:

    function bakeCake() {
      const bakeTemperature = 350;
      const bakeTime = 45;  // minutes
      // ... 
      const bakedCake = bake(batterInPan, bakeTemperature, bakeTime);
      // ...
      const finishedCake = separateFromPan(cooledCake);
      return finishedCake;
    }
And I'd not look at a code review which quibbled about the particular names I chose as being a waste of time either. Time spent naming things well is the opposite of technical debt: it's technical investment. It pays dividends down the road. It increases velocity. It makes refactoring easier. It improves debuggability. It makes unit tests easier to see.
Should make it an async function, and await the bake step. ;-)
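
Taking the joke half seriously, here is a sketch of what that might look like, assuming `bake` returns a promise. The timer-based stub is hypothetical and scales minutes down to milliseconds.

```javascript
// Hypothetical async stub: resolves after `minutes` milliseconds, not real minutes.
function bake(batterInPan, temperature, minutes) {
  return new Promise((resolve) =>
    setTimeout(
      () => resolve({ ...batterInPan, baked: true, temperature, minutes }),
      minutes
    )
  );
}

async function bakeCake(batterInPan) {
  const bakeTemperature = 350;
  const bakeTime = 45; // minutes
  // The named intermediate works the same way; only the `await` is new.
  const bakedCake = await bake(batterInPan, bakeTemperature, bakeTime);
  return bakedCake;
}
```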
Sometimes intermediate values either don't have domain specific meanings or the meaning is obvious from the function name that returns this temporary value.

Then naming it is just noise.

If your bake() function were instead named createBakedCake(), then naming the returned value bakedCake just increases reader fatigue through repetition.

In the same way,

    Random random = new Random();

in C# is worse than

    var random = new Random();

> Sometimes intermediate values either don't have specific meanings or the meaning is obvious from the function name that returns this temporary value.

I don't necessarily disagree with this. But even granting that this is true: congrats, you've just found the worst part of giving these intermediate steps a name! Like, that's the worst case example of the cost side of the tradeoff we're discussing here. And it's not that big a cost! Like, of all the code you write, how much of it fits this case? Where you're writing a function where there's a lot of sequential processing steps in a row with no other logic between the steps AND the intermediate state doesn't have any particular meaning?

In that worst case, you have a little extra information (like your `Random random = new Random()` example) that your eyes need to glide past.

I would wager your brain is more used to scanning your eyes past unnecessary information and can do that with less effort and attention than it can either:

    - bounce back and forth between the chained function calls of the original nested example.
    - synthesize the type and expectations of the intermediate value at any arbitrary point in the piped call chain.
That last thing is the big cost of not naming things. In order to figure out what the value should look like at step 4, you have to work backwards through steps 1-3 again. And you have to do that any time you are debugging, refactoring, unit testing, adding new steps, removing existing steps, etc.

And the work to come up with "obvious" names isn't hard. Start with the easy name:

    batterInPan = pour(batter, pan)
And if the name batterInPan never gets any better and never really helps anyone read or debug or refactor or unit test this code, then in that sense, I guess it's a "waste". I just claim that this case is far less common in the real world and far less costly than having to untangle a mess of unnamed nested or chained call values.

Or maybe you want to just start with the unnamed nested or chained calls, and when you need to read or debug or refactor or test your code you pay the "naming things" price tag at that point. That's often the first thing I do when I come across code with a dearth of names, I just give everything a boring, uncreative temporary name, and then I can do whatever work I showed up to this code to do. It's not ideal, but it's better than every JS library sprinkling a new bit of syntax in just so they can avoid giving their variables names and can use an overloaded modulo operator instead.

> But even granting that this is true: congrats, you've just found the worst part of giving these intermediate steps a name!

Yes. But people would usually burn you at the stake for naming a function bake(), because that name says nothing about what the function expects or returns and only the bare minimum about what it does. Function names are part of the API, so naming them informatively matters a lot, and that makes this scenario very common in practice.

If you really have functions like bake() or pour() in your code, especially in a weakly typed language, then for the love of God, yes, please always name the variables you pass to them and get back from them, and as verbosely as possible.

Don't get me wrong, I'm very fond of naming intermediate things too. And a helpful IDE can even tell you the types of the intermediate things, so you can better understand the transformations the data undergoes as it flows through the pipeline.

But sometimes the type, which an IDE could also show automatically for |> syntax, matters even more for understanding than the name. VS Code does something like that for Rust when you chain method calls with a dot. Once you split the dot-chain across multiple lines, it shows you the type of the value produced on each line.

My personal objection to naming temporary values too much in a pipeline is that it obscures the distinction between what's being processed and what are just options of each processing step. But I suppose you might keep track of it by prefixing the names of temporary values with something.

> Or maybe you want to just start with the unnamed nested or chained calls, and when you need to read or debug or refactor or test your code you pay the "naming things" price tag at that point.

Yeah, that's usually what I do. I start with chains and split them and pay for the names as I go.

> That's often the first thing I do when I come across code with a dearth of names, I just give everything a boring, uncreative temporary name, and then I can do whatever work I showed up to this code to do.

I'm also splitting and naming stuff in that case and checking types along the way. But I prefer that to encountering code that is named verbosely and wrongly. Then I need to get rid of the names first to see the flow, then split it again sensibly. Of course I don't usually commit those changes in shared environments. Only in owned, inherited ones or if the point of my change is to refactor.

Granted, chaining class member accessors mostly covers up this problem of naming intermediate things if you use classes. That's why we even survived without pipe syntax. But since we would like to move away from classes a bit to explore other paradigms, maybe it's time?

Also the second example is easier to manipulate. You can hack in branches, logging etc. during development. I'm also not sure how the proposal tries to solve the problem that we can't easily pluck out members from an object in the first example. Will people just write something like `get(obj, "member")`? Or maybe they thought about this?
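
On the member question: as far as I understand the Hack-style proposal, the right-hand side can be any expression that uses the placeholder, so something like `|> %.member` would be allowed without a helper. In today's JavaScript the same pluck step can be sketched with a small function. `get`, `pipeline`, and the `order` object below are hypothetical, not part of any proposal or library.

```javascript
// Hypothetical helper: returns a step that plucks one property from its input.
const get = (key) => (obj) => obj[key];

// Hypothetical helper: feeds a starting value through each step in order.
const pipeline = (value, ...steps) =>
  steps.reduce((acc, step) => step(acc), value);

const order = { cake: { layers: 3 } };

const viaHelper = pipeline(order, get("cake"), get("layers"));
// A plain arrow function does the same job with no helper at all.
const viaArrow = pipeline(order, (o) => o.cake, (cake) => cake.layers);
```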
How about

    function bakeCake() {
      return do(
        () => gatherIngredients(),
        ingredients => mix(ingredients),
        batter => pour(batter, pan),
        batterInPan => bake(batterInPan, 350, 45),
        () => coolOff(bakedInPan),
        cooledCake => separateFromPan(cooledCake)
      );
    }
...which is just

    function bakeCake() {
      const ingredients = gatherIngredients();
      const batter = mix(ingredients);
      const batterInPan = pour(batter, pan);
      bake(batterInPan, 350, 45); // this is an in-place modifying function, I guess
      const cooledCake = coolOff(bakedInPan);
      return separateFromPan(cooledCake); 
    }
...but with an extra `do(...)` wrapper?

It could at least be

    function bakeCake() {
      return do(
        gatherIngredients,
        mix,
        batter => pour(batter, pan),
        batterInPan => bake(batterInPan, 350, 45),
        coolOff,
        separateFromPan
      );
    }
Although if we had function currying, the convention in ML languages is to put the most-commonly-piped-in param last for these functions:

    function bakeCake() {
      return do(
        gatherIngredients,
        mix,
        pour(pan), // assuming that pour(pan) returns a function that pours something into that pan
        bake(350, 45), // assuming that bake(temp, minutes) returns a function that bakes something at that temperature for that time
        coolOff,
        separateFromPan
      );
    }
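
For anyone who wants to run the curried version: `do` is a reserved word in JavaScript, so the sketch below calls the helper `pipeline` instead. The helper and every stub body are assumptions for illustration; `mix` is taken to need only the ingredients, as in the point-free call above.

```javascript
// Calls the first step with no input, then feeds each result to the next step.
function pipeline(first, ...rest) {
  return rest.reduce((value, step) => step(value), first());
}

// Hypothetical curried stubs, with the piped-in value as the last parameter.
const gatherIngredients = () => ["flour", "eggs", "sugar"];
const mix = (ingredients) => ({ stage: "batter", ingredients });
const pour = (pan) => (batter) => ({ ...batter, stage: "batterInPan", pan });
const bake = (temperature, minutes) => (batterInPan) =>
  ({ ...batterInPan, stage: "baked", temperature, minutes });
const coolOff = (bakedCake) => ({ ...bakedCake, stage: "cooled" });
// Drops the `pan` property and keeps everything else.
const separateFromPan = ({ pan, ...cake }) => ({ ...cake, stage: "finished" });

function bakeCake() {
  const pan = "round pan";
  return pipeline(
    gatherIngredients,
    mix,
    pour(pan),     // returns a function that pours batter into that pan
    bake(350, 45), // returns a function that bakes at that temperature and time
    coolOff,
    separateFromPan
  );
}
```

The trade-off is the same one argued over above: the steps read top to bottom, but no intermediate value has a name you can log or break on.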
What this reminds me of is those `Cat extends Animal` hierarchies... In these simple "real-world-inspired" examples it seems to make sense, but in programming I'd say a lot of times there's simply no good name for the intermediate steps.
In general, I think that when one does that, the code smell one is smelling isn't "This language isn't expressive enough; I need a third way to describe calling a function." It's "What I'm doing is actually complicated and I need to switch to describing it with a DSL, not adding more layers of frosting on this three-layer cake."