Hacker News new | ask | show | jobs
by ketzo 2015 days ago
Is it crazy if I think that first example is the most readable of the three?

In fairness, I've done a lot of work in TS and exactly none in Rust, so this is totally biased, but at the very least, it seems like all you're getting in the next examples is two fewer lines of code, in return for assuming that the reader is familiar with 1) `Option<x>` 2) `.ok_or` 3) the syntax of the third block which I don't remotely get.

Genuine question, how comparable are these things to understanding "File | null" in TS, which I would consider day 1 learning?

6 comments

> Is it crazy if I think that first example is the most readable of the three?

IMO, sort of!

All abstraction requires you to understand it at some level before you can quickly reason about it in code. But once you do, it allows you to reason about things at a higher level, rather than at a level where you have to focus on each detail individually. This is a net win for good abstractions that are (generally) simple and minimally leaky, but it can be a net loss for abstractions that are complicated.

You see this same conversation play out with functional looping constructs vs. imperative ones. Which is more readable?

    for (int i = 0; i < a.len(); i++) {
        a[i] = 0;
    }
    
    // vs.
    
    a.map! { 0 }
If you don't know what `map!` does, the former. And many people argue for this for the sake of "simplicity".

But when you understand functional iteration—which is generally a simple, non-leaky abstraction—the latter wins by a mile. And while you might look at this example and think "you're just saving a few lines of code", the latter not only completely eliminates entire classes of problems (off-by-one errors, slow calculation of `len()` for e.g., NUL-terminated strings, etc.) but knowing it also unlocks a bunch of additional useful tools like `reduce`, `filter`, and friends that reduce reams of boilerplate throughout your code while dramatically improving comprehensibility.
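
To make that concrete, here is a rough Rust sketch (the snippet above is C-like and Ruby-like, so the language switch and all function names here are assumptions for illustration). It shows the index loop next to the iterator version, plus a small `filter`/`map`/`fold` chain of the kind mentioned above.

```rust
/// Zero every element with an index loop; the bounds are written by hand.
fn zero_with_index(a: &mut [i32]) {
    for i in 0..a.len() {
        a[i] = 0;
    }
}

/// Zero every element through an iterator; there is no index to get wrong.
fn zero_with_iter(a: &mut [i32]) {
    a.iter_mut().for_each(|x| *x = 0);
}

/// Sum of the squares of the even elements, built from `filter`, `map`, and `fold`.
fn sum_even_squares(a: &[i32]) -> i32 {
    a.iter()
        .filter(|x| **x % 2 == 0)
        .map(|x| x * x)
        .fold(0, |acc, x| acc + x)
}
```

Both zeroing functions do the same thing; only the first one has a range and an index expression that can be mistyped.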

The same is true of `Option<T>` and `Result<T, E>`. They're wildly powerful and allow for rapid understanding of code without having to read and parse if-else branching to confirm that the logic is performing null checking or error handling (and more importantly, doing it correctly).

So you're not crazy for thinking the first example is the most readable of the three given your current knowledge. But you are crazy if you think the first example is preferable to learning the relatively simple abstractions of `Option<T>` and `Result<T, E>`, which allow reasoning about things at a higher level and unlock extremely powerful tools in doing so.
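
As a small illustration of the point about branching, here is a sketch in Rust. `lookup_port` and both wrapper functions are hypothetical names made up for this example; only `Option`, `Result`, and `ok_or_else` come from the standard library.

```rust
/// Hypothetical lookup: returns the port for a known service name.
fn lookup_port(name: &str) -> Option<u16> {
    match name {
        "http" => Some(80),
        "https" => Some(443),
        _ => None,
    }
}

/// Branching version: the absence check is spelled out by hand.
fn port_or_error_branching(name: &str) -> Result<u16, String> {
    match lookup_port(name) {
        Some(port) => Ok(port),
        None => Err(format!("unknown service: {}", name)),
    }
}

/// Combinator version: `ok_or_else` turns `None` into an `Err` in one step.
fn port_or_error(name: &str) -> Result<u16, String> {
    lookup_port(name).ok_or_else(|| format!("unknown service: {}", name))
}
```

The two wrappers behave identically; the second just names the pattern instead of re-deriving it each time.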

A counterpoint: Option and Result are hard to read and unpleasant to work with. They are so annoying, that the language designers extended the language itself to make them tolerable - do notation in Haskell, '?' operator in Rust, guard-let in Swift.

A plain old for loop has so much to recommend it. You get powerful control flow constructs (break/continue/return) and obvious performance characteristics.

Can you tweak your function to stop filling the first time a zero is encountered? That's a simple one-line change with the for loop, but a puzzle for the functional iteration.

> A counterpoint: Option and Result are hard to read and unpleasant to work with. They are so annoying, that the language designers extended the language itself to make them tolerable - do notation in Haskell, '?' operator in Rust, guard-let in Swift.

Option and Result are tolerable without language extensions (as long as your language has first-class functions and parameterized types, which you want anyway) - https://fsharpforfunandprofit.com/posts/recipe-part2/ . do notation is a small, purely-syntactic piece of sugar that's usable for many different cases, not just error handling.

> A plain old for loop has so much to recommend it. You get powerful control flow constructs (break/continue/return) and obvious performance characteristics.

Those are all language extensions! You're talking about adding four keywords to the language, none of which are anywhere near as reusable as do notation.

> Can you tweak your function to stop filling the first time a zero is encountered? That's a simple one-line change with the for loop, but a puzzle for the functional iteration.

Different code should look different. map, reduce, fold, traverse, foldM are all different functions that do different things, but they're easy to work with because they're all normal functions that obey the rules of functions (and if you're ever confused you can just click through to the implementation in plain old code). Languages don't want to offer several different variants of "for" because it's a language keyword that has to be supported at the language level, but the result is that the "for" loop is far from simple - it does several different things depending on how exactly it's used, and you can't tell which except by going through the details every time.

Fair points - Haskell is all-in on these ideas, and agreed that do-notation has power well beyond Result and Option. I regret including do-notation in my critique, maybe list comprehensions instead.

I think we disagree on what "different code" ought to mean. I have a C function which multiplies a list of numbers; I make it immediately `return 0` if it hits zero. In C that's the same function, just optimized; in Haskell it's a breaking API change due to laziness. I suppose the languages reflect that difference.
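
For reference, the multiply-with-early-exit function can be sketched in Rust both ways (Rust is my substitution here, since the comment talks about C and Haskell; the function names are made up). The iterator version uses the standard `try_fold`, which stops at the first `None`.

```rust
/// Loop version: returns 0 as soon as a zero is seen.
fn product_loop(xs: &[i64]) -> i64 {
    let mut acc = 1;
    for &x in xs {
        if x == 0 {
            return 0;
        }
        acc *= x;
    }
    acc
}

/// Iterator version: `try_fold` stops at the first `None`,
/// so the early exit shows up in the name of the function being called.
fn product_try_fold(xs: &[i64]) -> i64 {
    xs.iter()
        .try_fold(1i64, |acc, &x| if x == 0 { None } else { Some(acc * x) })
        .unwrap_or(0)
}
```

Both return the same values; the difference is whether the short-circuit is an extra branch inside the loop body or part of the combinator you picked.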

I don't think it's about laziness; 99% of the time bailing out of a list operation early vs processing the entire list is a semantic difference that I want to be able to see when I'm reading the code. If what you want to do is really and truly just a performance optimization then the language runtime should be able to do it.
A for loop's greatest strength is its greatest weakness: you can do anything, including in most languages mutating the index variables (god forbid..!), deeply nesting with mutable data, and so on. I usually find I can understand a map quicker than a loop, because a map requires that the code be simpler in most cases.

Obviously in the trivial case though a loop is very easy to understand. The trivial case usually, in my experience, involves iterating through the entirety of an array though anyway, in which case the map is usually more concise.

YMMV! :)

Yeah, trivial cases are trivial everywhere.

I like your strength/weakness observation. But if you are mutating index variables, you have a hard case, and it probably will be easier to express with a manual loop than trying to shoehorn it into awkward functional constructs.

An example is the "discard elements by a predicate" function, aka Vec::retain in Rust, std::remove in C++. Rust implements this using a for loop. Maybe it can be done with functional constructs, but it would be harder to write and to understand.

https://doc.rust-lang.org/src/alloc/vec.rs.html#1105

But Vec::retain is pretty close to one of the fundamental functional constructs for containers (it's the in-place version of filter). The argument here is that others should use functions like retain instead of re-implementing that nasty indexing logic. But I don't think it should be surprising that array-like containers are going to have to involve some indexing logic at some level.
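
A minimal sketch of that relationship, with hypothetical helper names: `Vec::retain` filters in place, while `filter` plus `collect` builds a new vector with the same contents.

```rust
/// Keeps only the even numbers, in place, via `Vec::retain`.
fn keep_even_in_place(v: &mut Vec<i32>) {
    v.retain(|x| x % 2 == 0);
}

/// Builds a new vector of the even numbers via `filter`.
fn keep_even_copy(v: &[i32]) -> Vec<i32> {
    v.iter().copied().filter(|x| x % 2 == 0).collect()
}
```

Callers of either function never touch an index; the shifting logic lives once, inside the standard library.
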
Most languages have something like:

    for (ElemType x : collection) {
        // use x here
    }
or:

    for (elem : collection) {
        // use here
    }
for dynamic languages or statically typed ones with lots of inference

Not sure what you can do with this that:

    collections.each(x -> {
        // use here
    })
can't.
Those look like Java loops, and I don't recall them being much more than sugar over a for loop (although presumably they're implementing some Iterable class and apply to more than just arrays).

Either way, like most things in functional programming, it's often about restricting your functions so that they're easier to reason about. There's nothing special about the map function really in any pragmatic sense except when used in conjunction with the other features a good language affords.

I'm also curious if you think something like:

  listOfListsmap : List (List Int) -> List (List Int)
  listOfListsmap =
    (List.map << List.map) (\n -> n + 1)
is easier to understand at a quick glance than a rough equivalent using those loops above:

  collections.forEach(x -> {
    // anything could happen here to x before the inner loop processes it,
    // and since we're probably dealing with mutable variables, the inner loop can
    // presumably access things I put here (which may or may not be a problem,
    // but is something you have to think about)
    x.forEach(y -> {
      // do something with y; we can do anything with x *and* y here
      y = y + 1;
    });
  });
knowing that << is the composition function.
> Can you tweak your function to stop filling the first time a zero is encountered? That's a simple one-line change with the for loop, but a puzzle for the functional iteration.

Yes, and that's exactly why you should use constructs like map when possible. With a manual loop you have to scan more code to verify that it's not doing something more complicated.

To drive parent's point home a bit, I skimmed both examples above when reading and without looking again I am very certain that the map example does not have any weird iteration semantics (eg "stop filling the first time a zero is encountered"), but I'm not sure that the for loop example is similarly 'normal' -- I'd have to check the condition again more carefully.
> Can you tweak your function to stop filling the first time a zero is encountered? That's a simple one-line change with the for loop, but a puzzle for the functional iteration.

In Rust:

  a.iter_mut().take_while(|i| **i != 0).for_each(|i| *i = 0);
And it's as fast as a hand-written for loop.
A counter-counterpoint: classic for loops are hard to read and unpleasant to work with. They are so annoying, that the language designers extended the language/standard libraries to make them tolerable - array.map in ECMAScript 5, for (x : y) in Java 5 and C++11, foreach in PHP 4, LINQ extensions in C# 3.

A plain old functor/monad has so much to recommend it. You get a uniform way to iterate/transform all kinds of collections, even those which don't support indexing. They can also be used without mutation and are easy to typecheck. In languages with higher-kinded types they can be abstracted over without knowing anything about how the underlying data is structured.
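
Rust has no higher-kinded types, so this is only an approximation of the idea, but the "works on collections that don't support indexing" part can be sketched with the standard `IntoIterator` trait. The function name `doubled` is made up for the example.

```rust
/// Doubles every element of any iterable collection; no indexing is required.
fn doubled<I>(items: I) -> Vec<i32>
where
    I: IntoIterator<Item = i32>,
{
    items.into_iter().map(|x| x * 2).collect()
}
```

The same function accepts a `Vec<i32>`, a `LinkedList<i32>`, or a `BTreeSet<i32>`, none of which share an indexing API.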

> A plain old for loop has so much to recommend it. You get powerful control flow constructs (break/continue/return) and obvious performance characteristics.

I spent 3.5 hours with a junior team member yesterday refactoring break and continue out of her loops. The code was so much more readable afterwards.

In my experience `map` and `filter` are easy to grok even for people inexperienced with FP, but `reduce` is always a headache to parse even if you know how to read it. A loop that accumulates some result is almost always better.
"reduce" (or fold) certainly takes a while getting used to, but once you gain some intuition for it, it is pretty reusable and it avoids all the problems mentioned above such as off-by one errors etc.

Also, if you use map + reduce, it's totally trivial to parallelise the map part and have the main thread combine the partial updates, since there is no shared mutable state. Java, for example, offers parallel streams that support this with basically zero effort. If you use loops, it can be quite hard to rewrite the code so that it is thread-safe.
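
The comment mentions Java's parallel streams; as a rough equivalent, here is a sketch in Rust using only the standard library's scoped threads (`std::thread::scope`, available since Rust 1.63). The function name and the chunking strategy are my own assumptions, not anything from the thread.

```rust
use std::thread;

/// Sums the squares of `xs`, mapping each chunk on its own thread and
/// combining the partial sums on the calling thread.
fn parallel_sum_of_squares(xs: &[u64], workers: usize) -> u64 {
    // Round the chunk length up, and keep it at least 1 because `chunks(0)` panics.
    let workers = workers.max(1);
    let chunk_len = ((xs.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Spawn every worker first; collecting forces the lazy iterator to run.
        let handles: Vec<_> = xs
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(|x| x * x).sum::<u64>()))
            .collect();
        // Reduce step: join each worker and add up the partial sums.
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}
```

Each worker only reads its own chunk and returns a value, so there is no shared mutable state to protect.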

I usually like the FP style better, however that's not the best example. You're using a for loop rather than a foreach loop - admittedly C doesn't have foreach, but most other imperative languages have some variation of it, even BCPL had it.

foreach/map have the advantage of working based on array size information, while for needs to be told. That gives the FP version an unrelated advantage.

Switch to foreach, and none of the 'classes of problems' you mention apply to the imperative version of the code.

It is the most readable to you because you are used to null. If you're reading Rust, then it's a pretty good assumption that you'll be used to Option<x>.

Null is sloppy because it is the only way in many languages to express something like a union type (in Rust terms, an "enum"). So it gets overloaded to convey information that would be more accurately conveyed by a union type.

The Rust equivalent of null, Option, is just another enum and you can handle it with the standard enum tooling the language gives you such as pattern matching. It also makes you stop and think about what you are doing if you're returning it. In most cases, your intent can be more clearly expressed with an enum other than Option.
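
A sketch of what "an enum other than Option" might look like. The `Upload` type and its variants are hypothetical, invented only to show how separate cases that null would lump together can each get a name.

```rust
/// Hypothetical input states that a bare `Option<String>` would collapse into `None`.
#[derive(Debug, PartialEq)]
enum Upload {
    Received(String),
    NotProvided,
    TooLarge { bytes: u64 },
}

/// Pattern matching forces every case to be handled explicitly.
fn describe(upload: &Upload) -> String {
    match upload {
        Upload::Received(name) => format!("parsing {}", name),
        Upload::NotProvided => "no file was provided".to_string(),
        Upload::TooLarge { bytes } => format!("file of {} bytes is too large", bytes),
    }
}
```

If a new variant is added later, the `match` stops compiling until the new case is handled.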

This contrived example would be less likely to appear in Rust. Why is there a function called parseFile taking something that might not even be a file?

As an aside, the Rust code could also be written like this:

        fn parse_file(file: Option<File>) -> Result<(), String> {
            if let None = file {
                return Err("File must exist".into());
            }
            // ...
        }
> As an aside, the Rust code could also be written like this

Sadly it can't really, because Rust doesn't have control-flow based type refinement like TypeScript. In TS the type of `file` after the throw is `File`. In Rust it stays `Option<File>`, so you would have to unsafely unwrap it below the if block.

(Unwrap is not unsafe, and in fact, in this code, you would even know that it can never panic. That being said, you're not wrong that this is nicer in TS. You could invert the condition and add an else and it would be not too terrible. I'd still probably go for a ? style.)
Rust’s if-let construct handles that in a nice, explicit, and compact way.
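
A minimal sketch of the inverted-condition version with `if let`. The `File` struct here is a hypothetical stand-in with a single field, and the function returns the name length only so there is something to observe.

```rust
/// Stand-in for a real file handle; hypothetical.
struct File {
    name: String,
}

fn parse_file(file: Option<File>) -> Result<usize, String> {
    // Binding in the pattern does the "narrowing": inside this branch `file` is a `File`.
    if let Some(file) = file {
        Ok(file.name.len())
    } else {
        Err("File must exist".into())
    }
}
```

No `unwrap` is needed, because the inner value is only ever named in the branch where it exists.
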
Why use `if let None = file` instead of `if file.is_none()`?
I realize the shadowing might make the Rust example unnecessarily confusing. With overly pedantic names, it might look like:

    fn parse_file(file_from_input: Option<File>) -> Result<(), Error> {
      let file = file_from_input.ok_or(Error::new("File must exist"))?;
      // ...
      Ok(())
    }
What I like about the Rust version is that it explicitly unwraps the argument and assigns it to a new variable. In the TypeScript one, the if statement allows the inference algorithm to determine that `file` is a `File` and not a `File | null`. That's a testament to the TypeScript team's efforts, but it's a little less ergonomic (in my view) that variables can change their type without getting mutated or changed in any way.

For instance, if I were to open up this file in emacs with no language server, I'd get no help. I'd have to trace over the file and act like the TypeScript checker, thinking "oh okay so this null check ensures that file cannot be null, therefore it's inferred as File". This is clearly simple, but for other cases it's not as easy. Whereas with the Rust code, I know that my argument, `file_from_input`, is an `Option<File>`. `file_from_input.ok_or(Error::new(..))` makes it a `Result<File, Error>`. The `?` operator makes it a `File`. Each step produces a consistent type. At no point do I have to understand the inference algorithm to determine what the type may be.
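
Spelled out with one annotated binding per step, the chain might look like the sketch below. `File` and `Error` are hypothetical stand-ins defined just for this example, and the intermediate variable names are mine.

```rust
/// Hypothetical stand-ins for the `File` and `Error` types discussed above.
#[derive(Debug, PartialEq)]
struct File;

#[derive(Debug, PartialEq)]
struct Error(String);

fn parse_file(file_from_input: Option<File>) -> Result<File, Error> {
    // Step 1: the argument is an Option<File>.
    let maybe_file: Option<File> = file_from_input;
    // Step 2: ok_or turns it into a Result<File, Error>.
    let file_or_error: Result<File, Error> =
        maybe_file.ok_or(Error("File must exist".to_string()));
    // Step 3: `?` returns early on Err and otherwise yields a File.
    let file: File = file_or_error?;
    Ok(file)
}
```

Every binding keeps the type it was declared with; getting a different type always means a new binding.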

That said it's totally cool if you find the TypeScript version more readable :D. It's not my place to say what's readable or not readable to you.

Agree. It's more about the usefulness of moving to the non-Optional type and the null safety in following lines, rather than the readability of the ok_or.
The former code is more equivalent to:

  let file = match file {
    Some(file) => file,
    None => panic!("file was null"),
  };
Which you'd never write in practice since `expect` exists
Can you say more here, I don't have the context.
This code is identical to

  let file = file.expect("file was null");
That is, the "expect" method does exactly this.
I like Rust from what I've tried. But I'd have preferred something like this:

  let file = file.unwrapOrPanicWith("file was null");
option.expect(message) doesn't semantically make sense.
There was some discussion about this when the method was named; you're not alone. In the end, it is what it is. You could define your own method like that if you really wanted to.
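
Defining your own method could look roughly like this extension trait. The trait and method names are hypothetical, taken from the suggestion above and converted to snake_case; the body just forwards to the standard `expect`.

```rust
/// Hypothetical extension trait adding a more descriptive name for `expect`.
trait UnwrapOrPanicWith<T> {
    fn unwrap_or_panic_with(self, message: &str) -> T;
}

impl<T> UnwrapOrPanicWith<T> for Option<T> {
    fn unwrap_or_panic_with(self, message: &str) -> T {
        // Delegates to `expect`, which panics with `message` on `None`.
        self.expect(message)
    }
}
```

With the trait in scope, `file.unwrap_or_panic_with("file was null")` reads the way the parent comment wanted.
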
I read it as expect (prepare) for the worst!
> Is it crazy if I think that first example is the most readable of the three?

No. That's simply the difference between imperative and functional code. If you don't know the latter you could never understand this.

For the most part, the last example is equivalent to `let result = file && parse(file);`, but there are important differences to consider in JavaScript at runtime.

Interesting, okay, makes sense. Thanks for the explanation.
The block syntax is comparable to understanding File | null - it's day 1 learning in Rust. The part I'd say you're missing is that Option is just a plain old type and ok_or is just a plain old function, so if you don't know what they're doing you can always just click through to their definitions (which are in normal plain old code) and read them. Even if you don't do that, you know that they're not doing any magical control flow things. Whereas to understand the first example I have to understand what "if" does and how having a bunch of statements after each other works, and there's nothing to tell me that anywhere.
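
To illustrate the "plain old type, plain old function" point, here is a hand-rolled sketch. `Maybe` is a hypothetical stand-in, not the standard library's actual source, but its `ok_or` has the same shape as `Option::ok_or`.

```rust
/// A hand-rolled stand-in for `Option`, to show that nothing about it is special.
#[derive(Debug, PartialEq)]
enum Maybe<T> {
    Just(T),
    Nothing,
}

impl<T> Maybe<T> {
    /// Same shape as `Option::ok_or`: an ordinary method built on `match`.
    fn ok_or<E>(self, err: E) -> Result<T, E> {
        match self {
            Maybe::Just(value) => Ok(value),
            Maybe::Nothing => Err(err),
        }
    }
}
```

There is no hidden control flow here: the method takes a value and returns a value.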