Hacker News new | ask | show | jobs
by heydenberk 1631 days ago
Something that people may not see immediately is that flatMap is more general than map and filter. Say, for a contrived example, that you'd like to filter out the even numbers in an array, and then double the odd numbers that remain. Instead of:

  [1, 2, 3, 4, 5].filter(n => n % 2 === 1).map(n => n * 2)
You can do:

  [1, 2, 3, 4, 5].flatMap(n => n % 2 === 1 ? [n * 2] : [])
Again, this is a contrived example, but I think it's interesting since the generality is not obvious (to me)
10 comments

For those wondering at home, the reason you shouldn't do this is immediately spelled out in the Mozilla docs for flatMap:

> Note, however, that this is inefficient and should be avoided for large arrays: in each iteration, it creates a new temporary array that must be garbage-collected, and it copies elements from the current accumulator array into a new array instead of just adding the new elements to the existing array.

Sure. I'm not suggesting it be used to this effect; I'm noting the generality as an interesting point.
I assumed you were aware of the issues, which is why I said those at home who might just take it at face value as a neat pattern. :)

I personally do actually use flatMap sometimes as outlined in your example - perf is an after issue and I'll refactor if the concise code isn't worth the cost.

I just filed this issue on the MDN page: https://github.com/mdn/content/issues/11763

That note is misleading.

It still creates a temporary [x, 2*x] array for every element though. This is an unavoidable problem with flatMap, while reduce can easily be changed to reuse the same accumulator array, making it twice as fast as flatMap and almost as fast as the simple for-loop approach.
My preference for when I have to do multiple operations on the same array (like filter then map) is "reduce". It's the most straightforward way other developers will understand what's going on.
Not that I disagree with your preference, in fact I share it. But this:

> It's the most straightforward way other developers will understand what's going on.

… has been the opposite of my experience. Both on the job (where I’ve always conceded to team preference for imperative loops) and observing the community (hating on reduce is a whole meme on JavaScript/TypeScript Twitter, and the contrary meme has never shown up at least on my feed).

And I honestly understand why it’s not very popular. Reduce/fold is a very FP concept which isn’t particularly idiomatic in real world JS. When I learned and embraced it (myself coming from a JS background), it took dozens of real uses before I felt like I had committed to my own memory what’s actually happening. And by then I think I was writing Clojure.

> My preference for when I have to do multiple operations on the same array (like filter then map) is "reduce". It's the most straightforward way other developers will understand what's going on.

There are definitely operations that are more intuitive as reduces, but any sequences consisting of exclusively filter, map, and/or flatMap are, IME and IMO, about the worst candidates.

(That said, because the JS filter, map, etc. operations are eager rather than lazy, sequenced operations produce intermediate arrays that may be undesirable especially with large datasets, so reduce can be desirable even without being more clear in intent.)

In addition to being essentially a combined "filter" and "map", it's also a "better" filter than filter itself in TypeScript in such that it narrows types much more ergonomically[0].

In TypeScript, you might have an array of multiple types (e.g. `Array<A | B>`), and use a `filter` call to only keep the `A`s. However, in many situations TypeScript can't figure this out and the resulting array type is still `Array<A | B>`. However, when you just use `flatMap` to do nothing more than filtering in the same way, TypeScript can determine that the resulting type is just `Array<A>`. It's a bit unfortunate really - `filter` is faster and more readable, but the ergonomics of `flatMap` type-wise are so much nicer! Just some interesting trivia.

[0]: https://github.com/microsoft/TypeScript/issues/16069#issueco...

I wonder if it is possible to add a feature to Typescript to help with this:

You could potentially add a syntax for type guards function types, then add a signature to filter that accepts a type guard and returns an array of the guarded types.

Shouldn't be too much of a stretch given that we have type guards.

The syntax is a bit annoying... should be something like filter<A, B>(cb: A => A is B)

:/

You can use a type guard[1] as an argument to Array.filter, but the function has to be explicitly typed as such.

I don't know why the type isn't narrowed in Array.filter like it is in if statements without this weird workaround.

  const array: (number | string)[] = [];
  
  const mixedArray = array.filter(value => typeof value === 'string');
  // mixedArray: (number | string)[]
  const arrayOfString = array.filter((value): value is string => typeof value === 'string');
  // arrayOfString: string[]
This example in Typescript playground: https://www.typescriptlang.org/play?#code/MYewdgzgLgBAhgJwXA...

[1]: https://www.typescriptlang.org/docs/handbook/advanced-types....

Oh so there is an overload!

filter<U extends T>(pred: (a: T) => a is U): U[];

Additionally, getting TS better at inferring type guards is an open issue (literally): https://github.com/microsoft/TypeScript/issues/38390

You could do that, but I’d argue using filter then map is more readable. What do empty arrays have to do with doubling even integers?
The interesting generalization is that once you realize that flatMap lets you map and filter at the same time is that you can generate arbitrary elements in the output list corresponding to each item in the input list. For example,

    ls.flatMap(x => {
      if (x < 0) {
         return [] 
      } else if (x == 0) {
         return [0]
      } else {
         return [Math.sqrt(x), -Math.sqrt(x)]
      }
    })
gives you all the real square roots from the original list, doing the mapping, flattening, and filtering all in one function call.
(just for laughs)

  ls.filter(o=>0<o).map(o=>o||[Math.sqrt(x, -Math.sqrt(x)])
I agree on the readability. There's a TC proposal floating around for a pipeline operator. I don't think it's moved but that would be a game changer.
>What do empty arrays have to do with doubling even integers

nothing, but they do have some relationship to 0, "" and Promise.resolve() - the array is handling the logic that will make the results be combined, not the doubling part

Filter and then map will iterate the list twice. JS really needs some iterator-based methods like Lodash where it will only go through the list once in this case.
You've discovered transducers (which I think have a rather horrible and confusing for newcomers higher-order function presentation in the language that popularized them, i.e. Clojure, when they could just be lists)! All transducers are is a function `x -> List(x)` and then you can use other functions such as `flatMap` to apply them (as your example illustrates nicely this is why map, filter, and its combination can all be described as single transducers).

You do have to make sure that your implementation of list is extremely efficient on zero and one element lists (ideally it generates no garbage at all in those cases) otherwise as other commentators have pointed out you'll have a lot of GC pressure.

And even though the transducer itself is `x -> List(x)` note that the `List` is only produced as an intermediate step and doesn't need to exist in the final product. You could apply a `x -> List(x)` to a generator for example and just "absorb" the list back into the resulting generator.

I'm not sure it's any more general considering that you have to return an array and also treat an empty array as a 'null' value.

Or to put it another way, if I reviewed code where someone used flatMap for anything other than lists of lists I'd be likely to suggest filter/map or reduce or some other convenient equivalent depending on the purpose of the code.

Something like Ruby's filter_map[0] would do the job, although not with this particular example (because 0 is truthy in Ruby).

[0] https://ruby-doc.org/core-3.1.0/Enumerable.html#method-i-fil...

    /** Return this symbol to skip the current value. */
    const SKIP = Symbol("mapFilter.SKIP");
    
    /**
     * @template T, R
     * @param {T[]} array
     * @param {(SKIP: Symbol, currentValue: T, index: number, array: T[]) => R|SKIP} callback return `SKIP` to filter out an element
     * @param {number} [begin] defaults to 0
     * @param {number} [end] defaults to `array.length`
     * @returns {R[]}
     */
    function mapFilter(array, callback, begin = 0, end = array.length) {
      const ret = [];
      for (let i = begin; i < end; i++) {
        const v = callback(SKIP, array[i], i, array);
        if (v !== SKIP) ret.push( /** @type {R} */ (v));
      }
      return ret;
    };
Here. Less than ten lines without JSDoc type annotations, twenty with them. It lets you slice, map and filter all in one call without allocating intermediate arrays like you would when chaining them, making it almost as fast as a plain for-loop. It's also easy to turn it into an in-place version, removing even the array allocation overhead.
It is also much less readable.
If you are looking for really general and powerful, then there is the mighty reduce:

    [1, 2, 3, 4, 5].reduce((x, y) => y % 2 === 1 ? [...x, y * 2] : x, [])
The spread operator looks cool and makes just returning the ternary operator work here but its performance implications are non-obvious (it's makin' copies). With reduce() you're really wanting something like this:

    [1, 2, 3, 4, 5].reduce((x, y) => { if (y % 2 === 1) x.push(y * 2); return x; }, [])
I've many times wished that push() would just return the array, it would make reduce() far easier for this sort of use case.
I guess...

  x.concat([y*2])
would return the array (but makes a duplicate)

Anyway, I find this to be a whole lot more sensible:

  x=[];
  for(y of [1,2,3,4,5]){
    if(y%2===1)x.push(y*2)
  }
Or even!

  y=[1,2,3,4,5];
  x=[];
  // map reduce/flatmap/map/filter etc omg wtf
  for( i=0; i < y.length; i++ ){
    if( y[i]%2 === 1 ){ x.push( y[i] * 2 ); }
  }
I cant even tell what language this is but there is nothing here that needs fixing.
Or… use reduce.
Yes, but you could also use `fold`^H^H^H^H`reduce`.

[1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? acc.push(2*n) : acc, [])

Push returns the length of the array, though, so that won't work.
Comma operator to the rescue:

    (acc.push(2*n), acc)
In an expression position, the push statement will be executed, then its return will be discarded, and the final expression will be the result of the expression. Is it “better”? Almost certainly not. But it lets you stay in expression syntax while executing statements. (Much more useful for logging than meaningful runtime side effects IMO, but I think it should be more widely known in general.)

Edit: and I’m glad to see another reference to it down thread!

Ah, sorry, so you have to concat the arrays using `concat`.

    [1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? acc.concat([2*n]) : acc, [])
Wouldn't recommend doing this - if the original array is of significant length this'll get quite slow because `acc.concat` has to create a brand new array of slightly longer length on each iteration it's called. Better to just use `push` like you suggested before and then return the array if you want to use `reduce`.
Yes, of course, that's why I used `push` at first.
Use the comma operator: (acc.push(2*n), acc) will return acc. Or e.g.

  [1, 2, 3, 4, 5].reduce((acc, n) => (n % 2 ? acc.push(2*n) : null, acc), [])
Surely the spread operator is nicer here?

    [1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? [ ...acc, 2 * n ] : acc, [])
This is not efficient. Each iteration creates a new array instance due to the spread operator.
`acc.concat()` also creates a new array instance, so I don't get your point.
Hilariously enough, I think reduce would be much more palatable to JS devs if it was named fold. Not because the familiarity would jump out but because it’s a much more evocative name that could make visualizing its behavior more intuitive.
What is `f`reduce`?