| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by heydenberk 1631 days ago

Something that people may not see immediately is that flatMap is more general than map and filter. Say, for a contrived example, that you'd like to filter out the even numbers in an array, and then double the odd numbers that remain. Instead of:

  [1, 2, 3, 4, 5].filter(n => n % 2 === 1).map(n => n * 2)

You can do:

  [1, 2, 3, 4, 5].flatMap(n => n % 2 === 1 ? [n * 2] : [])

Again, this is a contrived example, but I think it's interesting since the generality is not obvious (to me)

10 comments

Vanit 1631 days ago

For those wondering at home, the reason you shouldn't do this is immediately spelled out in the Mozilla docs for flatMap:

> Note, however, that this is inefficient and should be avoided for large arrays: in each iteration, it creates a new temporary array that must be garbage-collected, and it copies elements from the current accumulator array into a new array instead of just adding the new elements to the existing array.

heydenberk 1631 days ago

Sure. I'm not suggesting it be used to this effect; I'm noting the generality as an interesting point.

Vanit 1631 days ago

I assumed you were aware of the issues, which is why I said those at home who might just take it at face value as a neat pattern. :)

I personally do actually use flatMap sometimes as outlined in your example - perf is an after issue and I'll refactor if the concise code isn't worth the cost.

hildjj 1631 days ago

I just filed this issue on the MDN page: https://github.com/mdn/content/issues/11763

That note is misleading.

edflsafoiewq 1631 days ago

It still creates a temporary [x, 2*x] array for every element though. This is an unavoidable problem with flatMap, while reduce can easily be changed to reuse the same accumulator array, making it twice as fast as flatMap and almost as fast as the simple for-loop approach.

Osiris 1631 days ago

My preference for when I have to do multiple operations on the same array (like filter then map) is "reduce". It's the most straightforward way other developers will understand what's going on.

eyelidlessness 1630 days ago

Not that I disagree with your preference, in fact I share it. But this:

> It's the most straightforward way other developers will understand what's going on.

… has been the opposite of my experience. Both on the job (where I’ve always conceded to team preference for imperative loops) and observing the community (hating on reduce is a whole meme on JavaScript/TypeScript Twitter, and the contrary meme has never shown up at least on my feed).

And I honestly understand why it’s not very popular. Reduce/fold is a very FP concept which isn’t particularly idiomatic in real world JS. When I learned and embraced it (myself coming from a JS background), it took dozens of real uses before I felt like I had committed to my own memory what’s actually happening. And by then I think I was writing Clojure.

dragonwriter 1630 days ago

> My preference for when I have to do multiple operations on the same array (like filter then map) is "reduce". It's the most straightforward way other developers will understand what's going on.

There are definitely operations that are more intuitive as reduces, but any sequences consisting of exclusively filter, map, and/or flatMap are, IME and IMO, about the worst candidates.

(That said, because the JS filter, map, etc. operations are eager rather than lazy, sequenced operations produce intermediate arrays that may be undesirable especially with large datasets, so reduce can be desirable even without being more clear in intent.)

jasonkillian 1631 days ago

In addition to being essentially a combined "filter" and "map", it's also a "better" filter than filter itself in TypeScript in such that it narrows types much more ergonomically[0].

In TypeScript, you might have an array of multiple types (e.g. `Array<A | B>`), and use a `filter` call to only keep the `A`s. However, in many situations TypeScript can't figure this out and the resulting array type is still `Array<A | B>`. However, when you just use `flatMap` to do nothing more than filtering in the same way, TypeScript can determine that the resulting type is just `Array<A>`. It's a bit unfortunate really - `filter` is faster and more readable, but the ergonomics of `flatMap` type-wise are so much nicer! Just some interesting trivia.

[0]: https://github.com/microsoft/TypeScript/issues/16069#issueco...

amitport 1631 days ago

I wonder if it is possible to add a feature to Typescript to help with this:

You could potentially add a syntax for type guards function types, then add a signature to filter that accepts a type guard and returns an array of the guarded types.

Shouldn't be too much of a stretch given that we have type guards.

The syntax is a bit annoying... should be something like filter<A, B>(cb: A => A is B)

:/

abrioy 1631 days ago

You can use a type guard[1] as an argument to Array.filter, but the function has to be explicitly typed as such.

I don't know why the type isn't narrowed in Array.filter like it is in if statements without this weird workaround.

  const array: (number | string)[] = [];
  
  const mixedArray = array.filter(value => typeof value === 'string');
  // mixedArray: (number | string)[]
  const arrayOfString = array.filter((value): value is string => typeof value === 'string');
  // arrayOfString: string[]

This example in Typescript playground: https://www.typescriptlang.org/play?#code/MYewdgzgLgBAhgJwXA...

[1]: https://www.typescriptlang.org/docs/handbook/advanced-types....

amitport 1630 days ago

Oh so there is an overload!

filter<U extends T>(pred: (a: T) => a is U): U[];

Additionally, getting TS better at inferring type guards is an open issue (literally): https://github.com/microsoft/TypeScript/issues/38390

niek_pas 1631 days ago

You could do that, but I’d argue using filter then map is more readable. What do empty arrays have to do with doubling even integers?

lalaithion 1631 days ago

The interesting generalization is that once you realize that flatMap lets you map and filter at the same time is that you can generate arbitrary elements in the output list corresponding to each item in the input list. For example,

    ls.flatMap(x => {
      if (x < 0) {
         return [] 
      } else if (x == 0) {
         return [0]
      } else {
         return [Math.sqrt(x), -Math.sqrt(x)]
      }
    })

gives you all the real square roots from the original list, doing the mapping, flattening, and filtering all in one function call.

6510 1631 days ago

(just for laughs)

  ls.filter(o=>0<o).map(o=>o||[Math.sqrt(x, -Math.sqrt(x)])

nefitty 1631 days ago

I agree on the readability. There's a TC proposal floating around for a pipeline operator. I don't think it's moved but that would be a game changer.

frozenlettuce 1631 days ago

>What do empty arrays have to do with doubling even integers

nothing, but they do have some relationship to 0, "" and Promise.resolve() - the array is handling the logic that will make the results be combined, not the doubling part

hajile 1631 days ago

Filter and then map will iterate the list twice. JS really needs some iterator-based methods like Lodash where it will only go through the list once in this case.

dwohnitmok 1631 days ago

You've discovered transducers (which I think have a rather horrible and confusing for newcomers higher-order function presentation in the language that popularized them, i.e. Clojure, when they could just be lists)! All transducers are is a function `x -> List(x)` and then you can use other functions such as `flatMap` to apply them (as your example illustrates nicely this is why map, filter, and its combination can all be described as single transducers).

You do have to make sure that your implementation of list is extremely efficient on zero and one element lists (ideally it generates no garbage at all in those cases) otherwise as other commentators have pointed out you'll have a lot of GC pressure.

And even though the transducer itself is `x -> List(x)` note that the `List` is only produced as an intermediate step and doesn't need to exist in the final product. You could apply a `x -> List(x)` to a generator for example and just "absorb" the list back into the resulting generator.

ljm 1631 days ago

I'm not sure it's any more general considering that you have to return an array and also treat an empty array as a 'null' value.

Or to put it another way, if I reviewed code where someone used flatMap for anything other than lists of lists I'd be likely to suggest filter/map or reduce or some other convenient equivalent depending on the purpose of the code.

Something like Ruby's filter_map[0] would do the job, although not with this particular example (because 0 is truthy in Ruby).

[0] https://ruby-doc.org/core-3.1.0/Enumerable.html#method-i-fil...

vanderZwan 1630 days ago

    /** Return this symbol to skip the current value. */
    const SKIP = Symbol("mapFilter.SKIP");
    
    /**
     * @template T, R
     * @param {T[]} array
     * @param {(SKIP: Symbol, currentValue: T, index: number, array: T[]) => R|SKIP} callback return `SKIP` to filter out an element
     * @param {number} [begin] defaults to 0
     * @param {number} [end] defaults to `array.length`
     * @returns {R[]}
     */
    function mapFilter(array, callback, begin = 0, end = array.length) {
      const ret = [];
      for (let i = begin; i < end; i++) {
        const v = callback(SKIP, array[i], i, array);
        if (v !== SKIP) ret.push( /** @type {R} */ (v));
      }
      return ret;
    };

Here. Less than ten lines without JSDoc type annotations, twenty with them. It lets you slice, map and filter all in one call without allocating intermediate arrays like you would when chaining them, making it almost as fast as a plain for-loop. It's also easy to turn it into an in-place version, removing even the array allocation overhead.

kaba0 1631 days ago

It is also much less readable.

newlisp 1631 days ago

If you are looking for really general and powerful, then there is the mighty reduce:

    [1, 2, 3, 4, 5].reduce((x, y) => y % 2 === 1 ? [...x, y * 2] : x, [])

femto113 1631 days ago

The spread operator looks cool and makes just returning the ternary operator work here but its performance implications are non-obvious (it's makin' copies). With reduce() you're really wanting something like this:

    [1, 2, 3, 4, 5].reduce((x, y) => { if (y % 2 === 1) x.push(y * 2); return x; }, [])

I've many times wished that push() would just return the array, it would make reduce() far easier for this sort of use case.

6510 1630 days ago

I guess...

  x.concat([y*2])

would return the array (but makes a duplicate)

Anyway, I find this to be a whole lot more sensible:

  x=[];
  for(y of [1,2,3,4,5]){
    if(y%2===1)x.push(y*2)
  }

Or even!

  y=[1,2,3,4,5];
  x=[];
  // map reduce/flatmap/map/filter etc omg wtf
  for( i=0; i < y.length; i++ ){
    if( y[i]%2 === 1 ){ x.push( y[i] * 2 ); }
  }

I cant even tell what language this is but there is nothing here that needs fixing.

nesarkvechnep 1631 days ago

Or… use reduce.

ReleaseCandidat 1631 days ago

Yes, but you could also use `fold`^H^H^H^H`reduce`.

[1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? acc.push(2*n) : acc, [])

christophilus 1631 days ago

Push returns the length of the array, though, so that won't work.

eyelidlessness 1630 days ago

Comma operator to the rescue:

    (acc.push(2*n), acc)

In an expression position, the push statement will be executed, then its return will be discarded, and the final expression will be the result of the expression. Is it “better”? Almost certainly not. But it lets you stay in expression syntax while executing statements. (Much more useful for logging than meaningful runtime side effects IMO, but I think it should be more widely known in general.)

Edit: and I’m glad to see another reference to it down thread!

ReleaseCandidat 1631 days ago

Ah, sorry, so you have to concat the arrays using `concat`.

    [1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? acc.concat([2*n]) : acc, [])

jasonkillian 1631 days ago

Wouldn't recommend doing this - if the original array is of significant length this'll get quite slow because `acc.concat` has to create a brand new array of slightly longer length on each iteration it's called. Better to just use `push` like you suggested before and then return the array if you want to use `reduce`.

ReleaseCandidat 1631 days ago

Yes, of course, that's why I used `push` at first.

jacobolus 1631 days ago

Use the comma operator: (acc.push(2*n), acc) will return acc. Or e.g.

  [1, 2, 3, 4, 5].reduce((acc, n) => (n % 2 ? acc.push(2*n) : null, acc), [])

spiralx 1631 days ago

Surely the spread operator is nicer here?

    [1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? [ ...acc, 2 * n ] : acc, [])

v413 1631 days ago

This is not efficient. Each iteration creates a new array instance due to the spread operator.

spiralx 1625 days ago

`acc.concat()` also creates a new array instance, so I don't get your point.

eyelidlessness 1630 days ago

Hilariously enough, I think reduce would be much more palatable to JS devs if it was named fold. Not because the familiarity would jump out but because it’s a much more evocative name that could make visualizing its behavior more intuitive.

pwdisswordfish9 1631 days ago

What is `f`reduce`?