Hacker News new | ask | show | jobs
by chrismorgan 429 days ago
Sorry, but your comment is completely, completely wrong.

> .slice[0] does not allocate, nor does .filter[1], only map does.. so one allocation.

This is simply not true. I presume you’re misunderstanding “shallow copy” in the MDN docs; it’s pretty poor wording, in my opinion: it means shallow copies of the items, it’s not about the array; it’s not like a live collection, they all do create a new Array.

Array.prototype.slice is specified in a way that must allocate for the array, but since it’s a contiguous slice it could also be implemented as copy-on-write, only allocating a whole new chunk of memory to back the array when either array is modified and so they diverge. I don’t know for certain what engines do, but my weak guess is that browser engines will all behave that way. I’m fairly sure that for strings they all work that way. (Aside: such an optimisation would also require that the slice() caller be an Array. Most of the array methods are deliberately designed to work on array-like objects: not just Arrays, but anything with a length property and numbered properties. One fun place to see that is how NodeList.prototype.forEach === Array.prototype.forEach.)

But Array.prototype.filter must allocate (in the general case, at least), because it’s taking bits and pieces of the original array. So the array itself must allocate.

Array.prototype.map similarly must allocate (in the general case), because it’s creating new values.

Then, when we’re talking about allocation-counting, you have to bear in mind that, when the size is not known, you may make multiple allocations, growing the collection as you go.

Rust’s equivalent of Array, Vec, starts with a small allocation, which depends on the specific type being stored but we’ll simplify and call it 8, and then when you try to add beyond that capacity, reallocates, doubling the capacity. (This is the current growth strategy, but it’s not part of any contract, and can change.)

A naive implementation of JavaScript backed by such a growth strategy would make one exact-sized allocation for slice(), approximately log₂ N allocations for filter() where N is the number of retained elements, and one exact-sized allocation for map().

> > arr.values().drop(10).take(10).filter(el => el < 10).map(el => el + 5).toArray()

> Allocates once for .values[2], and again for .toArray[3].. there's decreased efficiency here.

It’s generally difficult talking about allocations in a GC language in details like this, but in the way you tend to talk about allocations in such systems, .values() can be assumed not to allocate. Especially once you get to what optimisers are likely to do. Or, at the very least, drop(), take(), filter() and map() all allocate just as much, as they also create iterator objects.

—⁂—

> > Swapping variables

> Only do this if you don't care about performance (the advice is written like using the array swap hack is categorically better).

My own hypothesis: any serious JS engine is going to recognise the [a, b] = [b, a] idiom and optimise it to be at least as good as the temporary variable hack. If you’re going to call the array swap a hack, I can call temporary variables a hack—it’s much more of a hack, far messier, especially semantically. The temporary variable thing will mildly resist optimisation for a couple of reasons, whereas [a, b] = [b, a] is neatly self-contained, doesn’t leak anything onto the stack, and can thus be optimised much more elegantly.

Now then the question is whether it is optimised so. And that’s the problem with categoric statements in a language like JavaScript: if you make arguments about fine performance things, they’re prone to change, because JavaScript performance is a teetering stack of flaming plates liable to come crashing down if you poke it in the wrong direction, which changes from moment to moment as the pile sways.

In practice, trivial not-very-careful benchmarking suggests that in Firefox array swap is probably a little slower, but in Chromium they’re equivalent (… both quite a bit slower than in Firefox).

2 comments

You're right.. all of these functions require more memory. Allocate is wrong... let's use shallow vs deep (significantly different expense). All pointers use a bit more memory than the original, shallow copies use more, but only the size of a primitive (best case) or pointer to an object (worst case.. often the same size), deep can use much, much more.

> arr.slice(10, 20).filter(el => el < 10).map(el => el + 5)

  slice  = shallow
  filter = shallow
  map    = deep
(2s+d)

As you point out, many engines optimise shallow with copy on write (zero cost).. so just 1 allocation at map.

> arr.values().drop(10).take(10).filter(el => el < 10).map(el => el + 5).toArray()

  values = deep
  drop   = shallow
  take   = shallow
  filter = shallow
  map    = deep
  toArray= deep
(3s+3d) .. significantly worse performance

Note the shallow copy:

  const arr=[{a:{b:"c"}},{d:"e"}];
  const [s]=arr.slice(0,1);
  s.a.b="f";
  console.log(arr);//=[a:{b:"f"},{d:"e"}]
You’re counting completely the wrong thing. Shallow versus deep is about the items inside, but we care about the costs of creating the collection itself. As far as structured clones are concerned, none of the operations we’re talking about are deep. At best, it’s just the wrong word to use. (Example: if you were going to call it anything, you’d call .map(x => x) shallow.)

Array:

• Array.prototype.slice may be expensive (it creates a collection, but it may be able to be done in such a way that you can’t tell).

• Array.prototype.filter is expensive (it creates a collection).

• Array.prototype.map is expensive (it creates a collection).

So you have two or three expensive operations, going through as much as the entire list (depends on how much you trim out with slice and filter) two or three times, creating an intermediate list at each step.

Iterator:

• Array.prototype.values is cheap, creating a lightweight iterator object.

• Iterator.prototype.drop is cheap, creating a lightweight iterator helper object.

• Iterator.prototype.take is cheap, creating a lightweight iterator helper object.

• Iterator.prototype.filter is cheap, creating a lightweight iterator helper object.

• Iterator.prototype.map is cheap, creating a lightweight iterator helper object.

• Iterator.prototype.toArray is the thing that actually drives everything. Now you drive the iterator chain through, going through the list only once, applying each filter or transformation as you go, and only doing one expensive allocation of a new array.

In the end, in terms of time complexity, both are O(n), but the array version has a much higher coefficient on that n. For small inputs, array may be faster. For large values, iterators will be faster.

> My own hypothesis: any serious JS engine is going to recognise the [a, b] = [b, a]

This. As the author of this blog I actually run a benchmark, the loop body was only doing swap, I remember the penalty was around ~2%. But yeah, if it's a critical path and you care about every millisecond, then sure, you should optimize for speed, not for code ergonomics.