Hacker News new | ask | show | jobs
by deredede 689 days ago
I mostly agree with your sentiment but this:

> Well, no, this is straight confusion between what’s expressed by the program and what’s compiled. The idiomatic code in Ocaml will end up generating machine code which is as performant than using mutable array.

I disagree with. There are different ways to get close to the performance of `Array.map` with lists (best case scenario you don't care about order and can use `List.rev_map`), but you will always have memory (and GC) overhead and so lists are strictly inferior to arrays for the presented use case.

1 comments

That’s not what the article is talking about. The proposed exemple is a traversal of a different data structure to collect results in an array. That’s a fold and will properly be tco-ed to something equivalent to adding to an array if you use list cons in the aggregation, might actually be better depending on how much resizing of the array you have to do while traversing.
Works if you are building one list, but what if you are building multiple? What's suggested on the OCaml site and what's taught in most of academia is to use a recursive function with accumulator arguments that are reversed before returning to make the function tco:able. I doubt OCaml can optimize that pattern well, but idk.
You can benefit from TCO while building multiple lists.

  let rec f evens odds = function
    | [] -> (evens, odds)
    | x :: xs  ->
      if x mod 2 = 0 then f (x :: evens) odds xs
      else f evens (x :: odds) xs
OCaml optimizes this fairly well and will compil it down to a single loop with two variables. If you reverse the lists there is going to be additional loops (and corresponding allocations).

However OCaml also provides the "tail mod cons" optimization that allows to get some of the benefits of TCO without needing to reverse the list (this is implemented as a program transformation that uses mutability under the hood), and that one will only work if you are building a single list.

I think HN ate some characters because that code doesn't look valid. But yeah, that's how you do it. In my opinion it is not pretty (e.g., what if you have some mutable context?). I also don't see how OCaml could turn the cons operations into dynamic array appends.
> I think HN ate some characters because that code doesn't look valid.

The OCaml compiler disagrees with you ;)

It also won't ever turn a cons operation into a dynamic array append.

I think `Array.map` is a perfectly reasonable reading of "you're iterating over some structure and collecting your results in a sequence".

But sure, in the `fold` scenario where you don't know the number of results in advance (you are more likely to know if you use imperative data structures, e.g. `Hashtbl.length` is constant-time whereas `Map.cardinal` is not), lists might be faster than growing arrays with copies. They are still going to use more memory, and they are unlikely to to be faster than a rope-like structure that grows with no copies.

It isn’t. There’s no guarantee that .map will be processed in sequence. In fact, .map is usually a great candidate for parallelization.
The "sequence" in the problem statement does not refer to the order of operations but to the data structure storing the results.

A parallel `Array.map` still computes a sequence, even though it may not compute in sequence.