Hacker News new | ask | show | jobs
by bjoli 2005 days ago
I doubt it will bring much. If properly implemented, there is nothing that makes generator-like laziness slower than transducers, and since it is pretty central to clojure I doubt you will see much speed gain by using transducers.

In scheme, the srfi-158 based generators are slower than my own transducer SRFI (srfi-171) only in schemes where set! incurs a boxing penalty and where the eagerness of transducers means mutable state can be avoided.

Now, I know very little clojure, but I doubt they would leave such a relatively trivial optimization on the table. A step in a transducer is just a procedure call, which is the same for trivial generator-based laziness.

2 comments

When I was reading the article, I thought the author of the post was probably pointing more in the direction of Clojure's immutable data being slower, rather than laziness specifically.

IME (admittedly in a different context, doing UI development) Clojure's seqs and other immutable data can be a huge performance drag due to additional allocations needed. If you're in a hot loop where you're creating and realizing a sequence immediately, it's probably much faster to bang on a transient vector. Same with creating a bunch of immutable hash maps that you then throw away; better to create a simpler data structure (e.g. a POJO or Map) which doesn't handle complicated structural sharing patterns if it's just going to be thrown away.

Transducer's would help in the author's first case to take the map/filter piped through the `->>`, which is going to do two separate passes and realize two seqs, and combine it into one.

I stand by what I said (even though I am partial to transducers): there is no reason for lazy sequences overhead to be much more than a procedure call, which is exactly what a step in the reduction in the case of transducers is. At least when implemented as in clojure or srfi-171.

I understand that there might be some overhead to creating new sequence objects, but removing intermediate sequence object creation should be simple, at least for built in map-filter-take-et-al.

Edit: I seem to be wrong. People are recommending transducers over clojure's lazy map-filter-take-et-al because of overhead, which to me seems off, but there might be something about clojure's lazy sequences that I haven't grooked.

I probably didn't make it any more clear in my reply. Transducers don't win based on allocations IME but because it removes the number of iterations required.

Take the case: (->> (for ,,,) (map ,,,) (filter ,,,))

`for`, `map` and `filter` will all create lazy sequences, holding a reference to the previous one. The problem here is when the `filter` seq gets realized, it will first realize the `map` seq, which will realize the `for` seq. Each sequence will wait for the next one to realize before iterating over it. So in this case it will iterate twice; once for the `map`, and then again for the `filter`.

As you know, transducers combine these steps together so that you only iterate over the initial collection once.

My other comment was making the point that the author has conflated "laziness" with "immutable data" AFAICT. The lazy seqs in the first example they give will be slower w/o transducers, but the other problem is that the overhead from all the allocations required for creating a bunch of immutable hash maps that are then destroyed immediately after is also non-negligible, and seems to be a source of the authors performance problems.

I think what bjoli is getting at is that in an ideal world with lazy sequences you don't have any iteration over any of the intermediate "collections" until some step occurs that requires the entire collection. I put collections in quotes because they aren't really collections, they're generators that produce an element on demand.

So you never really iterate over more than one collection; you only have one iteration where you successively ask each generator to produce a single element and then apply a function to it. For example, if you only asked for the first element of a lazy sequence formed by a series of maps, you would (in theory, in practice see my note about chunks) never iterate over any of the intermediate sequences and only ever examine the first element of each of them.

However, the act of asking a generator to produce an element (that is unwrapping a thunk) has overhead of its own and that's the overhead that a transducer would be removing (not iteration itself in the case of lazy sequences). This can have far more overhead than a procedure call because of bad cache locality (in the absence of fusion optimizations you're pointer chasing and potentially generating garbage for each thunk). Clojure tries to get around that by often (but not always) chunking its thunks, in which case we do have multiple rounds of iteration on smaller chunks, but never the entire sequence (unless the entire sequence is smaller than a chunk).

What I am saying is that lazy sequences in my world should mean you don't have to realize any intermediate collections. In the case of the srfi-158 generator (gmap square (gfilter odd? (list-generator big-list))) the overhead for getting one element from the generator would be 3 procedure calls. Without any intermediate steps. The same transducer would have one procedure call less, but would be in the same vicinity.

Does clojure's sequences not work similarly? That seems like a very lax definiti

Sounds like I was mistaken. What I was seeing in my tests was due to chunking, not realizing the whole seq.
I mean Clojure is still running on JVM so there will be at least that difference. AFAIK Clojure is slower than Java so there's that also.
Yes and no. There's real production cases where even default Clojure can result in the same or faster performance than Java. One hypothetical case is a system that does a lot of complex in-memory reads and a few writes to persistent data structures. That kinda of system could be faster than the Java equivalent, out of the box, in Clojure.

A write-heavy system would benefit Java mutable collections out of the box. Clojure can get pretty close to Java with transients and a good dose of type hinting in all the right places.

When we say "faster" or "slower" it's equally important to specify "faster" or "slower" when and where. It's a complex question with no easy answer.

> One hypothetical case is a system that does a lot of complex in-memory reads and a few writes to persistent data structures. That kinda of system could be faster than the Java equivalent, out of the box, in Clojure.

Why would Java be slower in a read-mostly regime? Your hypothetical is not convincing. Btw, you mention "real" and then move on to "hypothetical" as an example.

Are there actually OSS "production" cases of this subset of systems where Java lags behind Clojure in performance?

> When we say "faster" or "slower" it's equally important to specify "faster" or "slower" when and where. It's a complex question with no easy answer.

These sort of subtle distinctions only matter to langauge wars and debates like this. For actuals systems that need to do work and need to be maintained, we can in fact have metrics on efficiency.

That said, fundamentally, Java affords much greater facilities to "optimize" and approach white hot performance than Clojure.

> Why would Java be slower in a read-mostly regime? Your hypothetical is not convincing. Btw, you mention "real" and then move on to "hypothetical" as an example.

Because of Clojure's default persistent in-memory data structures which allow you to obtain a stable reference to your in-memory data even when the data is being updated live. With Java's default mutable data structures, you'd have to use locking or copying to obtain a stable reference, not to mention the huge complexity of those solutions. I said real because I've built similar solutions in Clojure. Substitute "hypothetical" with "example", sorry for the word confusion.

> That said, fundamentally, Java affords much greater facilities to "optimize" and approach white hot performance than Clojure.

Taken to the extreme, the resulting Clojure would not be idiomatic, it would be effectively Java-in-Clojure, but Clojure has all the facilities that Java has, by definition.

In read-mostly regimes, the performance considerations are typically systems-level consideration. Data locality, page swaps, and cache-line misses are typical concerns of high performance systems engineering.

Regarding "complexity", not sure what you mean. IIRC, someone, possibly even me ;), may have pointed out to Rich Hickey in the early days that Java code could also use those (Java or was it Scala) STM libraries. So, there is your "complexity" behind an API, just like Clojure.

> Clojure has all the facilities that Java has, by definition.

Ergo, a situation where Java can -not- be made as fast or faster than Clojure code seems unlikely.

I think all we have here is you claim of "complexity". I remember Rob Pike making a comment along the lines of "prefer libraries over language semantics" or something like that in context of Go language design. I find it a very compelling argument, based on my overall experience to date.