Hacker News new | ask | show | jobs
by joshlemer 1009 days ago
I think there's a big difference between transducers work and how other lazy stream libraries work. Not too familiar with LINQ but more with Java streams and Scala views and iterators. Usually these libraries wherever the work is forced, like in a final call of `.toList()` or whatnot, they are materializing something like a stack of iterators. Each iterator calls its child/children iterators and the pipeline logic is in the stack of `.next()` calls.

The difference with transducers is that the transformation logic is completely decoupled from the fact that this is happening in collections. That is what makes them usable outside of the context of collections, most famously in core.async where you can map/filter/mapcat/flatten/take/drop channels just like you do with collections. This only works because transducers are completely decoupled from either the fact that elements are coming from, or going into, a collection. I think it is a really fascinating and creative achievement in decoupling. Java Streams and Scala view/iterator/iterable transformations can never be reused in other contexts since they are fundamentally about the collections. Whatever kind of Observable/Async Stream/Future or other places where you might want map/filter/reduce functionality has to reimplement the whole set of these operations anew (see for example Akka Streams, RxJava)

1 comments

I think these languages use “Collection” as another way of saying “iterable”. So the only thing such an object needs to provide is a “next()”. So a stack of lambdas/generators all working on a “next()” seem to be doing the same “per element” work of a Clojure transducer. Am I misunderstanding?
The difference is in the way that the transformations compose.

Iterators are generic over some iterable type `U` in `E, R, T: Iterator<E, R, U|T>, U: Iterable<E>` that they either directly or indirectly capture some nested Iterator.

However that means that code composing iterators must also be generic over that iterable `U`, so you can't simply take two transformation pipelines and concatenate them, because they will both provide an element source already.

Transducers separate the transformation part and the processing part. So you can have a `E, R, T: Transform<E, R>` which doesn't have a generic type parameter for the Iterable, and a function `transduce<E, R, S: Source<E>, I: Sink<R>, T: Transform<E, R>>(source: S, tx: T) -> I` which contains all the transformation application machinery, be it iterator style `next()` operations, or stream based `pull/push` operations, and a function `comp<E, R, S>(left: Transform<E, R>, right: Transform<R, S>) -> Transform<E, S>` that composes transformations. The way transformers are implemented, this `comp` operation is actually simply function composition.

Oh I see, so with transducers, you can use the same transform pipeline (stack of functions) across different composition contexts (iterable next(), function composition, channel transforming, etc). I see the utility now!

Still I think just the “next()” context is quite enough because you can have everyone use it by convention for everything, even things that aren’t collections. Like a fluent tensor library for example. This is based on a quick understanding of transducers…

Yes, but don't forget that you can't compose `next` based iterators. That might be a small limitation, but it means that you can't do something like.

  ```js
  let pipelineA = map(i => i+1); // or makeComplexPipelineA();
  let pipelineB = filter(i => i.isEven()); // makeComplexPipelineB();
  
  let pipeline = pipelineA.combine(pipelineB);
  
  let result = [1,2,3,4,5].transform(pipeline);
  ```
I see, thank you! I hadn’t realized the hidden limitation in the usual iterator/generator/streams concepts in modern languages. It looks like the thing Linq and Java Streams have in common is that they’ve been added later, with a pragmatic focus on collections. They “bake in” iterator/next based composition. If you start from a more principled functional foundation, the limitation probably seems more jarring (why fix one “template parameter”?)

It also reinforces my view that you should try to separate logic and control flow as much as possible. At work, we use awful callback chains. There must be a better way to express the logic and hide the callback logic, even in C++ (lots of rope to … find alternative solutions.)