Much Clojure code relies on applying nested transformations to sequences: (reduce + 0 (filter odd? (map inc (range 1000))))
Conceptually, this takes an input sequence of 1000 elements (range 100), then creates an intermediate sequence of (map inc), then creates an intermediate sequence of (filter odd?), then reduces that with +. (I am glossing over a number of internal optimizations - chunking and transients, but the idea holds.)Transducers let you represent the transformation parts as an independent (reusable) thing: (def xf (comp (map inc) (filter odd?)))
You then apply that composite transformation in a single pass over the input: (transduce xf + 0 (range 1000))
transduce combines (a) input iteration (b) transformation application and (c) what to do with the results - in this case apply + as a reduce. Other functions also exist that make different choices about how to combine these parts: into collects results into a collection, sequence can incrementally produce a result, eduction can delay execution, core.async chans can apply them in a queue-like way.There are a number of benefits here: 1. Composability - transformations are isolated from their input and output contexts and thus reusable in many contexts. 2. Performance - the sequence example above allocates two large intermediate sequences. The transduce version does no intermediate allocation. Additionally, some collection inputs are "internally reducible" (including range in 1.7) which allows them to be traversed more quickly via reduce/transduce contexts. The performance benefits of these changes can be dramatic as you increase the size of input data or number of transformations. If you happen to have your data in an input collection (like a vector), into with transducers can be used to move data directly from one vector to another without ever producing a sequence. 3. Eagerness - if you know you will process the entire input into the entire output, you can do so eagerly and avoid overhead. Or you can get laziness via sequence (although there are some differences in what that means with transducers). The upshot is you now have more choices. 4. Resource management - because you have the ability to eagerly process an external input source, you have more knowledge about when it's "done" so you can release the resource. |