by icey
4013 days ago
It would be great to see a refactor of some code using transducers to get a better sense of what they're useful for. From an abstraction standpoint, I can see the attraction; but I am having a tough time imagining how it would improve code in practice.
Transducers let you represent the transformation parts as an independent (reusable) thing:
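As a rough sketch of that idea, assume a pipeline that increments each number and keeps the odd results. The particular steps, the names seq-sum, xf and xf-sum, and the input range are illustrative assumptions, not necessarily what the original snippet contained:

```clojure
;; Sequence-based version: map and filter each produce an
;; intermediate lazy sequence before reduce sums the result.
(def seq-sum
  (reduce + 0 (filter odd? (map inc (range 1000)))))

;; Transducer version: the transformation is a standalone value,
;; independent of any input or output (xf is an assumed name).
(def xf (comp (map inc) (filter odd?)))

;; Apply it in a single pass, with + as the reducing function.
(def xf-sum (transduce xf + 0 (range 1000)))

(println seq-sum xf-sum) ; both are 250000
```

Note that comp applies transducers left to right here, so (map inc) runs before (filter odd?), matching the order of the nested sequence calls read from the inside out.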
You then apply that composite transformation in a single pass over the input. The transduce function combines (a) input iteration, (b) transformation application, and (c) what to do with the results, in this case applying + as a reduce. Other functions make different choices about how to combine these parts: into collects results into a collection, sequence can incrementally produce a result, eduction can delay execution, and core.async chans can apply them in a queue-like way.

There are a number of benefits here:
1. Composability - transformations are isolated from their input and output contexts and thus reusable in many contexts.
2. Performance - the sequence example above allocates two large intermediate sequences. The transduce version does no intermediate allocation. Additionally, some collection inputs are "internally reducible" (including range as of Clojure 1.7), which allows them to be traversed more quickly in reduce/transduce contexts. The performance benefits of these changes can be dramatic as you increase the size of the input data or the number of transformations. If you happen to have your data in an input collection (like a vector), into with transducers can be used to move data directly from one vector to another without ever producing a sequence.
3. Eagerness - if you know you will process the entire input into the entire output, you can do so eagerly and avoid overhead. Or you can get laziness via sequence (although there are some differences in what that means with transducers). The upshot is you now have more choices.
4. Resource management - because you have the ability to eagerly process an external input source, you have more knowledge about when it's "done" so you can release the resource.
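
To make the points above a little more concrete, here is a minimal sketch of one transformation reused across several contexts. The pipeline, the var names, and the in-memory StringReader (standing in for a real external source such as a file) are all assumptions chosen for illustration:

```clojure
;; One reusable transformation (assumed example: increment, keep odds).
(def xf (comp (map inc) (filter odd?)))

;; into: eager, vector to vector, no intermediate sequence.
(def as-vector (into [] xf [1 2 3 4 5 6]))     ; [3 5 7]

;; sequence: produces its result incrementally.
(def as-seq (sequence xf (range 10)))          ; (1 3 5 7 9)

;; eduction: nothing runs until the eduction is reduced or iterated.
(def delayed (eduction xf (range 10)))
(def delayed-sum (reduce + 0 delayed))         ; 25

;; Eager processing of an external source: into finishes reading
;; before with-open closes the reader, so nothing lazy escapes.
(def line-lengths
  (with-open [r (java.io.BufferedReader.
                 (java.io.StringReader. "a\nbb\nccc"))]
    (into [] (map count) (line-seq r))))       ; [1 2 3]
```

The last form is the resource-management case: because into consumes every line before returning, the reader can be closed safely as soon as the form exits.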