by kccqzy 54 days ago
That does not sound like a good example. The two-argument form of `map` already returns a lazy sequence. Same for `filter`. I thought lazy sequences are already supposed to get rid of the performance problem of materializing the entire collection. So

Lazy sequences reduce the size of intermediate collections, but they “chunk”: you get 32 items at a time. Multiply that by however many transformations you have and, of course, by the size of the items.
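
To make the chunking concrete, here is a minimal sketch. It assumes a chunked source such as `(range 100)` on a current Clojure version; `chunk-calls` and `counting-inc` are hypothetical names used only for illustration.

```clojure
;; Counts how many times the mapping function actually runs.
(def chunk-calls (atom 0))

(defn counting-inc
  "Hypothetical helper: increments x and records that it was called."
  [x]
  (swap! chunk-calls inc)
  (inc x))

;; Ask for only the first element of a lazy map over a chunked source.
(def first-result (first (map counting-inc (range 100))))

;; first-result is 1, but @chunk-calls is 32:
;; the whole first chunk was realized to produce one element.
```

With several lazy steps stacked on top of each other, each step holds its own chunk of intermediate results.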

There are some additional inefficiencies in terms of context capturing at each lazy transformation point. The problem gets worse outside of a tidy immediate set of transformations like you’ll see in any example.

This article gives a good overview of the inefficiencies; search for “thunk” for the tl;dr: https://clojure-goes-fast.com/blog/clojures-deadly-sin/ That said, I don’t agree with its near condemnation of the whole lazy pattern. Laziness is quite useful: we can complain about it because we have it, and it would suck if we didn’t.

So what’s your coding style in Clojure? Do you eschew lazy sequences as much as possible and only use either non-lazy manipulation functions like mapv or transducers?

I liked using lazy sequences because they’re more amenable to breaking larger functions into smaller ones, and they decrease coupling. One part of my program uses map, and a distant part of it uses filter on the result of the map. With transducers it seems like the way to do that is eductions, but I avoided them because each time an eduction is used it reevaluates each item, so it’s sacrificing time for less space, which is not usually what I want.
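
A small sketch of that trade-off, using hypothetical counters (`ed-evals`, `lazy-evals`) to show how often the mapping function runs when the same value is consumed twice:

```clojure
;; An eduction re-runs its transducer every time it is reduced.
(def ed-evals (atom 0))
(def squares-ed
  (eduction (map (fn [x] (swap! ed-evals inc) (* x x)))
            [1 2 3]))

(def ed-pass-1 (into [] squares-ed))
(def ed-pass-2 (into [] squares-ed))
;; @ed-evals is now 6: three items, evaluated on each of two passes.

;; A lazy sequence caches realized items, so a second pass is free.
(def lazy-evals (atom 0))
(def squares-lazy
  (map (fn [x] (swap! lazy-evals inc) (* x x)) [1 2 3]))

(def lazy-pass-1 (into [] squares-lazy))
(def lazy-pass-2 (into [] squares-lazy))
;; @lazy-evals is 3: each item was computed once and then retained.
```

So the eduction keeps nothing in memory between passes, at the cost of redoing the work.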

I should add that I almost always write my code with lazy sequences first because it’s intuitive. Then maybe one time out of five I re-read my code after it’s done and realize I could refactor it to use transduce. I don’t think I’ve ever used eduction at all.

It's evolving, and I'm using transducers more over time, but I'm still regularly in situations where a simple map or mapv is all I need.

Lazy sequences can be a good fit for a lot of use cases. For example, I have some scenarios where I'm selecting from a web page DOM and most of the time I only want the first match, but sometimes I want them all. Laziness is great there. Or walking directories in a certain order, where the number of items they contain varies, so I don't know how many I'll need to walk but I know it's usually a small fraction of the total. Laziness is great there too.

This can still work with transducers - you can either pass a lazy thing in as the coll to an eager transducing context (maybe with a "take n" along the way) or use the "sequence" transducing context which is lazy.
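
Roughly, the two options look like this. It is only a sketch: the transducer `first-three-even` is a made-up example, and `(range)` stands in for any lazy, possibly unbounded source.

```clojure
;; One transducer, reused in two different transducing contexts.
(def first-three-even
  (comp (map inc)
        (filter even?)
        (take 3)))

;; Eager context over a lazy, infinite input: `take` ends the
;; reduction early, so only as much input as needed is consumed.
(def eager-result (into [] first-three-even (range)))
;; => [2 4 6]

;; Lazy context: `sequence` returns a lazy seq driven by the transducer.
(def lazy-result (sequence first-three-even (range)))
;; => (2 4 6)
```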

I tend to reach for transducers in places in my code where I'm combining multiple collection transformations, usually with literal map/filter/take/whatever right there in the code. Easy wins.

Recently I've started building more functions that return either transducers or eductions (depending on whether I want to "set" / couple in the base collection, which is what eduction is good for) so I can compose disparate functions at different points in the code and combine them efficiently. I did this in the context of a web pipeline, where I was chaining a request through different functions to come up with a response. Passing an eduction along, I could just nest it inside other eductions when I wanted to add transducers, then realize the whole thing at the end with an into and render.
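
A stripped-down sketch of that shape, not the actual web pipeline: `load-items`, `only-large`, and `as-labels` are hypothetical functions, and the data is invented.

```clojure
;; Returns an eduction: the base collection is coupled in here.
(defn load-items []
  (eduction (map #(* % 10)) (range 1 6)))

;; These return plain transducers, with no collection attached.
(defn only-large [] (filter #(> % 20)))
(defn as-labels  [] (map #(str "item-" %)))

;; Elsewhere in the code: nest the eduction inside another eduction
;; to add more steps. Nothing has been computed yet.
(def pipeline
  (eduction (only-large) (as-labels) (load-items)))

;; Realize the whole thing once, at the end.
(def rendered (into [] pipeline))
;; => ["item-30" "item-40" "item-50"]
```

No intermediate collection is built between the steps; the work happens in the single `into` at the end.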

Mentally it took me some time to wrap my head around transducers and when and how to use them, so I'm still figuring it out, but I could see myself ending up using them for most things. Rich Hickey, who created Clojure, has said that if he had thought of them near the beginning he'd have built the whole language around them. But I don't worry about it too much. I mostly just want to get sh-t done, and I use them when I can see the opportunity to do so.

This, by the way, is why the lead example in the original linked post on clojure.org is very much like mine.