Hacker News new | ask | show | jobs
by tibbe 3614 days ago
Just sticking with the pure types there's currently no generic stream model that works well. No stream fusion system fuses all cases (even in theory) and they also fail to fuse the cases they're supposed to handle too often in practice.

I haven't looked at pipes, but I'm guessing it doesn't all fuse away either.

1 comments

You're right, I believe Haskell's fusion framework could be greatly improved (although it is the best production solution I'm aware of). However, how would you go about solving this? I don't think there's any generalized solution to the problem of creating no-overhead iteration from higher-level iterative combinators.
> Haskell's fusion framework could be greatly improved (although it is the best production solution I'm aware of). However, how would you go about solving this?

Given that we're in a rust thread... are you familiar with rust's iterator fusion [0]? Basically there are three components: iterators (something like a source), iterator adapters (where all manipulations happen), and consumers (something like a sink). LLVM will compile all the iterator adapters into a single manipulation such that the underlying stream/vector/whatever only goes through it once.

I personally like it much better than Haskell's. With rust the fusion is guaranteed to happen, although it makes the types a little verbose and tricky to work with, but with, e.g., Haskell's Text's stream fusion I was never really sure that it was working, or if I could do something to prevent it. It seems like in Haskell it's more of a behind the scenes optimization that you hope kicks in, rather than designed into the types. Or do I misunderstand? I only dabbled in Haskell.

[0] https://doc.rust-lang.org/book/iterators.html

Yes, I have used Rust a bit. Basically the primary difference is (and correct me if I'm wrong) you can't re-use an iterator in Rust without cloning it. On the other hand, you can use a Haskell pure stream object as many times as you want (without explicit cloning, because "draining" an iterator is stateful), so fusion becomes a bit of a more complicated problem.

If I had some Haskell code that was like

map f . map g . filter x . map y $ stream

It would almost certainly get fused into a single low-level loop without extraneous allocations. However, I can also do something like

foo = map y $ stream

bar = map f . map g . filter x $ foo

baz = map z $ foo

And now what do you do?

Haskell's fusion is also more general, because it allows you to do pretty much arbitrary syntactic transformations.

Unfortunately, this means it's somewhat fragile and is easy to prevent from functioning. Rust can guarantee fusion because you're restricted in the kinds of things you can do with iterators.

On the other hand, Haskell's Pipes restrict you from doing things like re-using an iterator, and I'm not sure what the optimization story is there.

That's one of the ideas behind transducers:

https://github.com/matthiasn/talk-transcripts/blob/master/Hi...

And they work! It's not stream fusion, but the composed functions being applied to whatever container or stream of values are applied per-value, so (map (comp xf1 xf2)) applied to [1 2 3] applies (xf2 (xf1 1)), (xf2 (xf1 2)), and so on, with similar allocation savings to stream fusion.