by ufo 3871 days ago
> How would you do that with map and filter functions without looping over the entire (inexhaustible) sequence?

But isn't this just a matter of defining map and filter over a lazy stream datatype (aka iterators) instead of over lists?

So the workflow would be

   list -> stream -> filter1 -> filter2 -> list
This would let you run the filters "in parallel" without iterating through things twice.
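As a rough illustration of that workflow, here is a minimal Python 3 sketch that uses generator functions as the stream stage. The names `lazy_filter`, `is_even`, and `is_big` are made up for the example. The point is only that each element is pulled through both filters once, and nothing runs until the final `list()` call.

```python
def lazy_filter(predicate, stream):
    """Yield the items of `stream` that satisfy `predicate`, one at a time."""
    for item in stream:
        if predicate(item):
            yield item

# Hypothetical predicates, chosen only for illustration.
def is_even(n):
    return n % 2 == 0

def is_big(n):
    return n > 4

data = [1, 2, 3, 4, 5, 6, 7, 8]

stream = iter(data)                     # list -> stream
stage1 = lazy_filter(is_even, stream)   # filter1 (no work done yet)
stage2 = lazy_filter(is_big, stage1)    # filter2 (still no work done)
result = list(stage2)                   # stream -> list; iteration happens here
```

No intermediate list is built between the two filters; `data` is walked a single time.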
1 comments

Python's map and filter already run over iterators (in Python 3 they accept any iterable and return a lazy iterator). If you're talking about creating filters that are composable, so that you could do

  filter1 = filter(predicate1)   # curried filter
  filter2 = filter(predicate2)   # curried filter
  pipeline = compose(filter1, filter2)
  pipeline(<generator of some kind>)
and make it so that the sequence of operations will be

  for el in some_generator:
    if predicate1(el) and predicate2(el): <accumulate value>
Then I would say defining them in this way is useful and important--transducers are exactly one way of doing this! A critical aspect I didn't go into detail on is the idea of a take function. In the GitHub repo, T.take(3) is the portion that allows the transducer to operate on an infinite stream of values.
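For comparison, the curried-filter pipeline can be sketched in plain Python 3 without transducers. This is only one possible reading of the pseudocode: `curried_filter` and `compose` are hypothetical helpers (the built-in `filter` is not curried), and the left-to-right order of `compose` is an assumption.

```python
from functools import reduce

def curried_filter(predicate):
    """Return a function that lazily filters any iterable with `predicate`."""
    def apply(iterable):
        return (item for item in iterable if predicate(item))
    return apply

def compose(*stages):
    """Chain stages left to right: compose(f, g)(x) == g(f(x))."""
    def pipeline(iterable):
        return reduce(lambda acc, stage: stage(acc), stages, iterable)
    return pipeline

filter1 = curried_filter(lambda n: n % 2 == 0)   # keep even numbers
filter2 = curried_filter(lambda n: n % 3 == 0)   # keep multiples of three
pipeline = compose(filter1, filter2)

multiples_of_six = list(pipeline(range(20)))
```

Because each stage is a generator, an element reaches the second predicate only if it passed the first, which is the same order of work as the single loop with `predicate1(el) and predicate2(el)`. `functools.partial(filter, predicate)` would be a standard-library way to get the same curried shape.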

This is the piece your workflow example would need to take into account. How could I apply a filter followed by something that takes 3 passing values from an infinite sequence? I'm sure you could come up with a way, and it would be worth comparing to the transducer approach :).

(It's worth noting that currying map, filter, etc. is very complementary to the transducer approach.)

If you are wondering how to make a lazy take() in Python, here is one solution:

    def take(num):
        def gen(iterable):
            if num <= 0:
                return
            for i, item in enumerate(iterable, 1):
                yield item
                if i == num:   # stop before pulling another item from the source
                    break
        return gen
and then

    list(take(3)(range(100))) == [0, 1, 2]
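
To tie this back to the earlier question about filtering an infinite sequence and keeping 3 passing values, here is a small Python 3 sketch. It is not the transducer approach; it assumes a curried `take` built on the standard library's `itertools.islice`, and `is_multiple_of_seven` is a made-up predicate.

```python
from itertools import count, islice

def take(num):
    """Curried take: returns a function yielding at most `num` items."""
    def gen(iterable):
        return islice(iterable, num)
    return gen

def is_multiple_of_seven(n):   # hypothetical predicate, for illustration only
    return n % 7 == 0

naturals = count(1)                                # 1, 2, 3, ... never ends
passing = filter(is_multiple_of_seven, naturals)   # lazy in Python 3
first_three = list(take(3)(passing))
```

`islice` stops after the third item without requesting a fourth, which matters on an infinite source where the next passing value may never arrive.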