Hacker News
by HiPhish 235 days ago
> email.bulkSend(generateExpiryEmails(getExpiredUsers(db.getUsers(), Date.now())));

What makes it hard to reason about is that your code is one-dimensional: you have functions like `getExpiredUsers` and `generateExpiryEmails` which could be expressed as compositions of more general functions. Here is how I would have written it in JavaScript:

    const emails = db.getUsers()
        .filter(user => user.isExpired(Date.now()))  // Some property every user has
        .map(generateExpiryEmail);  // Maps a single user to a message

    email.bulkSend(emails);
The idea is that you have small but general functions, methods and properties and then use higher-order functions and methods to compose them on the fly. This makes the code two-dimensional. The outer dimension (`filter` and `map`) tells the reader what is done (take all users, pick out only some, then turn each one into something else) while the inner dimension (`isExpired` and `generateExpiryEmail`) tells you how it is done. Note that there is no function `getExpiredUsers` that receives all users; instead there is a simpler and more general `isExpired` method which is combined with `filter` to get the same result.

In a functional language with pipes it could be written in an arguably even more elegant design:

    db.getUsers() |> filter(User.isExpired(Date.now())) |> map(generateExpiryEmail) |> email.bulkSend
I also like Python's generator expressions which can express `map` and `filter` as a single expression:

    email.bulk_send(generate_expiry_email(user) for user in db.get_users() if user.is_expired(datetime.now()))
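
For anyone who wants to run the idea, here is a minimal self-contained Python sketch. The `User` class, its `expires_at` field, `is_expired` and `generate_expiry_email` are all made up for illustration, and a plain list stands in for `db.get_users()`; sending is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class User:  # hypothetical user record
    name: str
    expires_at: datetime

    def is_expired(self, now: datetime) -> bool:
        # The small, general property that `filter` (or `if`) composes with
        return self.expires_at <= now


def generate_expiry_email(user: User) -> str:
    # Maps a single user to a message (hypothetical wording)
    return f"Hi {user.name}, your account has expired."


now = datetime(2024, 1, 1)  # fixed date so the result is reproducible
users = [
    User("alice", datetime(2023, 6, 1)),
    User("bob", datetime(2025, 6, 1)),
]

# Filter and map in one generator expression, materialized as a list
emails = list(generate_expiry_email(u) for u in users if u.is_expired(now))
```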

I guess I just never encounter code like this in the big enterprise code bases I have had to weed through.

Question. If you want to do one email for expired users and another for non expired users and another email for users that somehow have a date problem in their data....

Do you just do the `const emails =` three different times?

In my coding world it looks a lot like doing a `SELECT * FROM users WHERE isExpired < Date.now`

but in some cases you just grab it all, loop through it all, and do little switches to do different things based on different isExpired.

> If you want to do one email for expired users and another for non expired users and another email for users that somehow have a date problem in their data....
Well, in that case you wouldn't want to pipe them all through generateExpiryEmail.

But perhaps you can write a more generic function like generateExpiryEmailOrWhatever that understands the user object and contains the logic for what type of email to draft. It might need to output some flag if, for a particular user, there is no need to send an email. Then you could add a filter before the final (send) step.
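
That suggestion could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the `User` fields, the `draft_email` name, the message texts, and the choice of `None` as the "no email needed" flag.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class User:  # hypothetical user record
    name: str
    expires_at: Optional[datetime]  # None models a "date problem" in the data
    opted_out: bool = False


def draft_email(user: User, now: datetime) -> Optional[str]:
    """Return the email text for a user, or None if nothing should be sent."""
    if user.opted_out:
        return None  # the "no need to send an email" flag
    if user.expires_at is None:
        return f"{user.name}: please fix your expiry date"
    if user.expires_at <= now:
        return f"{user.name}: your account has expired"
    return f"{user.name}: your account is still active"


now = datetime(2024, 1, 1)  # fixed date so the result is reproducible
users = [
    User("alice", datetime(2023, 6, 1)),
    User("bob", datetime(2025, 6, 1)),
    User("carol", None),
    User("dave", datetime(2023, 6, 1), opted_out=True),
]

drafts = (draft_email(user, now) for user in users)
# The filter before the final (send) step: drop users flagged as "no email"
emails = [draft for draft in drafts if draft is not None]
```

The pipeline shape stays the same as before (map, then filter, then send); only the per-user function knows about the different kinds of email.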

Since we're just making up functions...

    myCoolSubroutine = do
      now <- getCurrentTime
      users <- getUsers
      forM users (sendEmail now)

    sendEmail now user =
      if user.expiry <= now
        then sendExpiryEmail user
        else sendNonExpiryEmail user
The whole pipeline thing is a red herring IMO.
What language is this?
Looks like Haskell
> Question. If you want to do one email for expired users and another for non expired users and another email for users that somehow have a date problem in their data....
>
> Do you just do the const emails =
>
> three different times?

If it's just two or three cases I might actually just copy-paste the entire thing. But let's assume we have twenty or so cases. I'll use Python notation because that's what I'm most familiar with. When I write `Callable[[T, U], V]` that means `(T, U) -> V`.

Let's first process one user at a time. We can define an enumeration for all our possible categories of user. Let's call this enumeration `UserCategory`. Then we can define a "categorization function" type which maps a user to its category:

    type UserCategorization = Callable[[User], UserCategory]
I can then map each user to a tuple of category and user:

    categorized_users = ((categorize(user), user) for user in db.get_users())  # type Iterable[tuple[UserCategory, User]]
Now I need a mapping from user category to processing function. I'll assume we call the processing function for side effects only and that it has no return value (`None` in Python):

    type ProcessingSpec = Mapping[UserCategory, Callable[[User], None]]
This mapping uses the user category to look up a function to apply to a user. We can now put it all together: map each user to a pair of the user's category and the user, then for each pair use the mapping to look up the processing function:

    def process_users(how: ProcessingSpec, categorize: UserCategorization) -> None:
        categorized_users = ((categorize(user), user) for user in db.get_users())
        for category, user in categorized_users:
            process = how[category]
            process(user)
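
A runnable version of the one-at-a-time approach might look like the sketch below. The `UserCategory` members, the `User` fields, the `categorize` logic and the handlers are invented for illustration, and the users are passed in as an argument instead of coming from `db.get_users()`. Plain alias assignments are used instead of the `type` statement so that it also runs on Python versions before 3.12.

```python
from collections.abc import Callable, Iterable, Mapping
from dataclasses import dataclass
from datetime import datetime
from enum import Enum, auto
from typing import Optional


class UserCategory(Enum):  # hypothetical categories
    EXPIRED = auto()
    ACTIVE = auto()
    BAD_DATE = auto()


@dataclass
class User:  # hypothetical user record
    name: str
    expires_at: Optional[datetime]


UserCategorization = Callable[[User], UserCategory]
ProcessingSpec = Mapping[UserCategory, Callable[[User], None]]

NOW = datetime(2024, 1, 1)  # fixed date so the example is reproducible


def categorize(user: User) -> UserCategory:
    if user.expires_at is None:
        return UserCategory.BAD_DATE
    if user.expires_at <= NOW:
        return UserCategory.EXPIRED
    return UserCategory.ACTIVE


def process_users(users: Iterable[User], how: ProcessingSpec,
                  categorize: UserCategorization) -> None:
    # Pair each user with its category, then dispatch on the category
    categorized_users = ((categorize(user), user) for user in users)
    for category, user in categorized_users:
        process = how[category]
        process(user)


sent: list[str] = []  # stands in for an outbox
how: ProcessingSpec = {
    UserCategory.EXPIRED: lambda u: sent.append(f"expired: {u.name}"),
    UserCategory.ACTIVE: lambda u: sent.append(f"active: {u.name}"),
    UserCategory.BAD_DATE: lambda u: sent.append(f"bad date: {u.name}"),
}
users = [
    User("alice", datetime(2023, 6, 1)),
    User("bob", datetime(2025, 6, 1)),
    User("carol", None),
]
process_users(users, how, categorize)
```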
OK, that's processing one user at a time, but what if we want to process users in batches? Meaning I want to get all expired users first, and then send a message to all of them at once instead of one at a time. We can actually reuse most of our code because of how generic it is. The main difference is that instead of using `map` we want to use some sort of `group_by` function. There is `itertools.groupby` in the Python standard library, but it's not exactly what we need, so let's write our own:

    from collections import defaultdict
    from collections.abc import Callable, Iterable, Mapping

    def group_by[T, U](what: Iterable[T], key: Callable[[T], U]) -> Mapping[U, list[T]]:
        result = defaultdict(list)
        # When we try to look up a key that does not exist defaultdict will create a new
        # entry with an empty list under that key
        for x in what:
            result[key(x)].append(x)
        return result
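
As an aside, one likely reason `itertools.groupby` does not fit here is that it only groups consecutive items with equal keys, so unsorted input produces the same key more than once. The small sketch below contrasts it with dictionary-based grouping; the word list and `first_letter` are made-up sample data.

```python
from collections import defaultdict
from itertools import groupby

words = ["apple", "bob", "avocado", "banana"]  # made-up sample data


def first_letter(word: str) -> str:
    return word[0]


# itertools.groupby starts a new group every time the key changes,
# so with unsorted input the keys "a" and "b" each appear twice.
runs = [(key, list(group)) for key, group in groupby(words, key=first_letter)]

# Dictionary-based grouping collects all items per key regardless of order.
grouped = defaultdict(list)
for word in words:
    grouped[first_letter(word)].append(word)
```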
Now we can categorize our users into batches based on their category:

    batches = group_by(db.get_users(), categorize)
To process these batches we need a mapping from category to a function which processes an iterable of users instead of just a single user.

    type BatchProcessingSpec = Mapping[UserCategory, Callable[[Iterable[User]], None]]
Now we can put it all together:

    def process_batched_users(how: BatchProcessingSpec, categorize: UserCategorization) -> None:
        batches = group_by(db.get_users(), categorize)
        for category, users in batches.items():
            process = how[category]
            process(users)
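
Here is a self-contained sketch of the batched variant that can be run as is. The two-member `UserCategory`, the `User` fields and the handlers are hypothetical, the users are passed in as an argument instead of coming from `db.get_users()`, and `TypeVar` is used instead of the newer generic syntax so that it also runs on Python versions before 3.12.

```python
from collections import defaultdict
from collections.abc import Callable, Iterable, Mapping
from dataclasses import dataclass
from enum import Enum, auto
from typing import TypeVar

T = TypeVar("T")
U = TypeVar("U")


def group_by(what: Iterable[T], key: Callable[[T], U]) -> Mapping[U, list[T]]:
    # Collect every item under the key computed for it
    result = defaultdict(list)
    for x in what:
        result[key(x)].append(x)
    return result


class UserCategory(Enum):  # hypothetical categories
    EXPIRED = auto()
    ACTIVE = auto()


@dataclass
class User:  # hypothetical user record
    name: str
    expired: bool


BatchProcessingSpec = Mapping[UserCategory, Callable[[Iterable[User]], None]]


def categorize(user: User) -> UserCategory:
    return UserCategory.EXPIRED if user.expired else UserCategory.ACTIVE


def process_batched_users(users: Iterable[User], how: BatchProcessingSpec,
                          categorize: Callable[[User], UserCategory]) -> None:
    batches = group_by(users, categorize)
    # Iterate over (category, users) pairs, one call per batch
    for category, batch in batches.items():
        how[category](batch)


log: list[str] = []  # stands in for bulk sending
how: BatchProcessingSpec = {
    UserCategory.EXPIRED: lambda b: log.append("expired: " + ", ".join(u.name for u in b)),
    UserCategory.ACTIVE: lambda b: log.append("active: " + ", ".join(u.name for u in b)),
}
users = [User("alice", True), User("bob", False), User("carol", True)]
process_batched_users(users, how, categorize)
```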
There are quite a lot of small building-block functions, and if all I was doing was sending emails to users it would not make sense to write these small functions that add indirection. However, in a large application these small functions become generic building blocks that I can use in higher-order functions to define more concrete routines. The `group_by` function can be used for many other purposes with any type. The categorization function was used for both one-at-a-time and batch processing.

I have been itching to write a functional programming book for Python. I don't mean a "here is how to do FP in Python" book, you don't need that, the documentation of the standard library is good enough. I mean a "learn how to think FP in general, and we are going to use Python because you probably already know it". Python is not a functional language, but it is good enough to teach the principles and there is value in doing things with "one hand tied behind your back". The biggest hurdle in the past to learning FP was that books normally teach FP in a functional language, so now the reader has to learn two completely new things.

Your post was very interesting in terms of how to translate requirements to a functional solution. You should write that book on how to do that.