by daenz 2506 days ago
Haskell is dense. I'm picking it up now[0], so this project was very useful for me to see how some "real world" Haskell is written. But yes, take for example this function:

  getSocketAPIPort :: Int -> IO Int
  getSocketAPIPort defaultPort = do
    maybeEnvPort <- lookupEnv "socketPort"
    case maybeEnvPort of
      Nothing   -> return defaultPort
      Just port -> maybe (return defaultPort) return (readMaybe port)
It gets a port from an environment variable, if it can, otherwise a default port. Conceptually, this is easy to understand, but translating your understanding of that process to Haskell is not 1 to 1. For example, the last line alone:

  Just port -> maybe (return defaultPort) return (readMaybe port)
You have to understand Maybe (and failure contexts), you have to understand that "return" does not return from a function like typical imperative languages, instead it wraps a type in a Monad (in this case, the IO Monad), and you also have to understand that most of those things on that line, including the "return" function, are parameters to the "maybe" function.
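To make that last line concrete, here is a minimal sketch that puts the `maybe` one-liner next to an explicit pattern match. The names `viaMaybe` and `viaCase` are made up for illustration and are not from the project; both are meant to behave the same for any input string.

```haskell
import Text.Read (readMaybe)

-- From the Prelude:
--   maybe :: b -> (a -> b) -> Maybe a -> b
-- first argument: result to use for Nothing
-- second argument: function to apply to the value inside Just

-- The line from the original function, with the string passed in directly.
viaMaybe :: Int -> String -> IO Int
viaMaybe defaultPort port = maybe (return defaultPort) return (readMaybe port)

-- The same logic with the branching spelled out.
viaCase :: Int -> String -> IO Int
viaCase defaultPort port =
  case readMaybe port of
    Nothing -> return defaultPort -- the string did not parse as an Int
    Just n  -> return n           -- the string parsed; wrap the Int in IO
```

Here `return` shows up twice in the `maybe` call: once applied to `defaultPort` to build the fallback result, and once passed unapplied as the function for the `Just` case.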

There's a lot to understand in terms of the underlying machinery of Haskell to be able to read its cryptic flow and syntax. But it is worth it imo.

0. http://learnyouahaskell.com

7 comments

Ouch. That function is actually written in a pretty convoluted and redundant way.

    getSocketAPIPort :: Int -> IO Int
    getSocketAPIPort defaultPort = do
        maybeEnvPort <- lookupEnv "socketPort"
        return . fromMaybe defaultPort $ maybeEnvPort >>= readMaybe
That's starting to border on over-terse, so you could expand the bind operator into do notation if you wanted to spread it out a bit further. On the other hand, it's also starting to feel over-verbose, using do notation for only a single IO action. Maybe...

    getSocketAPIPort :: Int -> IO Int
    getSocketAPIPort defaultPort = fromMaybe defaultPort . (readMaybe =<<) <$> lookupEnv "socketPort"
That might be going too far. But maybe it's what I'd write. Just depends on how much I expect to make this more complicated in the future. This form has the simplest flow to read. I mean... it's dense. Really dense. But it has the fewest total things going on, and it neatly divides into three interesting parts, easily understood in isolation, plumbed together with two common combinators. But it's also pretty rigid in structure. If you ever want to add other sources for finding the port or change the priorities of them, that form would need to be totally rewritten, and probably would end up back in do notation.

But in every case, all the various return calls should be combined into one (or none, if you use fmap or <$>), and the fallback to the default should only be written once.

In your first version, I think I would use =<< to keep the "flow" of information right-to-left. I also use parentheses instead of . and $ when there isn't a lot of nesting; I got that habit from this post about writing legible Haskell: http://www.haskellforall.com/2015/09/how-to-make-your-haskel...

    getSocketAPIPort :: Int -> IO Int
    getSocketAPIPort defaultPort = do
        maybeEnvPort <- lookupEnv "socketPort"
        return (fromMaybe defaultPort (readMaybe =<< maybeEnvPort))
Keeping the pattern match instead of using "fromMaybe" wouldn't be a bad idea, either:

    getSocketAPIPort :: Int -> IO Int
    getSocketAPIPort defaultPort = do
        maybeEnvPort <- lookupEnv "socketPort"
        return (case readMaybe =<< maybeEnvPort of
            Nothing   -> defaultPort
            Just port -> port)
It makes the default value stand out a bit more.
I've written a fair bit of Haskell, including some production software. Your second version here is my personal favorite of the variants proposed so far. It's the version I can look at and more or less instantly understand. I think sometimes folks go a little too far with using library functions to manipulate Maybe values -- not that `fromMaybe` is particularly onerous, but something about the structure of pattern matching just conveys information to my brain much faster.
For my part, for that logic I'd consider MaybeT. I don't like that a malformed environment variable gets the same treatment as a missing environment variable, though.
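One way to keep a malformed variable distinct from a missing one, as a rough sketch only: classify the lookup result first and decide what to do with each case afterwards. The `PortLookup` type and `classifyPort` function are hypothetical names invented here, not part of the project or of any library.

```haskell
import Text.Read (readMaybe)

-- Hypothetical result type: three outcomes instead of Maybe's two.
data PortLookup
  = Missing          -- the variable was not set at all
  | Malformed String -- the variable was set, but did not parse as an Int
  | Parsed Int       -- the variable was set and parsed
  deriving (Eq, Show)

-- Takes the result of lookupEnv (a Maybe String) and classifies it.
classifyPort :: Maybe String -> PortLookup
classifyPort Nothing    = Missing
classifyPort (Just raw) = maybe (Malformed raw) Parsed (readMaybe raw)
```

The caller can then fall back to the default on `Missing` and fail loudly, or log, on `Malformed`.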
I feel like I'm in a unique place of learning Haskell, so I'll try to translate the last line of your first function:

  return . fromMaybe defaultPort $ maybeEnvPort >>= readMaybe
Basically this is composing a function out of "return" and "fromMaybe" (using the composition operator "."), then partially applying defaultPort to that composed function, so you now have a function that takes one argument. The resulting function is then applied (using $) to the result of "maybeEnvPort >>= readMaybe".

In "maybeEnvPort >>= readMaybe", ">>=" is an infix function that takes maybeEnvPort as its first argument (which is a Maybe Monad), "unpacks" it, applies "readMaybe" to the unpacked result. readMaybe returns another Maybe Monad.

The result of everything after the $ is a Maybe Monad that contains the port from the environment, or a failure condition. The result of applying the composed-and-partially-applied function (from before the $) to it is that the port from the environment is chosen if it didn't fail, otherwise the defaultPort is used, and then the whole thing is wrapped in an IO Monad.
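The same walk-through as a sketch with every intermediate value named and given a type signature. `stepByStep`, `parsed` and `chosen` are illustrative names only, and the environment lookup is replaced by a plain argument so the pieces are easier to see.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

stepByStep :: Int -> Maybe String -> IO Int
stepByStep defaultPort maybeEnvPort =
  let -- everything after the $: Nothing if the variable was missing
      -- or if its contents did not parse as an Int
      parsed :: Maybe Int
      parsed = maybeEnvPort >>= readMaybe

      -- fromMaybe picks the parsed port, or the default on Nothing
      chosen :: Int
      chosen = fromMaybe defaultPort parsed
  in return chosen -- finally, wrap the plain Int in IO
```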

The only place I would offer a correction is to say that values aren't monads. "Maybe Monads" and "IO Monads" aren't getting created or passed around because only values exist at runtime.

The only thing that can be a Monad is a type. You could say you have a value of a monadic type, I suppose...

But that gets into something I've learned over time answering beginner questions. Call things types or values. "a Maybe value" (this is a little sloppy, but perfectly fine in conversation) or "the IO type". Don't call types with a Monad instance "Monads" except in the case when you are talking about all of them generically. "The IO Monad" is an incredibly self-limiting and distracting way to think about the IO type. There's nothing inherently interesting about being a Monad. Why not call it "the IO Functor" or "the IO Alternative" or even "the IO MonadRandom"? Those are all instances the type has. None are particularly more important than the rest. Sometimes what you want to do is most easily done via a type class other than Monad. Don't tie yourself so much to a single detail. This is actually really important, because our habits shape our intellectual exploration. When you find a habit that shoehorns you into one direction, it's a good idea to try to weaken it.
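As a small illustration of reaching for a different instance, here is a sketch of the port lookup written once with do notation and once using only the Functor instance of IO. The names `portViaDo` and `portViaFmap` are made up for this example.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)
import Text.Read (readMaybe)

-- Uses the Monad instance of IO: bind the result, then return.
portViaDo :: Int -> IO Int
portViaDo defaultPort = do
  maybeEnvPort <- lookupEnv "socketPort"
  return (fromMaybe defaultPort (maybeEnvPort >>= readMaybe))

-- Uses only the Functor instance of IO: map a pure function over the
-- result of the single IO action. No return is needed.
portViaFmap :: Int -> IO Int
portViaFmap defaultPort =
  fmap (\maybeEnvPort -> fromMaybe defaultPort (maybeEnvPort >>= readMaybe))
       (lookupEnv "socketPort")
```

The `>>=` that remains in both versions belongs to Maybe, not to IO.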

I like it. For some reason my brain dislikes switching between left and right a lot, which is why I often avoid the $ operator. Instead I'd write:

    return . fromMaybe defaultPort (maybeEnvPort >>= readMaybe)
"Have the data flow in one direction" is a good rule of thumb in writing clear Haskell code.

That said, your conversion away from $ changes the meaning here (in a way that doesn't typecheck, I think - remember that regular function application binds tightest whereas dollar binds loosest) and you still don't achieve your goal.

Instead, maybe

   return . fromMaybe defaultPort $ readMaybe =<< maybeEnvPort
or even

   return $ fromMaybe defaultPort $ readMaybe =<< maybeEnvPort
If you still don't like the dollar signs, we can parenthesize instead in two correct ways, although I don't find them more readable:

    (return . fromMaybe defaultPort) (readMaybe =<< maybeEnvPort)

    return (fromMaybe defaultPort (readMaybe =<< maybeEnvPort))
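For anyone who wants to check these against a compiler, here is a sketch that wraps each variant in a function taking the lookup result as a plain argument. The function names are invented for this example. The comments show how each line groups, based on the standard fixities: `.` is infixr 9, `=<<` is infixr 1, `$` is infixr 0, and plain function application binds tighter than any operator.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

withDotDollar, withTwoDollars, withParensA, withParensB
  :: Int -> Maybe String -> IO Int

-- groups as: (return . fromMaybe d) $ (readMaybe =<< m)
withDotDollar d m = return . fromMaybe d $ readMaybe =<< m

-- groups as: return $ (fromMaybe d $ (readMaybe =<< m))
withTwoDollars d m = return $ fromMaybe d $ readMaybe =<< m

withParensA d m = (return . fromMaybe d) (readMaybe =<< m)

withParensB d m = return (fromMaybe d (readMaybe =<< m))

-- The version without $ groups as
--   return . (fromMaybe d (m >>= readMaybe))
-- so the right operand of (.) is an Int rather than a function,
-- and it is rejected by the type checker:
-- broken d m = return . fromMaybe d (m >>= readMaybe)
```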
tbf, I (as a person with some very basic haskell understanding) can more or less read the parent's code, but can't make heads or tails out of yours.

"Convoluted and redundant" is in the eye of the beholder I guess

Is it?

There's no part of you that looks at that and wonders why it's stuffing return into every leaf of a branching structure instead of just leaving it at the root? That's just objectively redundant.

And there's no part of you that's wondering why it's using nested branches to implement the railway oriented programming pattern? That's just objectively more convoluted than using the combinators that abstract that out and coalesce all the failure branches into one spot.

My second version has an extra really nice property. It consists of three subexpressions that can be understood in totality in isolation from the rest of the code. It is compositional code of the sort we all claim we want to work with.

What my code does have as a real downside is a much higher burden of knowledge to understand. You have to know much more of the contents of the base library. You have to be familiar with how idioms like the aforementioned railway oriented programming work.

But that knowledge has its rewards. You get to reduce manual plumbing in your own code, replacing it with standard library plumbing. When you know Haskell, the standard plumbing fades into the background. I guess it's like what lispers talk about with their parentheses.

So yes, there is an additional burden in understanding my versions of the code. But that burden amortizes very nicely over a lifetime of getting the advantages of having all that plumbing just there when you need it.

Yes, Haskell is the one language where you can always invest some more time learning something more advanced that will provide you a large boost in productivity.

That has the downside that Haskell developers speak many different idioms, just like Lisp. I'm prone to claim that the code you posted is basic enough that we can consider people who don't get it to be not yet proficient in the language, but there are way too many things right on the fence for that, and they can't all be required.

> So yes, there is an additional burden in understanding my versions of the code. But that burden amortizes very nicely over a lifetime of getting the advantages of having all that plumbing just there when you need it.

I definitely didn't argue this point (although I admit to skepticism).

It sounds like you're actually aware that it's easier to read the redundant code when you are not an expert, so we don't have any disagreement there.

> It sounds like you're actually aware that it's easier to read the redundant code when you are not an expert, so we don't have any disagreement there.

Actually, there is minor disagreement. I don't think I used anything requiring expert-level understanding. I would put the tools I used at the level of day-to-day proficiency, not expert level. Roughly, the level it took me 3 months to reach, not the level I'm still working towards after 10 years.

If you've wandered in from another language and are wondering what it might look like in a less terse language, here's an attempt:

    static IO<Integer> getSocketApiPort(@NotNull final Integer defaultPort) {
      return lookupEnv("socketPort")
        .flatMap((Optional<String> maybeEnvPort) -> {
          if(!maybeEnvPort.isPresent()) {
            return IO.of(defaultPort);
          } else {
            String strEnvPort = maybeEnvPort.get();
            Optional<Integer> envPort = readMaybe(strEnvPort);
            return IO.of(envPort.orElse(defaultPort));
          }
        });
    }
It's some time since I last worked in Haskell (and I never worked on anything useful), but I would write the function this way:

  getSocketAPIPort :: Int -> IO Int
  getSocketAPIPort defaultPort = do
    maybeEnvPort <- lookupEnv "socketPort"
    return $ case maybeEnvPort of
      Nothing   -> defaultPort
      Just port -> fromMaybe defaultPort (readMaybe port)
100% this is how I'd write it (except probably with pure instead of return). I don't get the desire to sprinkle bind throughout the other replies in this thread. I feel like it makes things less clear.
Though notice that you picked literally the most trivial example in the entire codebase that anyone can understand without explanation.

Elm is onto something with its obsession with simplicity and lack of features. I go back to old Haskell code and have to completely recredentialize in Haskell before I remember what's going on. I return to old Elm code and need very little ramp up.

I'm not making an "Elm > Haskell" argument, I just think experimentation with simplicity does this family of languages a favor.

> Though notice that you picked literally the most trivial example in the entire codebase that anyone can understand without explanation.

I needed the explanation to understand what the code snippet was doing. Careful with those generalisations.

Well, non-programmers are even more helpless in understanding the code, but that's just not the point I'm making. Let's not require everyone to couch every statement in disclaimers, especially such an ancillary one.

And you surely can understand that code is trying to read a port or use a default one.

> And you surely can understand that code is trying to read a port or use a default one.

In programming languages which use paradigms I'm more familiar with, yes. In Haskell, not so much, which is the entire point of this thread.

Weird, I thought the point of this thread was how to bikeshed looking up environment variables while folks look at the proverbial camera and appeal to the hypothetical audience that `if x == null: ` would surely be morally superior.
> In Haskell, not so much, which is the entire point of this thread.

Yes, that's my point, too. I'm not sure what you're arguing with.

That you don't understand even the most trivial snippet in the code base is only a point in my favor. You're picking beef with an irrelevant detail and confusing it for disagreement.

Remember, I was replying to someone who is trying to show that Haskell isn't so hard once you break down a snippet. I pointed out that the snippet was the most trivial selection they could have picked, that the rest of the code is even harder so it's not a very big consolation. You chimed in that even the simplest snippet was still alien to you.

> Remember, I was replying to someone who is trying to show that Haskell isn't so hard once you break down a snippet.

That's not what I was trying to show, and I'm not sure how you got that sense. I was trying to show that even the most simple snippet requires a lot of background knowledge to understand.

On the other hand, Haskell's strength is its abstraction power. Elm intentionally lacks (some of) that punch, as do a lot of other languages.

But since Elm is a DSL it can afford to cap the abstraction somewhere whereas Haskell is perhaps built for more complex problems than just a client facing web UI.

Having learned both, I feel the abstraction level is capped too low on Elm, sadly.

Do you have a better example from this code base? I'm looking but unfortunately I'm not seeing anything that isn't either trivial or just a lot of monad stack handling.
Better examples of how Haskell can make the OP's eyes glaze over? Basically the rest of it.
Monad transformers can help a lot here:

  getSocketAPIPort :: Int -> IO Int
  getSocketAPIPort defaultPort =
    fromMaybe defaultPort <$> runMaybeT do
      envPort <- MaybeT $ lookupEnv "socketPort"
      MaybeT $ return $ readMaybe envPort
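A self-contained sketch of the same idea with the imports filled in, for anyone who wants to compile it. It assumes `MaybeT` comes from `Control.Monad.Trans.Maybe` in the transformers package, and it parenthesizes the do block, since writing `runMaybeT do` without parentheses or `$` relies on the BlockArguments extension.

```haskell
import Control.Monad.Trans.Maybe (MaybeT (..))
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)
import Text.Read (readMaybe)

getSocketAPIPort :: Int -> IO Int
getSocketAPIPort defaultPort =
  fromMaybe defaultPort <$> runMaybeT (do
    -- lookupEnv already has type IO (Maybe String), so it wraps directly
    envPort <- MaybeT (lookupEnv "socketPort")
    -- readMaybe is pure, so lift its Maybe result into IO before wrapping
    MaybeT (return (readMaybe envPort)))
```

Either step producing Nothing short-circuits the block, and `fromMaybe` then supplies the default in one place.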
In Python:

  def getSocketAPIPort(defaultport):
      try:
          return os.environ["socketPort"]
      except KeyError:
          return defaultport
Close, but values from os.environ are always strings, and not guaranteed to be parsable numbers.
This is a bit clearer:

    def getSocketAPIPort(defaultport): 
      return int(os.getenv('socketPort', defaultport))
Closer, but int can throw a ValueError. The original Haskell code defaulted to the defaultPort if the specified port could not be cast to an int.
I see. Though I think it's better to throw an error. I wouldn't want my app just coming up on default port if I had configured it incorrectly.
It might be in Python, but it's sorta oranges to apples for you to just not do the same thing your Rosetta code is supposed to do.

What's the value of your exercise if you just redefine the problem to get rid of the tricky part?

Here's how I'd write it:

    getSocketAPIPort :: Int -> IO Int
    getSocketAPIPort defaultPort = readWithDefault <$> lookupEnv "socketPort"
      where 
        readWithDefault mbPort = fromMaybe defaultPort $ readMaybe =<< mbPort
Though I should add that it's in general against the Haskell spirit to write code against needlessly specific types and to combine unrelated functionality. So, I'd first look for a function or write one myself in some utils module that looks up and parses a value from an environment variable:

    readEnv :: Read a => String -> IO (Maybe a)
    readEnv var = (readMaybe =<<) <$> lookupEnv var
Then the function in question becomes much easier (and probably unnecessary too):

    getSocketAPIPort defaultPort = fromMaybe defaultPort <$> readEnv "socketPort"
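Put together with imports, a sketch of how that helper could be reused for a second setting. The `getVerbose` function and the "verboseLogging" variable name are hypothetical and only there to show `readEnv` working at a type other than Int.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)
import Text.Read (readMaybe)

-- Look up a variable and try to parse it as any type with a Read instance.
readEnv :: Read a => String -> IO (Maybe a)
readEnv var = (readMaybe =<<) <$> lookupEnv var

getSocketAPIPort :: Int -> IO Int
getSocketAPIPort defaultPort = fromMaybe defaultPort <$> readEnv "socketPort"

-- Hypothetical second use: the Read instance for Bool accepts the
-- strings "True" and "False"; anything else falls back to False here.
getVerbose :: IO Bool
getVerbose = fromMaybe False <$> readEnv "verboseLogging"
```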