by lkitching 1783 days ago
The purpose of Maybe is to explicitly represent the possible non-existence of a value, which in Haskell is the only option since there's no null value that inhabits every type. The existence of the monad instance is convenient but it's not fundamental. The type of getConfigurationDirectories could be changed to MaybeT IO (NonEmpty FilePath) to avoid the match, but I don't think it would make such a small example clearer.

There are numerous ways to redesign the function signatures, but I would imagine the simplest would be (again, idk Haskell syntax):

    getConfigurationDirectories: unit -> Maybe [FilePath]
    nonEmpty: [a] -> Maybe [a]
    head: [a] -> Maybe a
    initializeCache: FilePath -> unit
    
Notice `nonEmpty` isn't really necessary because `head` could do the work. The above could be chained into a single, cohesive stack of calls where the result of each is piped through the appropriate `Maybe` method into the next call in a point-free style. I cannot imagine how this wouldn't be clearer. E.g.:

    maybeInitialized <- (getConfigurationDirectories >>= head >> initializeCache)

That's the whole thing. Crystal clear. The big takeaway of "Parse don't validate" should be about the predominant use of the `Maybe` monad as a construct to make "parsing" as ergonomic as possible! Each function that returns `Maybe` can be understood as a "parser" that, of course, can be elegantly combined to achieve your result.
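
For anyone who wants to try this chain in compilable Haskell, here is a minimal sketch under a few assumptions: `IO` is left out, `Cache` and the pure `initializeCache` are hypothetical stand-ins, and `nonEmptyList` is the validating check described above. Note that `>>` discards its left-hand result, so the sketch uses Kleisli composition (`>=>`) for the `Maybe`-returning steps and `fmap` for the pure one.

```haskell
import Control.Monad ((>=>))
import Data.Maybe (listToMaybe)

-- Hypothetical stand-in for the cache; the thread never defines it.
newtype Cache = Cache FilePath
  deriving (Eq, Show)

-- Pure stub (an assumption); in the original post this step performs IO.
initializeCache :: FilePath -> Cache
initializeCache = Cache

-- Validating check as proposed above: the result is still a plain list.
nonEmptyList :: [a] -> Maybe [a]
nonEmptyList [] = Nothing
nonEmptyList xs = Just xs

-- Point-free pipeline: the Maybe-returning steps are composed with (>=>),
-- then fmap maps the pure initializer over the result.
initFromDirs :: [FilePath] -> Maybe Cache
initFromDirs = fmap initializeCache . (nonEmptyList >=> listToMaybe)
```

As the comment suggests, `nonEmptyList >=> listToMaybe` behaves the same as `listToMaybe` alone, since the safe head already handles the empty list.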

My critique is exactly that unwrapping the `Maybe` immediately in order to throw an exception is kind of the worst of both worlds. I mentioned this in a sibling comment, but my sense is that the author is more concerned with having a concrete value (`configDirs`) available in the scope of `main` than with best representing the solution to the problem in code. It is a shame because I agree with the thesis.

On the contrary, the NonEmpty type is fundamental to the approach in that example, since it contains in the type the property being checked dynamically (that the list is non-empty). The nonEmpty function is a simple example of the 'parse don't validate' approach since it goes from a broader to a more restricted type, along with the possibility of failure if the constraint was not satisfied. The restriction on the NonEmpty type is what allows NonEmpty.head to return an a instead of a (Maybe a) and thus avoid the redundant check in the second example. The nonEmpty in your alternative implementation is only validating, not parsing, since after checking that the input list is non-empty it immediately discards that information in the return type. This forces the user to deal with a Nothing result from head that can never happen. Attempting to clean the code up by propagating Nothing values using bind is just hiding the problem that the parsing approach avoids entirely.
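
The difference between the two versions of `nonEmpty` can be sketched side by side. `nonEmpty`, `NonEmpty`, and `NE.head` are from `Data.List.NonEmpty` in base; the names `validateNonEmpty`, `parseNonEmpty`, `firstValidated`, and `firstParsed` are made up for this illustration.

```haskell
import Data.List.NonEmpty (NonEmpty (..), nonEmpty)
import qualified Data.List.NonEmpty as NE
import Data.Maybe (listToMaybe)

-- Validating: the check succeeds, but the result type is still [a],
-- so the fact that the list is non-empty is forgotten.
validateNonEmpty :: [a] -> Maybe [a]
validateNonEmpty [] = Nothing
validateNonEmpty xs = Just xs

-- Parsing: nonEmpty (from base) records the checked property in the type.
parseNonEmpty :: [a] -> Maybe (NonEmpty a)
parseNonEmpty = nonEmpty

-- After validating, taking the head of a plain list still has to account
-- for emptiness, hence the second Maybe-returning step.
firstValidated :: [a] -> Maybe a
firstValidated xs = validateNonEmpty xs >>= listToMaybe

-- After parsing, NE.head is total; the only Maybe comes from the parse itself.
firstParsed :: [a] -> Maybe a
firstParsed xs = NE.head <$> parseNonEmpty xs
```

Both functions return the same results; what changes is that `firstParsed` has a single point of failure, while `firstValidated` checks for emptiness twice.
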
You are misunderstanding the system. You can organize the logic into whatever containers you want, but the essence of the system cannot be changed.

You are already handling a `Maybe` type because it's possible for your input to not exist. Because the first implementation of `head` also returns a `Maybe`, it is possible to "bind" them together (I'm leaving out `IO` because I am unsure of the syntax[0] and it is immaterial to the example):

    head :: [a] -> Maybe a
    head (x:_) = Just x
    head []    = Nothing

    getConfDirs :: Maybe [FilePath]

    initializeCache :: FilePath -> Cache
    
    useCache :: Cache -> Value

    main :: ()
    main = do

      -- you don't need concrete values here
      maybeCache <- (getConfDirs >>= head >> initializeCache) -- Maybe Cache

      -- one option
      case maybeCache of
        Just c -> useCache c
        Nothing -> error "CONFIG_DIRS cannot be empty"

      -- another option
      maybeValue <- (maybeCache >> useCache) -- Maybe Value
      
[0] I have never written Haskell, so the above is my best-guess at the syntax given the snippets available (and no extra research)
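
Since the snippet above is a best guess at the syntax, here is one way it could be written so that it compiles. This is a sketch, not the commenter's exact code: `Cache` and `Value` are hypothetical stand-ins, `IO` is still left out, and the config list is passed in as an argument. The main change is that `>>` (which throws away its left-hand result) is replaced by `<&>` from `Data.Functor`, which maps a pure function over a `Maybe`.

```haskell
import Data.Functor ((<&>))
import Data.Maybe (listToMaybe)

-- Hypothetical stand-ins: the thread never defines Cache or Value.
newtype Cache = Cache FilePath deriving (Eq, Show)
newtype Value = Value String deriving (Eq, Show)

-- Same behaviour as the hand-written head above; listToMaybe is the version in base.
safeHead :: [a] -> Maybe a
safeHead = listToMaybe

initializeCache :: FilePath -> Cache
initializeCache = Cache

useCache :: Cache -> Value
useCache (Cache dir) = Value ("cache at " ++ dir)

-- (>>=) chains the Maybe-returning step; (<&>) is flipped fmap and maps
-- the pure function over the result.
mkCache :: Maybe [FilePath] -> Maybe Cache
mkCache confDirs = confDirs >>= safeHead <&> initializeCache

-- One option: unwrap once and fail loudly.
valueOrError :: Maybe Cache -> Value
valueOrError maybeCache = case maybeCache of
  Just c  -> useCache c
  Nothing -> error "CONFIG_DIRS cannot be empty"

-- Another option: keep mapping inside Maybe.
maybeValue :: Maybe Cache -> Maybe Value
maybeValue maybeCache = maybeCache <&> useCache
```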

The two functions `head` and `getConfDirs` are "parsers" because they both return `Maybe`. Contrary to

> Returning Maybe is undoubtably convenient when we’re implementing head. However, it becomes significantly less convenient when we want to actually use it!

It is trivial to use a reference to `Maybe` because it is a monad that is specifically designed to be more convenient than the alternative approaches when a value may (or may not) exist.

This line:

    maybeCache <- (getConfDirs >>= head >> initializeCache)

is doing exactly what the post is arguing against. getConfDirs is validating that the list is non-empty, but the [FilePath] list it contains does not encode that information. Now you immediately have to handle the possibility of a missing value from head that you already know cannot happen. This isn't too apparent here since you've combined it into a single expression, but if you need to pass the confDirs list to any other part of the program, that code will also have to continually handle the possibility of the list being empty even though you already checked for it. Now every function that interacts with the confDirs list will have to include (Maybe a) in its return type unnecessarily. The post is not suggesting you can remove Maybe entirely, but it has moved it to a single point in the program (the point where the config dirs list is checked for emptiness) and removed it everywhere else. Your approach must continually guard against an impossible condition everywhere the dirs list is accessed because you discard the property you checked for in getConfDirs.
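
To make "everywhere the dirs list is accessed" concrete, here is a small sketch with two consumers of the directory list. All function names are hypothetical; only `nonEmpty`, `NE.head`, and `NE.last` come from `Data.List.NonEmpty` in base.

```haskell
import Data.List.NonEmpty (NonEmpty (..), nonEmpty)
import qualified Data.List.NonEmpty as NE

-- With a plain list, each consumer has to account for emptiness again.
primaryDirList :: [FilePath] -> Maybe FilePath
primaryDirList []      = Nothing
primaryDirList (d : _) = Just d

fallbackDirList :: [FilePath] -> Maybe FilePath
fallbackDirList [] = Nothing
fallbackDirList ds = Just (last ds)

-- With NonEmpty, the same consumers are total.
primaryDir :: NonEmpty FilePath -> FilePath
primaryDir = NE.head

fallbackDir :: NonEmpty FilePath -> FilePath
fallbackDir = NE.last

-- The only Maybe left is at the boundary where the raw list is checked.
parseConfDirs :: [FilePath] -> Maybe (NonEmpty FilePath)
parseConfDirs = nonEmpty

-- A consumer that combines both results without any Maybe handling.
describeDirs :: NonEmpty FilePath -> String
describeDirs ds = primaryDir ds ++ " .. " ++ fallbackDir ds
```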

The monadic operators make it convenient to propagate missing values through a chain of operations, but they are not the primary benefit of an explicit Maybe type. Much like IO, the benefit of an explicit Maybe type shows up where you _don't_ have it, since its absence conveys more information at that point in the program. Likewise, a (NonEmpty a) contains more information than [a], which consequently makes the type of head more informative.

The parsers in this approach have types like

    a -> Maybe b

where type b contains the extra information extracted by the parser. Your getConfDirs function only contains a function of type

    [a] -> Maybe [a]

so it isn't parsing in the same way.

I understand what the author is doing. I said this earlier but it bears repeating: the author seems to be more concerned with having a concrete type than with simpler code. A reference to `Maybe Cache` is good enough (and preferred). The top level of your program is precisely where you want to have the flexibility to deal with the above.

Furthermore, my example is a much better illustration of the axiom ("Parse don't validate") than what the author is doing -- which is more like "Parse and validate".

You need to clarify "continually guard". Sure, you have to invoke methods like:

    maybeCache >> useCache -- map

instead of:

    maybeCache |> useCache -- not sure how Haskell pipes

Is that too difficult? The `Maybe` monad is specifically designed so that you don't have to continually guard against the possibility of the value not existing. That is, you can "map", "bind" and "apply" functions to the value as if it always exists (and it handles the situation when the value doesn't). I also included a `case` block within which you can be statically certain a value of type `Cache` is available if you really need it.
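
For reference, the "map", "bind" and "apply" operations mentioned here correspond to standard operators in base. The sketch below uses throwaway `Int` examples rather than the `Cache` type; the function names are made up for illustration.

```haskell
import Data.Function ((&))
import Data.Functor ((<&>))
import Data.Maybe (listToMaybe)

-- "map": apply a pure function inside Maybe. (<&>) is fmap with its arguments flipped.
mapped :: Maybe Int -> Maybe Int
mapped m = m <&> (+ 1)

-- "bind": chain a function that itself returns Maybe.
bound :: Maybe [Int] -> Maybe Int
bound m = m >>= listToMaybe

-- "apply": combine several Maybe values with a pure function.
applied :: Maybe Int -> Maybe Int -> Maybe Int
applied a b = (+) <$> a <*> b

-- Plain left-to-right piping of a value that is known to exist uses (&).
piped :: Int -> Int
piped n = n & (+ 1)
```

In each of the first three, a `Nothing` anywhere in the input yields `Nothing` in the result, which is the propagation behaviour both sides of this thread are describing.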

The purpose of `Maybe` is to simplify code that needs to deal with a value that might not exist. Attempting to organize your code to avoid using `Maybe` is, by definition, going to be more cumbersome than simply leaning into the construct (that's what it's for!). It also better-illustrates how "parse don't validate" should work. Using an exception to guard against an invariant is... validating not parsing.

You don't need to defend the author here. It's just a matter of fact that the code provided could be organized differently according to a more idiomatic usage of `Maybe`, and would therefore be a more illustrative example of their own point. The choice to exemplify something else is unfortunate and is the thrust of this entire comment thread -- I felt like I had to say something now, seeing that link a second time.

The author explains what they mean by parsing in the post:

> Really, a parser is just a function that consumes less-structured input and produces more-structured output. By its very nature, a parser is a partial function—some values in the domain do not correspond to any value in the range—so all parsers must have some notion of failure. Often, the input to a parser is text, but this is by no means a requirement, and parseNonEmpty is a perfectly cromulent parser: it parses lists into non-empty lists, signaling failure by terminating the program with an error message.

So the properties checked by the parser are reflected in the output type. Reifying these properties in the type is what allows the validation to be done once at the top level and avoided throughout the rest of the program. Your complaint about throwing exceptions is focusing on an irrelevant detail in a small example: yes, this could have been moved into the main function, but that doesn't affect the overall behaviour.

However, your argument that propagating Maybe values is more idiomatic than parsing into a more precise type is one that I, and I assume most static typing advocates, would disagree with. Given the choice, you would always prefer an 'a' over a 'Maybe a', since a Maybe represents a point of uncertainty which you would rather not have. As a result, having to chain this imprecision using various combinators is inherently more complex than not having to do so. Yes, using bind etc. is preferable to manually destructuring Maybe values, but avoiding Maybe is better still.