by S4M 4154 days ago
I don't know much about Haskell, but I thought it had some properties to isolate side effects, but the code he gives:

    main = do
        cd "/tmp"
        mkdir "test"
        output "test/foo" "Hello, world!"  -- Write "Hello, world!" to "test/foo"
        stdout (input "test/foo")          -- Stream "test/foo" to stdout
        rm "test/foo"
        rmdir "test"
        sleep 1
        die "Urk!"
Clearly doesn't: it creates a directory, writes to a file, then removes that file and that directory, all in one go, without anything in the function main indicating it. Is it because it's the main function of the program, or am I missing something?
7 comments

People are talking about monads and stuff like that. No need to worry about maths and words you don't need to know. That 'do' keyword up there indicates the start of a simple DSL. The DSL goes like this: every line is the beginning of a lambda, and the result of each lambda evaluation is passed into the next lambda.

So you get a sort of cascading scope of lambdas, where the result of each lambda is passed into the next and stays available to everything after it. Each lambda depends on the evaluation of the previous one. Normally Haskell evaluates expressions lazily; this structure forces the steps to happen in sequence.
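To make the cascading-lambdas picture concrete, here is a rough sketch of how a do block can be rewritten with >>= and lambdas. It uses Maybe rather than IO so the result can be checked without side effects; safeDiv, withDo and withLambdas are hypothetical names made up for illustration.

```haskell
-- Hypothetical helper: division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Written with do-notation.
withDo :: Int -> Int -> Int -> Maybe Int
withDo a b c = do
  q1 <- safeDiv a b
  q2 <- safeDiv q1 c
  return (q1 + q2)

-- Roughly the same thing by hand: each line becomes a lambda that
-- receives the previous result, and earlier names stay in scope.
withLambdas :: Int -> Int -> Int -> Maybe Int
withLambdas a b c =
  safeDiv a b >>= \q1 ->
    safeDiv q1 c >>= \q2 ->
      return (q1 + q2)
```

Both versions give Just 30 for the arguments 100 5 2 and Nothing when a divisor is 0. The ordering comes from the data dependency: the second lambda cannot run until the first step has produced q1.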

So what are these cd, mkdir, output etc. functions? They return an object with a specific type called 'IO'. This type is monadic, but that's irrelevant for now. Haskell, as you know, has no side effects in the language itself. The IO type is basically a command pattern: it says "execute this I/O with these parameters".

The monadic aspect of IO makes it so that at the end the commands will have accumulated in a list, of which you can get an item if you give it the results of the previous item. So that's what the main function returns, a list of commands with some lazily evaluated Haskell code in between them. Now comes the side effect part. The Haskell runtime system iterates over the list of commands and executes them. The result of each command is used to get the next command in the list.

So that's the core of the magic trick of monadic I/O: you make a lazy list of I/O commands, and have something external to the language execute those I/O commands, giving the results back to the language to get the next I/O command to execute.
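A toy model of that idea, as a sketch only: GHC's real IO type is not implemented this way, and Cmd, greet, runIO and runPure are hypothetical names. Each command carries a way to get the next command, and a separate interpreter decides what executing a command means.

```haskell
-- A toy command type (an illustration, not GHC's real IO): each
-- constructor describes one action plus how to obtain the next command.
data Cmd a
  = Done a
  | WriteLine String (Cmd a)
  | ReadLine (String -> Cmd a)

-- A pure value describing a small program; nothing runs here.
greet :: Cmd Int
greet =
  WriteLine "Name?" $
    ReadLine $ \name ->
      WriteLine ("Hello, " ++ name) $
        Done (length name)

-- The "runtime": the only place where real I/O happens.
runIO :: Cmd a -> IO a
runIO (Done x)        = return x
runIO (WriteLine s k) = putStrLn s >> runIO k
runIO (ReadLine k)    = getLine >>= runIO . k

-- A pure interpreter for the same commands, fed with canned input lines.
-- It returns the lines that would have been written, plus the result.
runPure :: [String] -> Cmd a -> ([String], a)
runPure _        (Done x)        = ([], x)
runPure ins      (WriteLine s k) = let (out, x) = runPure ins k in (s : out, x)
runPure (i : is) (ReadLine k)    = runPure is (k i)
runPure []       (ReadLine k)    = runPure [] (k "")
```

With this model, runPure ["Ada"] greet evaluates to (["Name?","Hello, Ada"],3) without touching the terminal, while runIO greet performs the same commands for real.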

Haskell functions return side-effects using the IO type, with the boilerplate plumbing being hidden with monads and do-notation. "main" in Haskell by default has a return type of "IO ()", and any "IO" values returned by that function are executed by the runtime.

The end result in this case is something that just looks and feels completely imperative.

But if you were to try and call, say, the "rmdir" function inside another function that didn't have an IO return type, you'd get a compile error. (More specifically, you could technically call the function; you just couldn't return the "IO" value as a result, so it couldn't perform any actions.)
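A small sketch of that distinction, using putStrLn in place of rmdir so it is safe to try. The names countActions, actions and runAll are made up for the example.

```haskell
-- A pure function: it receives IO actions but can only treat them as
-- values. Counting them prints nothing and deletes nothing.
countActions :: [IO ()] -> Int
countActions = length

-- Building this list performs none of the actions in it.
actions :: [IO ()]
actions = [putStrLn "first", putStrLn "second"]

-- Only an IO-typed function can sequence the actions so that the
-- runtime eventually executes them.
runAll :: [IO ()] -> IO ()
runAll = sequence_

-- The type checker rejects the following, because the declared result
-- type is Int and a do block needs a monadic type such as IO Int:
--
--   broken :: Int
--   broken = do
--     putStrLn "side effect"
--     return 1
```

Here countActions actions is 2 and has no visible effect; the two lines are printed only when runAll actions is run from main or GHCi.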

> IO type

:)

Haskell does not force you to indicate in the program source code that a function has possible side effects; the type of the function will, however, carry such an indication. Here main has type IO (), and the IO indicates it can do arbitrary I/O and mutation. Haskell will infer the type for you, so you don't need to declare it in the source code.

So I disagree with the claim "without anything indicated by the function main", and would amend to "without anything indicated explicitly by the source code, leaving the only indication in the inferred type".

It's in a do block, so you can see that these are not simple function calls; they're being composed via some monad. In cases where you're actually chaining values together it's more obvious which things are which:

    do
      value1 <- function1 --effectful function
      let value2 = function2(value1) --pure function
      value3 <- function3(value1, value2) --effectful function
      ...
But the notation is a bit more magic for this "no return" case; I prefer the Scala approach where even if you don't care about the return values you'd have to write this as

    for {
      _ ← cd "/tmp" // effectful function
      _ ← mkdir "test" //effectful function
      _ = someCalculation() //pure function
      ...
If you enable -Wall, GHC warns (via its unused-do-bind check) unless you write "_ <-" for any action whose result is something other than ().

But the "let" vs no "let" is a pretty strong hint anyway :)

Control.Monad.void is your friend.
Data.Functor.void you mean? :)
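For anyone following along, a minimal sketch of the two ways to discard a result explicitly. bump, bumpTwice and demo are hypothetical names; void is imported from Data.Functor here, and Control.Monad exports it as well.

```haskell
import Data.Functor (void)
import Data.IORef (IORef, newIORef, atomicModifyIORef', readIORef)

-- Hypothetical action whose result we may not care about:
-- it bumps a counter and returns the new count.
bump :: IORef Int -> IO Int
bump ref = atomicModifyIORef' ref (\n -> (n + 1, n + 1))

bumpTwice :: IORef Int -> IO ()
bumpTwice ref = do
  _ <- bump ref    -- explicitly discard the returned count
  void (bump ref)  -- same idea, using void

-- Create a counter, bump it twice, and read the final value.
demo :: IO Int
demo = do
  ref <- newIORef 0
  bumpTwice ref
  readIORef ref
```

Running demo yields 2: both bumps happen, and only their return values are thrown away.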
That's because the whole block you're pointing to there is in Haskell's side effects box. So this is simply not the right example for illustrating how Haskell does isolate side effects. Shell scripts in general are very side effecting, so for this application it makes sense.
You are right. In this case effects are not isolated. But in this particular script, there are no interesting things to move into a pure function. It does not mean that it wouldn't be the case in a more complex script.

Like everything, you have to learn to balance your IO code and your pure code. A bit like learning when to factor something into a separate class, or leave it in a few statement/methods. If you write everything in IO & do notation, you don't get the main benefits of haskell. But if your code is more than 10 lines long, chances are that there will be useful pure functions in it.
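A minimal sketch of that split, assuming a made-up task of upper-casing a file. shout and shoutFile are hypothetical names; only the second one has IO in its type.

```haskell
import Data.Char (toUpper)

-- Pure: no IO in the type, so it cannot touch files or the terminal.
shout :: String -> String
shout = unlines . map (map toUpper) . lines

-- Effectful shell: read, call the pure function, write.
-- Both paths are supplied by the caller.
shoutFile :: FilePath -> FilePath -> IO ()
shoutFile src dst = do
  contents <- readFile src
  writeFile dst (shout contents)
```

The pure part can be tested without any files: shout "hello\nworld\n" is "HELLO\nWORLD\n".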

I agree about the benefits of isolating side effects and IO - I generally code in python, and my code tends to look like:

    def main_function(args):
        data = get_data(args)
        result = do_calculations(data)
        push_results(result, args)
Where the function do_calculations is somewhat pure - no side effects, but I do use local variables that I modify inside the function.

> You are right. In this case effects are not isolated. But in this particular script, there are no interesting things to move into a pure function. It does not mean that it wouldn't be the case in a more complex script.

Well, I thought that the point of Haskell (or one of its points) is that it forces the programmer to declare any side effect in the type of the function. But here, there is no way to know that main, on top of printing stuff, also messes with the directory, and there is no type signature indicating it. In this example it's no big deal, but I could write something like:

    main = do
        rm "/"
        sleep 1
        die "Oooops my files!"
You're right that Haskell doesn't distinguish different IO actions other than by the type they return. There are certainly libraries that do this, although they aren't widely used.

Even though Haskell doesn't distinguish different classes of IO actions, it still distinguishes IO actions from other kinds of actions (such as stateful actions as per your example), and pure computations and that provides a hell of a lot of bang for buck.

The Idris language has the notion of effect types [1] to make achieving the goal of categorising the kinds of effects being used in a function easier to deal with, but that uses the dependent capabilities of the language.

[1] http://eb.host.cs.st-andrews.ac.uk/drafts/eff-tutorial.pdf

> Not widely used...

I use them in every single application I write, or I make my own tighter, more specific ones. They're incredibly useful in real world apps.
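As a rough sketch of what a tighter, hand-rolled effect type can look like (this is not the API of any particular library; MonadLog, Collect and step are names invented for the example): a type class stands in for a capability, and a function's constraint lists the only effects it may use.

```haskell
-- Hypothetical capability: code constrained by MonadLog can log lines
-- and nothing else, unlike code typed in plain IO.
class Monad m => MonadLog m where
  logLine :: String -> m ()

-- In a real program the capability can be backed by actual I/O.
instance MonadLog IO where
  logLine = putStrLn

-- A pure backing for tests: collect the lines instead of printing them.
newtype Collect a = Collect { runCollect :: ([String], a) }

instance Functor Collect where
  fmap f (Collect (w, a)) = Collect (w, f a)

instance Applicative Collect where
  pure a = Collect ([], a)
  Collect (w1, f) <*> Collect (w2, a) = Collect (w1 ++ w2, f a)

instance Monad Collect where
  Collect (w1, a) >>= f =
    let Collect (w2, b) = f a in Collect (w1 ++ w2, b)

instance MonadLog Collect where
  logLine s = Collect ([s], ())

-- The signature says this function only logs; with just this
-- constraint it has no way to call something like rm.
step :: MonadLog m => Int -> m Int
step n = do
  logLine ("got " ++ show n)
  return (n * 2)
```

Run in IO, step 21 prints "got 21" and returns 42; run as runCollect (step 21) it evaluates to (["got 21"],42) with no output at all.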

Sorry Tel, I didn't mean to imply that nobody uses them, more that I would guess that they are used less than 1% of the time where their inclusion could be beneficial.
Ha, no offense. I just wanted to emphasize that they are used.

In particular, I think they're more useful in applications than libraries and most Haskell code you can find in the wild is library code---so you end up not seeing them much.

`main` is the entry point of the program. Its type is `IO ()`. Omitting the type signature on a top-level definition triggers a warning (under -Wall), so in a normal program you would have a `main :: IO ()` type signature. The `IO` allows any kind of effect.
Note that that style of Python uses block/strict IO (not streaming/lazy), which makes it rather inefficient if any part of the process exits early (due to an error, or because the user only wanted the first line of output). Most common command-line programs (and simple Haskell programs) are streaming, not block.
Yes, the code uses the IO monad, you might want to look it up.