Hacker News new | ask | show | jobs
by treis 1552 days ago
FP explanations like this annoy me:

That is, this code:

x = foo()

y = bar(x)

z = bar(x)

Should mean the same thing as:

y = bar(foo())

z = bar(foo())

Because it's obvious to anyone that has written any sort of complex program that those are not the same thing. foo() can be an expensive operation like an HTTP call. Or it might depend on a database which can change state underneath it.

I assume FP has answers for these things but the tutorials never cover them. They all imagine a world without state or expensive operations to show how wonderful it is. And that's an easy world to program in.

5 comments

> Because it's obvious to anyone that has written any sort of complex program that those are not the same thing.

They are the same thing in Haskell (except for when forcing the thunks into eager values happens, due to the weirdness of laziness, but that has nothing to do with purity).

> I assume FP has answers for these things but the tutorials never cover them.

Except this one does:

> The <- works like an =, except it signals that equational reasoning doesn't apply to this value. You can't replace what_the_user_typed with getline - your program won't compile.

>They are the same thing in Haskell

That just raises a different problem:

y = bar(generateUUID())

z = bar(generateUUID())

>Except this one does:

It doesn't explain it in the context of real world programming.

> That just raises a different problem:

    y = bar(generateUUID()) 
    z = bar(generateUUID())
You can't write that in Haskell! Instead you write something like.

  yuuid <- generateUUID
  zuuid <- generateUUID

  y = bar yuuid
  z = bar zuuid
Yeah, I'm not denying that these things are solved in Haskell or other FP languages. People build complex applications in those languages so all things must be possible one way or another. My complaint is that the tutorials never wade into these weeds and show how FP makes real world applications easier to build.
> the tutorials never wade into these weeds and show how FP makes real world applications easier to build

Yes that's true, and it's unfortunate.

Example program to show three ways of doing that:

  import System.Random

  randomInt :: IO Int
  randomInt = randomIO

  isEven1 :: Int -> Bool
  isEven1 n = (n `rem` 2) == 0

  isEven2 :: IO Int -> IO Bool
  isEven2 n = do
      m <- n
      return $ isEven1 m

  example1 :: IO ()
  example1 = do
      i1 <- randomInt
      let result1 = isEven1 i1
      putStrLn $ show result1

      i2 <- randomInt
      let result2 = isEven1 i2
      putStrLn $ show result2

      let result2' = isEven1 i2
      putStrLn $ show result2'

  example2 :: IO ()
  example2 = do
      result1 <- isEven2 randomInt
      putStrLn $ show result1

      result2 <- isEven2 randomInt
      putStrLn $ show result2

  randomIntIsEven :: IO Bool
  randomIntIsEven = isEven2 randomInt

  example3 :: IO ()
  example3 = do
      result1 <- randomIntIsEven
      putStrLn $ show result1

      result2 <- randomIntIsEven
      putStrLn $ show result2

  main :: IO ()
  main = do
      putStrLn "Example 1:"
      example1
      putStrLn "Example 2:"
      example2
      putStrLn "Example 3:"
      example3
In example1, result1 and result2 may differ, which is signalled by the arrow (instead of "="). result2 and result2' cannot differ, which is signalled by "=", where the right hand side is verbatim the same.

In example2, result1 and result2 may differ, which is signalled by "<-" (instead of "=").

In example3, result1 and result3 may differ, which is signalled by "<-" (instead of "="). Another argument here is that they may differ, because example3 is the same as example2, except it used randomIntIsEven in place of isEven2 randomInt. But we defined randomIntIsEven to be equal to isEven2 randomInt, so result1 and result2 being able to be different in example2 but not in example3, or vice versa, would be a contradiction.

that is _not_ possible in haskell, simply. generateUUID needs a random number generator, thus it probably has a type like

  generateUUID :: IO UUID
which means that those two lines don't _really_ make sense: you'd get a compile error. to use side effects (which in this case, IO, mostly means opting into sequencing of actions) you'd have to write

  uuidY <- generateUUID
  y = bar uuidY
  uuidZ <- generateUUID
  z = bar uuidZ
in which case, equational reasoning still holds. Side-note: in haskell, you could write it while avoiding the middle line using either the left or right bind operators, so you could write it as

  y = bar =<< generateUUID
  z = generateUUID >>= bar
(mnemonically: the first one runs an action and pipes it to the left, the second one does the same but pipes to the right)

Side-note #2: that's actually not too far away from how random number generators work in haskell: either you pass the state around, or you do it in IO.

> foo() can be an expensive operation like an HTTP call. Or it might depend on a database which can change state underneath it.

Not in Haskell! I recommend reading the rest of the post.

Should mean the same thing as means has the same result value here. The operational semantics could be different, as you mention: even without side effects, computing x may be very expensive.

This isn’t even special to FP, it’s a basic problem when building a compiler, namely whether inlining/common subexpression elimination is beneficial.

To drive the point home, even in a language like Haskell these two examples might have a factor of two in execution time between them.

In theory you're right, but in this specific case they really are the same - the compiler has an optimization pass called "common subexpression elimination" that converts the second to the first in nearly all cases.
They're not exactly the same:

Case 1:

x = foo()

y = bar(x)

z = bar(x)

Case 2:

y = bar(foo())

z = bar(foo())

In case 1 the value of x must be kept in memory across the entire execution of bar(x) since you need it for the second invocation of bar. In case 2, the result of foo() can be discarded once bar is done with it.

To use a contrived example, imagine a program that has 100 mb of available memory.

foo() returns a data struct that is 95 mb in size.

def bar(some_big_data_struct)

   some_small_data = some_big_data_struct[:some_key]

   [A bunch of other code that allocates and then releases 90 mb]

   return some_other_small_data
end

In Case 1 I get an OOM error. In Case 2 I (or the runtime) can reclaim the 95 mb that some_big_data_struct uses and the program works.

Clearly that's a contrived situation but it illustrates my original point. There's a huge gap between theoretical pure functions and what we have to deal with in the real world. These sorts of FP tutorials never go into the weeds and explore these problems.

> foo() can be an expensive operation like an HTTP call. Or it might depend on a database which can change state underneath it.

It can't, in Haskell, because it's a pure language. That's basically the definition of pure! But you still need effects, even in a "pure" language, and the whole point of the article is about how to support effects in a pure language.

Actually in some FP languages like Haskell those two might be literally the same thing. When the compiler sees the second, it does the first (obviously over simplifying all over the place here, but that's the idea I think is being expressed).