by quanticle 1179 days ago

    That might seem the case from afar, but once you start writing Haskell and
    start experiencing these space leaks, you will notice that:

    1. 90% of the space leaks you write end up adding a tiny amount of memory
    usage to your functions, mostly unnoticeable. Think thunks like (1 + 2) that
    are subjected to demand analysis under optimization.

    2. 1-2% of them are serious enough to require profiling your code with
    cost-centres.

But that's pretty much the same as in C. The vast majority of memory leaks in C aren't fatal to the program. They just lead to a little bit of extra memory usage, mostly unnoticeable. And then you have the small fraction of memory leaks that draw the attention of the OOM-killer. A tacit admission that detecting code that is leaking memory in Haskell is no easier than detecting code that is leaking memory in C does not speak well for Haskell.

Memory leaks are a matter of correctness and reliability. Our computers are not ideal Turing machines. Their "tapes" are finite. Running out of memory causes the program to crash or produce incorrect results. Arguing that this only happens in a small fraction of cases and can be handled with testing and profiling isn't persuasive, because one might say the same thing for a dynamically typed language like Python.

1 comments

"Space leaks" are not "memory leaks".

A memory leak means a program will never free some region of memory; e.g. if its pointer has been discarded without calling 'free'. That is certainly a matter of correctness. That is certainly a problem for finite-memory machines.

In contrast, a "space leak" is just a suboptimal usage of memory. As a classic example, we want the sum of a list of integers to fully evaluate the running total at each step, like this:

  sum(0, [1,2,3])
  sum(0+1, [2, 3])
  sum(1, [2, 3])
  sum(1+2, [3])
  sum(3, [3])
  sum(3+3, [])
  sum(6, [])
  6

However, lazy evaluation may avoid performing the additions right away, instead building up unevaluated 'thunks' (nullary functions), which only get evaluated at the end, like this:

  sum(0, [1,2,3])
  sum(0+1, [2, 3])
  sum((0+1)+2, [3])
  sum(((0+1)+2)+3, [])
  ((0+1)+2)+3
  (1+2)+3
  3+3
  6

This is a perfectly correct calculation, and everything has been 'cleaned up' at the end (no worries about 'infinite tapes', etc.). However, if we're trying to e.g. process a massive data stream from disk, these unevaluated thunks may quickly exhaust our available memory.
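
For what it's worth, here is a rough Haskell sketch of the two evaluation orders traced above. The names lazySum and strictSum are made up for illustration; foldl is in the Prelude and foldl' comes from Data.List. One caveat: when compiling with optimisations, GHC's strictness analysis will often make the lazy version behave like the strict one, so the difference is easiest to see without -O or in GHCi.

```haskell
import Data.List (foldl')

-- Lazy left fold: the accumulator is not forced along the way, so it can
-- grow into a chain of thunks like ((0+1)+2)+3 before anything is added.
-- (Hypothetical name, for illustration only.)
lazySum :: [Integer] -> Integer
lazySum = foldl (+) 0

-- Strict left fold: foldl' forces the accumulator at every step, so the
-- running total stays a plain number, as in the first trace.
-- (Hypothetical name, for illustration only.)
strictSum :: [Integer] -> Integer
strictSum = foldl' (+) 0

main :: IO ()
main = do
  print (lazySum [1, 2, 3])    -- 6, via the thunk chain
  print (strictSum [1, 2, 3])  -- 6, adding as it goes
  -- A longer input is where the two differ in memory use, not in result.
  print (strictSum [1 .. 1000000])
```

Both functions return the same answer for any finite list; only the peak memory used along the way differs, which is the whole point of calling it a "space leak" rather than a bug.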

"A memory leak means a program will never free some region of memory"

The memory gets freed when the program terminates.

This is a memory management technique that many programs, such as GCC, have used with very good results.