Hacker News
by klodolph 4290 days ago
> This proved to be so rigid that an entirely new string type had to be introduced, and we're still dealing with the fallout.

There are a lot of things wrong with, say, Haskell '98 from the perspective of a modern Haskell programmer. Strings are one, but there are others: monads aren't declared as applicative functors, it took us a long time to figure out how we wanted to write monad transformers, and lazy I/O is terrible and we should use conduits or whatever instead. But you picked strings. This example does not help your point for the following reasons:

1. You can't just change the implementation of the string type without messing up someone's program. For a fantastic example, look at the recent change to Oracle's string type in Java. In theory, the interface is the same. In practice, it made a bunch of people mad.

2. You can encapsulate data in Haskell. Look at the "Text" data type, and ignore Text.Unsafe which exposes the gory innards. This is module level encapsulation, which is just as good as class-level encapsulation (better, actually, since it's more flexible). You could replace Text with a UTF-8 implementation or a UTF-32 implementation or some magic implementation that switches between the types, and you wouldn't break consumers of the Text interface.
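
To make the module-level encapsulation point concrete, here is a minimal sketch. The module name Str and every function in it are made up for illustration; this is not how Data.Text is actually implemented, just the same export-list trick in miniature.

```haskell
-- Str.hs: a hypothetical string module (NOT the real Data.Text).
module Str
  ( Str          -- the type is exported, but its constructor is not
  , fromString
  , toString
  , append
  , size
  ) where

-- Internal representation: a plain list of Char for now. Because the
-- constructor is hidden, this could later become a packed byte array
-- without touching any module that imports Str.
newtype Str = Str [Char]

-- Build a Str from an ordinary String.
fromString :: String -> Str
fromString = Str

-- Convert back to an ordinary String.
toString :: Str -> String
toString (Str cs) = cs

-- Concatenate two Str values.
append :: Str -> Str -> Str
append (Str a) (Str b) = Str (a ++ b)

-- Number of characters.
size :: Str -> Int
size (Str cs) = length cs
```

A consumer that imports Str can only go through fromString, toString, append and size, so swapping the representation would not change any importing code.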

> For example, try modifying a Haskell program to print a log every time a string is made. You can't!

This is a really contrived example. First of all, there is the question of whether you will need to create a string whenever you log something to a file, and presumably you wouldn't want to log those strings. Second, this is something you'd do with a debugger; you wouldn't actually do this to a program.

Besides, if you had access to the string implementation (which I'm assuming here is Text, because that's what most people use), you could just put some kind of unsafePerformIO call in front of uses of the Text constructor, and since the Text constructor isn't exported from the Text module, you're done.
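
A rough sketch of that idea, assuming a made-up Str type rather than the real Text internals. It uses Debug.Trace.trace from base, which is itself built on unsafePerformIO. The names Str, mkStr, toString and append are hypothetical.

```haskell
module Main (main) where

import Debug.Trace (trace)

-- Hypothetical stand-in for a string type whose constructor is private.
data Str = Str String

-- The single place where a Str is built. In a real library this would be
-- the only code with access to the constructor, so every creation is logged.
mkStr :: String -> Str
mkStr s = trace ("Str created: " ++ show s) (Str s)

-- Unwrap a Str; pattern matching forces the value and fires the trace.
toString :: Str -> String
toString (Str s) = s

-- Every operation that produces a Str goes through mkStr.
append :: Str -> Str -> Str
append a b = mkStr (toString a ++ toString b)

main :: IO ()
main = putStrLn (toString (append (mkStr "foo") (mkStr "bar")))
```

The trace messages go to stderr, and they fire when each value is actually evaluated, not when the expression is written, so their order depends on evaluation order.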

2 comments

> 1. You can't just change the implementation of the string type without messing up someone's program.

Yeah, you can. 'NSString' is in fact a class cluster that provides different implementations/representations. Well, it used to be on OS X; it has since been changed to a wrapper for a single CoreFoundation representation.

In GNUstep and Cocotron, I think they still use the older class-cluster implementation, and programs are portable between these implementations.

Polymorphism, baby :-)

Is there a good reason why lazy I/O is terrible? It seems like the ideal solution for async-heavy programs.
http://www.reddit.com/r/haskell/comments/1e8k3k/three_exampl...

Tekmo

I highly recommend reading these slides by Oleg:

http://okmij.org/ftp/Haskell/Iteratee/IterateeIO-talk-notes....

They are his old annotated talk notes and they give a really thorough description of real problems that lazy IO causes with lots of examples.

Edit: Here's a select quote from the talk:

> I can talk a lot how disturbingly, distressingly wrong lazy IO is theoretically, how it breaks all equational reasoning. Lazy IO entails either incorrect results or poor optimizations. But I won’t talk about theory. I stay on practical issues like resource management. We don’t know when a handle will be closed and the corresponding file descriptor, locks and other resources are disposed. We don’t know exactly when and in which part of the code the lazy stream is fully read: one can’t easily predict the evaluation order in a non-strict language. If the stream is not fully read, we have to rely on unreliable finalizers to close the handle. Running out of file handles or database connections is the routine problem with Lazy IO. Lazy IO makes error reporting impossible: any IO error counts as mere EOF. It becomes worse when we read from sockets or pipes. We have to be careful orchestrating reading and writing blocks to maintain handshaking and avoid deadlocks. We have to be careful to drain the pipe even if the processing finished before all input is consumed. Such precision of IO actions is impossible with lazy IO. It is not possible to mix Lazy IO with IO control, necessary in processing several HTTP requests on the same incoming connection, with select in-between. I have personally encountered all these problems. Leaking resources is an especially egregious and persistent problem. All the above problems frequently come up on Haskell mailing lists.

Oleg is a good guy to listen to.

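It's harder to reason about resource usage with lazy IO. For example, when is it safe to call hClose to close a file handle?

     import System.IO

     main :: IO ()
     main = do
        f <- openFile "file.txt" ReadMode
        s <- hGetContents f   -- lazy: nothing has been read yet
        hClose f              -- closes the handle before any data is demanded
        print s               -- demands the data only now
Since hGetContents is lazy, it only tries to get data from the file when you print s. But by that point the file has already been closed!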

If you think about it, it's a bit similar to the tradeoffs between garbage collection and reference counting.
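
For what it's worth, one common way around the close-before-read problem is to force the whole contents while the handle is still open. This is only a sketch using base; readFileStrict and the file name example.txt are made up for illustration, and the streaming libraries mentioned upthread tackle the same problem differently.

```haskell
module Main (main) where

import Control.Exception (evaluate)
import System.IO

-- Read a whole file and force it into memory before the handle is closed.
-- (readFileStrict is a hypothetical helper, not a library function.)
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    s <- hGetContents h        -- still lazy at this point
    _ <- evaluate (length s)   -- walk the whole string, forcing the read
    return s                   -- withFile closes h once this action finishes

main :: IO ()
main = do
  writeFile "example.txt" "hello\n"   -- create a small file to read back
  s <- readFileStrict "example.txt"
  putStr s
```

The tradeoff is that the entire file ends up in memory at once, which is exactly what lazy IO was trying to avoid.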