Hacker News
by javajosh 1156 days ago
The pedagogical downside of Haskell is that it ignores the physical reality of the machine. Physically, a computer is imperative, has mutating state, and is filled with all kinds of possible race conditions. Even after you apply the operating system, allowing processes to live together (and giving you space to define new ones), very few constraints are placed on your program and process space.

Instead of building on this reality, Haskell asserts that the starting point is not physical reality, but rather a mathematical formalism called "The Lambda Calculus", the physical machine is looked at with disdain and pity, its limitations to be worked around to provide the one true abstraction. This is the original sin of Haskell, because it is an attitude driven not by a need to make a thing but by aesthetics and a peculiar intellectual dogma around building, which ultimately becomes a stumbling block.

In my view, you have to respect the machine. Abstractions can be beautiful, but they are ephemeral, changeable, unreal. The danger is that these illusions become a siren song to makers who are always looking for better tools, and to these makers the abstractions become realer than the machine. Haskell's power users famously don't actually make anything with it (modulo pandoc and jekyll), and my guess is that either they find that 90% of real-world things you want to do are "ugly" from Haskell's point of view, and so are left as distasteful "exercises for the reader", or they get so distracted by the beauty of their tools that they never finish.

In any event, Haskell is a road less traveled for good reason.

14 comments

I'm sorry to break it to you but every single programming language in existence ignores the physical reality of the machine. That's the point of abstractions such as programming languages.
There are abstractions which build on the inherently stateful nature of computers with their instruction pointers, registers, memory and peripheral devices, and there are abstractions which coerce you into framing any computational problem like a mathematical formalism.
Haskell's abstractions build on the inherently stateful nature of computers – how do you think Haskell compilers do their job?

(Not to mention that Haskell and its base libraries have plenty of abstractions useful also for the programmer to deal with the inherently stateful nature of computers, e.g. the IO type, STM, State (it's in the name!), Channels, etc.)
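
For anyone who hasn't seen these in action, here is a minimal sketch of a few of them working together. It sticks to what ships in base (Chan, MVar, IORef); STM and State live in separate packages, so they are left out. The name sumOverChannel is made up for illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical helper: a worker thread sends the numbers 1..n over a
-- channel, and the caller accumulates them in a mutable reference.
sumOverChannel :: Int -> IO Int
sumOverChannel n = do
  chan  <- newChan          -- unbounded FIFO channel
  done  <- newEmptyMVar     -- signals that the worker has finished
  total <- newIORef 0       -- plain mutable cell
  _ <- forkIO $ do
    forM_ [1 .. n] (writeChan chan)
    putMVar done ()
  xs <- replicateM n (readChan chan)      -- blocks until n items arrive
  forM_ xs $ \x -> modifyIORef' total (+ x)
  takeMVar done
  readIORef total

main :: IO ()
main = sumOverChannel 100 >>= print   -- prints 5050
```

All of the mutation and thread communication is explicit in the types: everything here runs in IO.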

Yes, this is essentially it. There's a shape to the causal connections in the real machine that must be respected at higher levels of abstraction. In particular, the shape I mean is that basic mechanism of computation where you have a program counter, instructions and data in mutable memory (von Neumann), and a CPU with registers that "starts on the upper left" of memory, and leaves interesting shaped smears behind when it's done.

On top of this machine shape the OS adds a process abstraction, and a method to speak to devices. It is no coincidence that this process shape looks like the machine shape: lines of source correspond to instructions, declared structures correspond to main memory.... And from here we programmers pick a coordinate system and begin to build. But whatever coords we pick, the space, the degrees of freedom, is always the same: as vast as Turing could fathom. The interesting part of coordinate systems is the kinds of shapes you get for the constraints you picked. But Haskell seems to be a coordinate system with some valid constraint ideas (clear division between purity and side-effect, immutability), but an invalid sense of its identity as merely one coordinate system within this larger structure.

I've never had to worry about what's a register in Python, and barely about pointers and memory (those are highly abstracted away, exactly to the same extent as they are in Haskell).
These days even binary instructions ignore the reality of the physical hardware, as far as I know (I make a javascript lol). The output of, e.g., an assembler is a stream of instructions for a virtual machine that doesn't exist, which the CPU translates into actual execution. At least on the Intel superscalar side; ARM may be a simpler setup.
Also true. Their point is that Haskell's system ignorance goes much deeper than its peers.
I mean yes, it's a higher-level language. Python's system ignorance goes deeper than, say, Ada's, which in turn goes deeper than C++'s, which goes deeper than C's, which goes deeper than many of its predecessors, which go deeper than x86, which goes deeper than PDP-11, which goes deeper than logic gates, which go deeper than transistors.

But what's the point? Which languages do we reject because they are sufficiently dissimilar to transistors? Should we all start writing code in VHDL?

(jokingly) yes, in Haskell: https://clash-lang.org
> Abstractions can be beautiful, but they are ephemeral, changeable, unreal

Where can we find 'no abstractions' these days? Even if you write in ASM, there will be tons of abstractions. Instructions will run out of order. Memory is abstracted. Even the ASM you write will be translated to microcode.

The closest you'll ever find to 'the physical reality of a machine' are microcontrollers (and even then, only some of them) and machines from the 80s. I have one sitting right next to me that I can tell you exactly how many cycles every CPU instruction takes.

Everything else is an abstraction. C abstracts a machine that doesn't exist (it was closer to machines that did exist at the time it was created). Even something as simple as a short-circuit expression in your IF statement is an abstraction. Even in C you sometimes have to fight the abstractions when you are trying to, say, use caches effectively.

In a bunch of key ways Haskell is closer to the machine than modern languages like JavaScript: it doesn't depend on a complicated JIT system at runtime, it lets you explicitly control details like boxing and unboxing, it exposes various low-level C and machine types (fixed-size ints/etc), it has primops for SIMD...

You can write relatively low-level Haskell a lot more easily than you can write low-level JavaScript. You just don't have to.
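
A rough sketch of what the fixed-size types and control over boxing can look like. Pixel and brighten are made-up names, and whether GHC really stores a field unboxed depends on the compiler version and flags, so treat the UNPACK part as a request rather than a guarantee.

```haskell
import Data.Int (Int8)
import Data.Word (Word8)

-- Hypothetical record: strict fields marked UNPACK ask GHC to store the raw
-- bytes inline in the constructor instead of pointers to boxed values.
data Pixel = Pixel
  { red   :: {-# UNPACK #-} !Word8
  , green :: {-# UNPACK #-} !Word8
  , blue  :: {-# UNPACK #-} !Word8
  } deriving (Show, Eq)

-- Word8 arithmetic wraps modulo 256, like an 8-bit unsigned type in C.
brighten :: Word8 -> Pixel -> Pixel
brighten d (Pixel r g b) = Pixel (r + d) (g + d) (b + d)

main :: IO ()
main = do
  print (brighten 10 (Pixel 250 0 100))        -- red wraps from 250 to 4
  print (maxBound :: Int8)                     -- 127
  print (fromIntegral (300 :: Int) :: Word8)   -- 44, i.e. 300 mod 256
```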

I don't think that this is right. A programming language is useful to programmers if it's oriented around the structure of the _problem_ and not just the structure of the _physical machine_. For some tasks these coincide (especially if you care about performance) but I frequently find myself in situations where functional code is simple and the machine is irrelevant.
One of the more surprising aspects of GHC Haskell is that it is possible to write very high-level code with performance matching or exceeding code written in a low-level language, thus honoring the machine. Stream fusion, for example. Not sure if there is any other language with a higher abstraction/performance ratio.
JavaScript comes to mind. Its benchmarks are a wonderful testament to the immense engineering resources poured into V8.
V8 is very impressive. But in exchange for its speed, it needs more memory. JS code optimized for speed tends to use more memory with Node.js than optimized Haskell or OCaml code:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

Unfortunately, one space leak (extremely easy to accidentally create in Haskell, much harder to debug than in other languages) cancels out all those benefits.
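
The textbook example of such a leak, as a minimal sketch. The function names are made up, and GHC's optimiser will often rescue the lazy version when compiling with -O, so the difference mostly shows up unoptimised or in less obvious code.

```haskell
import Data.List (foldl')

-- Lazy left fold: the accumulator is never forced while traversing, so
-- (without optimisation) it grows into a chain of suspended additions as
-- long as the list before anything is actually added up.
leakySum :: [Int] -> Int
leakySum = foldl (+) 0

-- Strict left fold: the accumulator is evaluated at every step, so no
-- chain of suspended additions builds up.
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0

main :: IO ()
main = do
  print (leakySum  [1 .. 1000000])   -- 500000500000
  print (strictSum [1 .. 1000000])   -- 500000500000
```

Both give the same answer; only the memory profile differs, which is exactly why this kind of bug is easy to miss.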
I find so many things about this line of reasoning wrong that I don't know where to start. So let's just pick one thing: Haskell does not ignore the physical reality of the machine. It's one of few languages that explicitly recognise it.

There are more facilities in Haskell to deal with this reality than in almost any other language you can think of.

Do you think Haskell recognizes the physical reality of the machine more than C does? If so, how specifically does it do so?
Technically, within the language, yes. C itself simply delegates a lot of logic to the "physical machine"[1] by leaving it unspecified or implementation-defined. In contrast, Haskell actually tries to model these differences within the type system and standard libraries, with IORef, STM, and what have you.

(In fact, I just checked the IORef documentation and it actually references the x86/64 architecture manual to explain some of the behaviour that can be expected. I would be surprised if any part of the C standard did that.)

[1]: I mean, if we're using an x86 derivative we're still talking about a very fancy PDP-11 emulator.
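
To make the IORef point concrete, a small sketch of shared mutable state with an atomic read-modify-write, using only base. countConcurrently is a made-up name for illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM_)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- Hypothetical helper: `threads` workers each bump a shared counter
-- `perThread` times, using the atomic read-modify-write from Data.IORef.
countConcurrently :: Int -> Int -> IO Int
countConcurrently threads perThread = do
  counter <- newIORef (0 :: Int)
  done    <- newEmptyMVar
  forM_ [1 .. threads] $ \_ -> forkIO $ do
    replicateM_ perThread $
      atomicModifyIORef' counter (\n -> (n + 1, ()))
    putMVar done ()
  replicateM_ threads (takeMVar done)   -- wait for every worker
  readIORef counter

main :: IO ()
main = countConcurrently 4 1000 >>= print   -- prints 4000
```

The mutation is visible in the type (everything is in IO), and the atomic variant is a separate, explicitly named operation.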

I see. The language spec explicitly talks about the machine. That's not nothing.

In practice, though, when I have some piece of memory-mapped hardware attached, and I want to talk to it, in C I can say:

  *(volatile uint32_t*)0xF00BA4 = 0x0102ABCD;
or whatever I need to flip the bits. C lets me actually control the whole machine. Whereas Haskell... I don't know, but I suspect it lets me actually use the physical machine a lot less.
There might be a cleaner way of doing it, but

    -- imports: Data.Word (Word32), Foreign.Ptr (Ptr, nullPtr, plusPtr), Foreign.Storable (poke)
    do
      let ptr = nullPtr `plusPtr` 0xF00BA4 :: Ptr Word32
      poke ptr 0x0102ABCD
should have you covered.
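
If you want to try the same poke without real hardware at that address, here is a sketch that writes through a raw pointer to scratch memory and reads it back. writeThenRead is a made-up helper, and the pointer comes from alloca instead of a fixed address.

```haskell
import Data.Word (Word32)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Hypothetical helper: write a 32-bit value through a raw pointer and read
-- it back. The memory is temporary scratch space provided by `alloca`.
writeThenRead :: Word32 -> IO Word32
writeThenRead value =
  alloca $ \ptr -> do
    poke (ptr :: Ptr Word32) value
    peek ptr

main :: IO ()
main = writeThenRead 0x0102ABCD >>= print
```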
> C lets me actually control the whole machine

Really? Can you run micro-ops?

You know you can write C inside Haskell, right?

Like, there's literally nothing stopping you. You can use FFI, and you can also write C inline.

You can have the best of both, if you want.
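
For the FFI route, a minimal sketch that binds the C library's strlen. The names c_strlen and cLength are made up, and inline C needs a separate package as far as I know, so it isn't shown here.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- Bind the C standard library's strlen directly.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hypothetical wrapper: marshal a Haskell String to a C string and measure it.
cLength :: String -> IO Int
cLength s = withCString s $ \cstr -> fromIntegral <$> c_strlen cstr

main :: IO ()
main = cLength "hello" >>= print   -- prints 5
```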

Well, in my defense I did offer a "line of reasoning" and not just a flat contradiction with no support.

Also, I'm sorry for any discomfort. To use an analogy, if your friend starts dating a girl that you know is bad for him, you can't just tell him that. You'll get punched. Especially early on when he's totally in love. It doesn't matter if you're right or wrong about her, there's no argument that is going to win against love, and to say anything ill of her is only going to cause pain and harm your relationship with your friend. And love is love, this applies to a person or a software tool.

I'm sorry for the discomfort, but I'm telling the truth as I see it and am not trying to hurt you. But Haskell, I think she's bad for you.

Your language should not care about the physical reality of the machine. That's the compiler's job, and the CPU microcode's job. And thankfully, every programming language ignores physical reality, including Assembly.

The goal of a programming language is to allow a human to express a sufficiently rigorous solution to a problem. From there, every step along the chain of execution is allowed to make 'unobservable' (for various definitions of the word) changes to execution. Your compiler might unroll your loops, or eliminate some unneeded intermediate variable, or even replace your entire function with a lookup table. Your CPU's microcode might do some weird fuckery with speculative execution. You shouldn't care, as long as the solution is, as far as you can observe, identical to your given one.

Whether functional programming is a better expression of computation than imperative programming is its own problem, but it's both silly and wrong to assert that imperative is better because it matches the behavior of the machine.

You can apply your argument to almost _any_ language. Haskell's semantics not matching the underlying machine has little to do with any of the issues in the article.

On the contrary, the simplicity of Haskell allows you to understand a program by simply rewriting expressions according to the rules and definitions you have written. You don't have to worry about memory, effects, and so many other things that have nothing to do with the _logic_ of what you are trying to do.

Of course, programs in reality often need to be changed to improve performance, but this isn't relevant when teaching.
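
A small illustration of the rewriting idea, as a sketch. double and myMap are made-up names that mirror what you'd find in the Prelude.

```haskell
-- Definitions to reason with (hypothetical names, mirroring the Prelude).
double :: Int -> Int
double x = x + x

myMap :: (a -> b) -> [a] -> [b]
myMap _ []       = []
myMap f (x : xs) = f x : myMap f xs

-- Evaluating by substitution, using only the equations above:
--   myMap double [1, 2]
-- = double 1 : myMap double [2]
-- = (1 + 1) : myMap double [2]
-- = 2 : double 2 : myMap double []
-- = 2 : (2 + 2) : []
-- = [2, 4]

main :: IO ()
main = print (myMap double [1, 2])   -- prints [2,4]
```

Each step replaces a left-hand side with its right-hand side; no memory model or evaluation order is needed to follow it.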

Abstracting over the physical reality of the machine is part of the point. The physical reality of the machine isn't the focus in programming language design or theory, and it certainly isn't the focus of making maintainable code with properties like referential transparency, type correctness, and parallelizability. Abstractions, in short, allow us to make anything worth making.

The machine has no types. The machine has no variables. The machine has no functions, procedures, scoping, or information hiding. The machine has no assembly language. The machine has no machine code. The machine, ignoring the physical reality and focusing on an abstraction which could still potentially be in the realm of software and not physics, has a certain number of bits in flip-flops perturbed by other bits coming in on pins.

> Haskell's power users famously don't actually make anything with it (modulo pandoc and jekyll)

Self-contradiction is self-negation. You've destroyed your own argument, such as it was.

> Haskell's power users famously don't actually make anything with it

This is a lie that you're perpetuating.

Myself and many of my friends, colleagues, and associates make a living writing Haskell.

What kind of things do you use it for?
I write software for the reinsurance industry at Supercede[0].

[0]: https://supercede.com/

One of the benefits of the Curry-Howard isomorphism is that people like myself who never make anything useful can use computers too.
> Physically, a computer is imperative, has mutating state, and is filled with all kinds of possible race conditions.

Those are too difficult for compiler writers to reason about. While you're mutating the finite set of registers in your high-level C code - just like a real computer does - clang is swapping those out for operations on an infinite number of immutable registers.

>"The Lambda Calculus", the physical machine is looked at with disdain and pity, its limitations to be worked around to provide the one true abstraction.

That isn't true. There are graph reduction machines whose natural model of computation is lambda calculus and they are generally very efficient compared to sequential processors implementing Turing machines.

Screw the machine. As long as you can transform one formalism to another, why encumber the human mind with needlessly complicated ones?