Hacker News
by leephillips 2084 days ago
> 5. Once created, objects and collections must be immutable.

So this language would not be general purpose, as it would not be suitable for high-performance computing.

Large scale simulations almost always involve arrays that are modified in place. Being able to somehow declare a collection to be immutable would be highly useful, but not having the option of mutable collections limits the kinds of problems that can be approached with the language.

5 comments

I'm not going to claim that mutability is never useful for performance, but many large scale simulations can be expressed quite elegantly using bulk operations on arrays or other structures, with no mutability in sight. Both particle simulations a la n-body and stencil operations are in this category. An efficient low-level implementation of such bulk operations involves mutable updates, just like any functional language is compiled to "impure" assembly code, but the programming model used for application programming can remain pure.
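A minimal sketch of that split, in plain Python rather than any real array library. The function name `smooth` and the rule that boundary cells keep their values are assumptions made up for illustration. The caller only ever sees immutable tuples, while the implementation fills a private mutable buffer:

```python
def smooth(cells):
    """Return a new tuple after one three-point averaging stencil step.

    Pure from the caller's point of view: `cells` is never modified.
    """
    n = len(cells)
    out = list(cells)              # private mutable buffer, never shared
    for i in range(1, n - 1):      # boundary cells keep their values
        # Always read from the old state, write into the buffer.
        out[i] = (cells[i - 1] + cells[i] + cells[i + 1]) / 3
    return tuple(out)              # freeze before handing it back


state = (0.0, 0.0, 9.0, 0.0, 0.0)
next_state = smooth(state)         # state itself is left unchanged
```

Here the mutation of `out` cannot be observed from outside, because the buffer is created inside the call and is frozen into a tuple before it escapes.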
Interesting. Can you explain, with a somewhat simple example, how this can be efficiently implemented, or at all? I mean preserving the appearance of immutability at the source language level, while mutating the original structure under the hood for performance.
Any vectorised operation in Numpy is an example of this. The pure subset of Numpy can be used to write useful programs, but the Numpy functions/methods are mostly implemented in impure C.

Another example is completely pure array programming such as in Accelerate[0] or Futhark[1].

[0]: http://www.acceleratehs.org/

[1]: https://futhark-lang.org

A somewhat related idea is called "benign effects." The idea is that you write code with an immutable interface that uses mutation in its implementation.

So there are "effects" (non-functional state changes) that are encapsulated ("benign").

I learned this term in reference to Standard ML at CMU.

This is different from what you're asking because it isn't a compiler optimization and it isn't actually checked by the language at all, but it works pretty well in practice.

It's like unsafe in Rust: you write most of your code assuming a useful property that you then break in the small percentage of code that needs to break it.
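A small sketch of a benign effect, written in Python rather than Standard ML. The names `make_fib` and `cache` are made up for illustration. The returned function behaves like a pure function of `n`, but its implementation mutates a hidden table, and nothing in the language checks that the mutation stays hidden:

```python
def make_fib():
    cache = {0: 0, 1: 1}           # hidden mutable state (the "effect")

    def fib(n):
        # Same input always gives the same output, so callers cannot
        # tell that the table below is being updated.
        if n not in cache:
            cache[n] = fib(n - 1) + fib(n - 2)
        return cache[n]

    return fib


fib = make_fib()
answer = fib(30)
```

The effect is "benign" only by convention: if `cache` leaked out of the closure, other code could break the illusion of purity.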

Not very knowledgeable on this myself, unfortunately, but I believe that in graphics programming, shaders written in GLSL often take the form of a series of functional, mathematical transformations of vertices. Those transforms are run in the GPU as highly parallelized array operations, probably using a lot of mutable state. But those details are mostly hidden from the shader programmer.
C++ supports this via the mutable keyword https://stackoverflow.com/questions/105014/does-the-mutable-... though not particularly for performance purposes.
Thanks to all who replied.
I concur; I should have been more precise in my comment.
Isn’t it possible (at least in theory) to make mutability an implementation detail of the compiler/runtime? Rust’s borrow checker approaches this, but the abstraction is leaky or nonexistent. Additionally, many high performance computing applications (e.g. TensorFlow) abstract away expensive mutable operations, so it should be possible to isolate mutability to small segments of code where mutability is opt-in.
Yes, Haskell as a pure functional language does this too. A naive copy-by-value handling of lists will usually end up in the same order of magnitude for performance as mutate-in-place linked lists in C. The compiler can track those immutable values and just mutate them in place, when it can guarantee that's a safe operation. The vast majority of the time, you can get away with just copying a pointer or renaming, not the whole variable.

The caveat is that, in my experience, it's a fair bit harder to reason about performance, as the execution model is even more abstracted away from the hardware than even something like the C model is (which is no longer a good fit either, in this era of speculative execution and multi-level caches.)

> The caveat is that, in my experience, it's a fair bit harder to reason about performance, as the execution model is even more abstracted away from the hardware than even something like the C model is (which is no longer a good fit either, in this era of speculative execution and multi-level caches.)

One solution is a tool that annotates the code with notes about performance, developed and distributed along with the compiler so that it can never fall out of sync with it.

I think if performance is part of the requirements of your code, then performance must be a part of your type signature.

For example, a tail-recursive function needs to have its type marked as tail-recursive.

This is where linear types, and quantitative type theory in general, come into play. Also eagerness / laziness annotations.

Tail recursion is not necessary to annotate imo, but I guess the compiler/linter could maybe complain if it finds recursion it can't do a tail call optimisation for. These kinds of warnings are similar to mutable languages warning about things that are probably bad but sometimes necessary.

It’s necessary to annotate tail recursion because you are making it clear to the compiler that your initial assumption about the performance of this function is that it will not explode the stack.

It must be made explicit because somebody who comes along later to change that function may miss the fact that it doesn’t explode only because it’s tail-recursive.

You could of course document the requirement - but why document if you can make it a compiler option? “I don’t want this to compile unless I get the behaviour I expect from it”.

Also, as far as I am aware, C-style functions cannot be tail-recursive because they cannot clean up the stack after themselves, so you can’t support tail recursion across FFI.

Rust's im[1] and rpds[2] crates are refcounted pointers to immutable data structures, but support mutable operations on &mut instances. When an instance is cloned, it merely creates another pointer. When an instance is modified, it uses Arc::make_mut() to only clone each tree node if it has other users. This approach has runtime overhead, but makes nested updates (foo[0][0].attr = 1) as simple as mutable structures.
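A rough sketch of the copy-only-if-shared idea, in Python rather than Rust. `CowList`, `_Box` and the hand-maintained `owners` counter are hypothetical names for illustration only. They are not the API of im or rpds, and this sketch shares one flat list rather than individual tree nodes:

```python
class _Box:
    """Shared storage plus a hand-maintained count of handles using it."""

    def __init__(self, items):
        self.items = items
        self.owners = 1


class CowList:
    def __init__(self, items=()):
        self._box = _Box(list(items))

    def clone(self):
        # Cheap: another handle to the same storage, no element is copied.
        other = CowList.__new__(CowList)
        other._box = self._box
        self._box.owners += 1
        return other

    def set(self, index, value):
        # Copy only if some other handle still shares the storage.
        # (Dropped handles are not tracked here, so this sketch may copy
        # more often than strictly needed, but never too rarely.)
        if self._box.owners > 1:
            self._box.owners -= 1
            self._box = _Box(list(self._box.items))
        self._box.items[index] = value

    def to_tuple(self):
        return tuple(self._box.items)


a = CowList([1, 2, 3])
b = a.clone()        # shares storage with a
b.set(0, 99)         # b was shared, so it copies first; a is untouched
b.set(1, 98)         # b is now the sole owner, so this updates in place
```

After the first `set`, `a` and `b` no longer share storage, so later updates to either one are plain in-place writes.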

This somewhat resembles immer.js (uses a proxy around an immutable structure which records updates). Contrast this approach to Clojure transients (whose children don't magically become transient), and whatever Haskell does (https://news.ycombinator.com/item?id=24740384).

[1]: https://docs.rs/im/

[2]: https://docs.rs/rpds/

Linear types fix this problem, by letting you prove to the compiler that logically immutable operations can be implemented as in-place updates.
Immutability is an abstraction; it doesn't forbid in-place modification of data. What it forbids is other code, holding a reference to the array taken prior to the modification, observing the change, which would be a logical error.
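To make the aliasing point concrete, a small Python sketch (the function `scale_in_place` is made up for illustration). The same in-place update is harmless when nobody else holds the list, and a logical error when an earlier reference is still in use:

```python
def scale_in_place(values, factor):
    """Multiply every element by `factor`, reusing the caller's list."""
    for i in range(len(values)):
        values[i] *= factor
    return values


# Case 1: nobody else holds the list, so the update cannot be observed
# and the call is indistinguishable from one that returns a fresh list.
scaled = scale_in_place([1, 2, 3], 10)

# Case 2: `alias` was taken before the update and now sees changed data.
original = [1, 2, 3]
alias = original
scale_in_place(original, 10)
```

In the second case `alias` no longer holds the values it was created with, which is exactly the situation an immutable interface has to rule out.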
F# and OCaml have mutable arrays.