Hacker News new | ask | show | jobs
by EddieEngineers 1119 days ago
Why would we want to use purely functional data structures? When do the pros of functional data structures outweigh the additional complexity? Are there scenarios when a project would want to pivot from a regular data structure to a purely functional one?
7 comments

The most common point is that they're safe to share between threads making parallel algorithms easier to invent, understand, and implement correctly.

You can also safely re-use sub-structures without performing a deep copy. For example, if you want to keep a sub-tree around for later you can do that in O(1) time because it's safe to keep a reference to it. If it is a mutable tree you don't know what's going to happen to it so you need to do a deep copy of the entire sub-tree you're holding on to. This can save a lot on memory allocation and copying depending on your use case.

Deep Learning applications is one area.

Traditionally, OO code is written all the time.

But after I learned JAX/Flax, a light turned on inside my head and I now write functional Deep Learning code as much as I can. All my side projects and new code are purely functional in JAX/Flax. PyTorch has functional API known as functorch, and I have used it one project.

Where lots and lots of data in 3,4,5 dimensional tensors exist, and you need to run lots of transformation on them, and then you need to multiply a huge number of them thousand times in each second- functional code makes much more sense and gives immense sanity and peace of mind.

Those of you writing Deep Learning code, learn functional programming principles (immutable data, pure functions, leaving no side effect, etc.), and apply them to DL via functorch or JAX.

Your life will never be the same.

Do JAX and functorch have the same level of builtin functionalities (operations) as the original Pytorch library?

Where to learn about it other than the documentation?

JAX is very barebones and will require you to write much more code for the same task than you write in PyTorch.

Functorch is still new, and honestly, there is little to learn if you already know JAX. There are some talks from Meta, and then there is always the docs.

There's a tradeoff between the complexity of the implementation of a data structure, and the use of one. While the complexity of implementing purely functional data structures is often (except maybe in the case of singly linked lists) higher than their mutable counterparts, actually using them in a program is simpler and less error prone.

There are obviously other trade offs as well, like performance and memory usage.

they have some nice properties that you might want to benefit from. For example, they're persistent: every previous state of the data structure is still in there.. aka snapshots!
There is no additional programming complexity in using Scala's Map vs Java's HashMap. They are both maps with keys and values. The Java one you update in-place

    var m = new HashMap<K,V>()
    m.add(k, v)
the Scala one you "update" by creating new maps,

    var m = Map.empty[K, V]
    m += (key, value)
In practice it's mostly the same except you can share the immutable one around without being scared someone will mutate it, but it takes more memory per instance than the mutable version and creates more GC churn, which may or may not be an issue.
Do you use git? The git commit graph is a purely functional data structure.
As is explained in the abstract, purely functional data structures are most useful when you want to avoid assignment, either due to the restrictions of your programming environment (f.ex. you're programming in Haskell), or because that is a property you're interested in for other reasons (e.g. you want to ensure immutability due to concurrency concerns, or exploit persistence to improve performance).