by johnnycerberus 1711 days ago
Isn't Java capable of the same thing now that it has algebraic data types with records + sealed classes + pattern matching? Given that Java already has a fine concurrency story, isn't OCaml a hard sell to someone that's not into compiler development to depend on its rich ecosystem?
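
For readers who have not used these features, here is a minimal sketch of what "records + sealed classes + pattern matching" looks like, assuming Java 17 or later. The names `Shape`, `Circle`, `Rect`, and `AdtDemo` are made up for illustration and do not come from the article.

```java
// A sealed interface acts as a sum type: only the listed records may implement it.
// All names here are hypothetical and exist only for this illustration.
sealed interface Shape permits Circle, Rect {}

// Records act as product types: immutable data with generated accessors.
record Circle(double radius) implements Shape {}

record Rect(double width, double height) implements Shape {}

final class AdtDemo {
    // Pattern matching with instanceof tests the type and binds a typed variable.
    static double area(Shape shape) {
        if (shape instanceof Circle c) {
            return Math.PI * c.radius() * c.radius();
        }
        if (shape instanceof Rect r) {
            return r.width() * r.height();
        }
        // Not reachable while Shape permits only Circle and Rect.
        throw new IllegalStateException("unknown shape: " + shape);
    }

    public static void main(String[] args) {
        System.out.println(area(new Rect(2.0, 3.0)));
        System.out.println(area(new Circle(1.0)));
    }
}
```

On Java 21 and later, a `switch` over the sealed type lets the compiler check that every case is covered, which is closer to how ADTs are matched in OCaml. Note that this only covers algebraic data types, not effects.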
3 comments

Though they share the word "algebraic", algebraic data types != algebraic effects. And while Java has good support for concurrency primitives and concurrent data structures, it does suffer from the problem highlighted in the article:

> Over time, the runtime system itself tends to become a complex, monolithic piece of software, with extensive use of locks, condition variables, timers, thread pools, and other arcana.

I'm not an expert on this, but my understanding is that algebraic effects aim to improve language semantics so that it is easier to separate different levels of abstraction (e.g. separating the what from the how), while also encoding the performed effects in the type system.

Compared to Erlang, Haskell, and other FP languages, Java's concurrency story leaves a lot to be desired.

https://medium.com/traveloka-engineering/cooperative-vs-pree...

In fairness to the JVM, the work on Project Loom would bring the JVM in line with what that document describes as the Erlang/Haskell "Hybrid threading" model.

This has nothing to do with the JVM. Scala, for example, is already capable of exactly what Erlang/Haskell do. This is merely about the Java language, which lacks the support needed to make such a programming style ergonomic. You need either a language with a very powerful type system or a dynamically typed language (or specific support for it, as in Go, but even in Go you are limited to what the language designers foresaw).

Project Loom will not change that, but it will improve performance for certain scenarios.

Project Loom is both about the JVM and the language Java. Most of the work for Loom AFAICT is at the JVM level, and the benefits are that all Java code will Just Work(TM) with the new primitives underneath them at the end.

Scala’s varied async/concurrency libraries are implemented in user land, and still use threads underneath. Mechanically, you must opt in to these and have to work to interop with code that might use other primitives. Scala can handle a lot of this complexity at compile time w/ types, but it’s not perfect, and certain runtime behaviors will always be out of scope.

Loom improves this by allowing any language that runs on the JVM (Java, Scala, Clojure) to opaquely use virtual threads to run their existing synchronous, scheduler-unaware code on the new Loom concurrency primitives implemented in the JVM. That’s powerful!
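
As a rough sketch of what "existing synchronous, scheduler-unaware code" on virtual threads can look like, here is a small example. It assumes the API as it later shipped in JDK 21 (`Executors.newVirtualThreadPerTaskExecutor()`); the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; requires JDK 21 or later.
final class VirtualThreadDemo {
    // Ordinary blocking code. It knows nothing about schedulers or Loom.
    static int blockingTask(int id) throws InterruptedException {
        Thread.sleep(10); // blocks this virtual thread, not an OS thread
        return id * 2;
    }

    // Runs taskCount blocking tasks, one virtual thread per task, and sums the results.
    static int runAll(int taskCount) throws Exception {
        int sum = 0;
        // The executor is AutoCloseable; close() waits for submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                futures.add(executor.submit(() -> blockingTask(id)));
            }
            for (Future<Integer> future : futures) {
                sum += future.get();
            }
        }
        return sum;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(10_000));
    }
}
```

The point of the sketch is that `blockingTask` is unchanged from what you would write for platform threads; only the executor differs.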

Everything you say is correct, but it can create a misunderstanding, so I want to elaborate for other readers:

Concurrency is not mainly about threads or performance, it is about program behavior semantics. Loom does not do anything about that, it "merely" improves performance. Well, you could say it actually makes semantics worse (you used (TM) for good reasons).

In that sense, I believe that Scala or any language with good concurrency semantics will benefit from Loom more than Java, unfortunately. But Java can of course still catch up on a language level. Even after such a long time, there are still new and interesting libraries (e.g. looking at JOOQ).

In OpenJDK, Java threads are just thin wrappers around OS threads and OS threads are a very precious resource; a modern OS can't support more than a few thousand active threads at a time.

I'm not sure how one would get there with the JVM's memory model. You'd need something like actors and a preemptive scheduler per core at the VM level, with shared-nothing state between actors/virtual threads. Erlang utilizes message passing and immutability to do this.

Which is precisely why the Java language specification doesn't state whether they are green or red threads; they just happened to evolve into red threads across multiple implementations.

Project Loom is bringing green threads back, now officially as virtual threads.

Additionally there is java.util.concurrent and for those that care to really go deep enough, custom schedulers.

> I'm not sure how one would get there with the JVM's memory model. You'd need something like actors and a preemptive scheduler per core at the VM level, with shared-nothing state between actors/virtual threads.

Part of what Project Loom is doing is bringing lightweight user-mode threads, called "Virtual Threads", and its own scheduler.

Importantly, you don't need shared-nothing state or immutability to add preemption: the JVM already has points during code execution where it knows the full state of the program. They call these "safepoints" and they're important for the GC to work properly. With the current implementation of Loom, virtual threads are preempted when they do any blocking I/O or synchronization, but there's no reason why they couldn't be preempted at any safepoint in the future.

> In OpenJDK, Java threads are just thin wrappers around OS threads and OS threads are a very precious resource; a modern OS can't support more than a few thousand active threads at a time.

I don't know what you define as a few thousand active threads, but running the following C++ code let me run 70,000 threads before I got a resource error:

https://godbolt.org/z/74GsY1Kds

Erlang processes are not OS processes. They are implemented by the Erlang VM using a lightweight cooperative threading model (preemptive at the Erlang level, but under the control of a cooperatively scheduled runtime). This means that it is much cheaper to switch context, because they only switch at known, controlled points and therefore don't have to save the entire CPU state (normal, SSE and FPU registers, address space mapping, etc.). Erlang processes use dynamically allocated stacks, which start very small and grow as necessary. This permits the spawning of many hundreds of thousands — even millions — of Erlang processes without sucking up all available RAM. Erlang used to be single-threaded, meaning that there was no requirement to ensure thread-safety between processes. It now supports SMP, but the interaction between Erlang processes on the same scheduler/core is still very lightweight (there are separate run queues per core).
Algebraic data types are not the same as an algebraic effects system.

Java's checked exceptions can be considered an effect system, and coupled with ADTs I suppose we could call it an algebraic effects system. I mean the domino effect whereby the exception has to be handled in each calling method; that seems to me the effect/handler counterpart present in Java. In the end, an algebraic effect is just an extension of a type system that supports ADTs, or is my memory from university failing me?
The defining feature of these effect systems is "resumable continuations". Essentially, at the point where you catch an exception, you have the option of resuming the code which threw the exception, and you can tell it how to proceed.

So, whereas exceptions only jump backwards in the stack, resuming a continuation sorta lets you jump forwards again, back to where you were. It's really powerful stuff.
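
Java has no resumable exceptions, so the following is only a loose imitation of the idea, with every name invented for illustration. The first method shows ordinary exception handling, where the rest of the computation is lost. The second imitates "resume with a value" by passing the handler down as a function. A real effect system would instead capture the continuation and find the handler up the stack, with no extra parameter.

```java
import java.util.function.ToIntFunction;

// Hypothetical demo class contrasting "abort" with an imitation of "resume".
final class ResumeDemo {
    // Exception style: once control reaches the catch block, the loop is gone.
    // There is no way to go back and continue with the remaining inputs.
    static int sumAbortOnError(String[] inputs) {
        int sum = 0;
        try {
            for (String s : inputs) {
                sum += Integer.parseInt(s);
            }
        } catch (NumberFormatException e) {
            return -1; // the partial sum and the rest of the loop are abandoned
        }
        return sum;
    }

    // Imitated resumption: on a bad input, ask the handler for a replacement
    // value and carry on from the point of failure.
    static int sumResumeOnError(String[] inputs, ToIntFunction<String> onBadInput) {
        int sum = 0;
        for (String s : inputs) {
            int value;
            try {
                value = Integer.parseInt(s);
            } catch (NumberFormatException e) {
                value = onBadInput.applyAsInt(s); // the handler says how to proceed
            }
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        String[] inputs = {"1", "x", "3"};
        System.out.println(sumAbortOnError(inputs));
        System.out.println(sumResumeOnError(inputs, bad -> 0));
    }
}
```

With the inputs above, the first call gives up at `"x"`, while the second substitutes 0 and still adds the trailing 3.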

You cannot resume a computation with exceptions. At most, exceptions are a subset of effects. Similarly, a lot of the issues with checked exceptions in Java come from the lack of exception polymorphism in the checked exception type system. Adding the two points, you cannot really call checked exceptions an effect system.
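
To make the "exception polymorphism" point concrete, here is a small sketch. The standard `java.util.function.Function` declares no checked exceptions, so a method that throws `IOException` cannot be passed to `Stream.map` directly. The workaround below uses a hypothetical interface, `ThrowingFunction`, that is generic in the exception type; it is not part of the JDK.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; ThrowingFunction is not a JDK type.
final class ThrowingDemo {
    // A function type that is generic in the checked exception it may throw.
    @FunctionalInterface
    interface ThrowingFunction<T, R, E extends Exception> {
        R apply(T value) throws E;
    }

    // map is polymorphic in E: it may throw exactly what the given function may throw.
    static <T, R, E extends Exception> List<R> map(
            List<T> items, ThrowingFunction<T, R, E> fn) throws E {
        List<R> out = new ArrayList<>();
        for (T item : items) {
            out.add(fn.apply(item));
        }
        return out;
    }

    // An ordinary method with a checked exception in its signature.
    static int lengthOrFail(String s) throws IOException {
        if (s.isEmpty()) {
            throw new IOException("empty input");
        }
        return s.length();
    }

    public static void main(String[] args) {
        try {
            // E is inferred as IOException, so the caller must handle it.
            System.out.println(map(List.of("a", "bcd"), ThrowingDemo::lengthOrFail));
        } catch (IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

This works for a single exception type, but every higher-order API has to opt in with its own interface, which is part of why checked exceptions compose poorly with lambdas and streams.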

Not to mention that using exceptions for flow control is frowned upon in Java, so while there are similarities, they serve different purposes.