by akhilcacharya 3357 days ago
>After using it for a couple of weeks I am confused why ML/OCaml aren't more popular

For me, the issue is the GIL, although that is being worked on as we speak.

5 comments

Sadly, I'm starting to lose hope for multicore. It's been in the works for as long as I've used the language, with much speculation about it being "almost done" or "in the next release" that has failed to materialize. That said, I don't have the time to really follow the compiler's development, so take that with a grain of salt.
This is precisely the reason I've lost hope in OCaml as a main language. Multicore, specifically, is one of those things that has been "close" for so long that it's looking like a Duke Nukem Forever feature. I keep track of OCaml regularly because I love the language and I find that it's probably the best combination of features in one place, both in terms of my enjoyment as a programmer and in terms of the performance of the language.

I've come to realize later, though, that there's one glaring problem: runtime inspection really is much more valuable to me than clean syntax and programmer comfort when I'm building a system. With that in mind, maybe it'll never be very interesting to build a bigger system in a language like OCaml, even though it offers you a lot in terms of programmer efficiency as well as high performance.

Maybe it's better to glue together OCaml programs on the BEAM (the Erlang VM) so that you're able to orchestrate them, introspect them, and get proper handling and oversight of the different components of your system. Maybe all that makes multicore in OCaml mostly pointless for you in practise.

Of course, not everyone will use the BEAM and OCaml together, so for them it matters a lot. I've come to see it as a worrying sign of a community that doesn't care to evolve enough, more than anything, and that's why OCaml is not as interesting as it could be.

It should be said that languages that supposedly are made for building systems also lack this runtime introspection and oversight. They have no way to locate and refer to their threads and no way to automatically handle their successes and failures. These languages are wildly popular, and so none of the above apparently matters to the vast majority of people. I think it's an interesting argument around (or against?) multicore in OCaml.

Good news, there is indeed Alpaca: ML for the BEAM.

https://github.com/alpaca-lang/alpaca

I always found that strange, since Python has the same GIL and it hasn't stopped its massive adoption. The same goes for MRI Ruby (though one could argue JRuby is more popular, but I don't see any massively multicore applications in Ruby either), so that reason does not convince me.
Now that Jane Street and Facebook are investing so much in the language, I think this feature is more likely than ever to be implemented.
What does the lack of multicore keep you from doing?
In web-apps architectures, you have a choice: multithread or multiprocess. Multithreading is far more memory friendly, and occasionally a bit faster, but many interpreted language runtimes aren't prepared for multithreading. Python and OCaml have a big global lock on certain things (interpreter, garbage collection) that are needed frequently enough that even requests that aren't really interfering at the DB transaction level will block one another under multithreading. So instead you have to take the hit of multiprocessing, where each web worker has its own memory space and set of locks.

So if you're going to justify writing a webapp in OCaml, it would be helpful if the language was more efficient than whatever your alternative is.

> In web-apps architectures, you have a choice: multithread or multiprocess.

No, the choice is definitely not that limited. Saying something like this is like saying there's no Nginx, only Apache available as an HTTP server. For concurrent, IO-bound code you can leverage all kinds of concurrency approaches (coroutines, green threads, callbacks). In these use cases, multiple threads are actually a bad idea from a memory-efficiency perspective.
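
As a rough illustration of the coroutine option, here is a minimal Python sketch (Python because its GIL came up upthread). The names `handle_request`, `serve_all` and `run_demo` are made up for the example, and `asyncio.sleep` is only a stand-in for a real non-blocking database or network call.

```python
import asyncio


async def handle_request(request_id: int, delay: float) -> str:
    # asyncio.sleep stands in for a non-blocking DB or network call (assumption).
    await asyncio.sleep(delay)
    return f"request {request_id} done"


async def serve_all(n: int, delay: float = 0.05) -> list:
    # All n handlers are in flight at once on a single thread.
    # gather returns results in the order the coroutines were passed in.
    return await asyncio.gather(*(handle_request(i, delay) for i in range(n)))


def run_demo(n: int = 100) -> list:
    # Run the event loop until every handler has finished.
    return asyncio.run(serve_all(n))


if __name__ == "__main__":
    results = run_demo()
    print(len(results), results[0])
```

The handlers wait concurrently without one OS thread per request, which is where the memory saving comes from. This only helps while the work is waiting on IO; it does nothing for CPU-bound code.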

On the other hand, multiple threads would be viable for CPU-bound and/or long-running code were it not for the GIL, that's true. With the GIL you don't have that option and have to resort to multiprocessing.
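
A small sketch of what that looks like in practice, assuming a standard CPython build with the GIL enabled. `count_down`, `run_sequential`, `run_threaded` and `timed` are hypothetical helpers for the example, not anything from a library.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_down(n: int) -> int:
    # Pure-Python loop: CPU-bound, so the running thread needs the GIL throughout.
    while n > 0:
        n -= 1
    return n


def run_sequential(jobs):
    # Baseline: one job after another on the main thread.
    return [count_down(n) for n in jobs]


def run_threaded(jobs):
    # One thread per job; results come back in input order.
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        return list(pool.map(count_down, jobs))


def timed(fn):
    # Return (result, elapsed seconds) for a zero-argument callable.
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [2_000_000] * 4
    _, seq = timed(lambda: run_sequential(jobs))
    _, thr = timed(lambda: run_threaded(jobs))
    print(f"sequential: {seq:.2f}s  threaded: {thr:.2f}s")
```

On a GIL build I would expect the two timings to land in the same ballpark, since only one thread executes Python bytecode at a time. Exact numbers depend on the machine and interpreter version, so treat it as something to try rather than a benchmark.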

Multiprocessing is not so bad, actually, although it does make the code more complicated. Unless your problem is massively parallel (but then you'd use a GPU instead), spawning 2n processes for n cores is definitely possible with how much RAM is available on servers nowadays. You get optimal parallelism at the cost of serialization overhead (or other complexities if you want to directly share memory).
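
For the record, the process-pool version is not much code either. A minimal sketch, again in Python: `cpu_bound`, `default_workers` and `run_parallel` are names invented for the example, and the 2n worker count is just the rule of thumb from above, not a tuned value.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def cpu_bound(n: int) -> int:
    # Stand-in for real CPU-heavy work (assumption): sum of squares below n.
    return sum(i * i for i in range(n))


def default_workers() -> int:
    # 2 processes per core; os.cpu_count() can return None, so fall back to 1.
    return 2 * (os.cpu_count() or 1)


def run_parallel(inputs, workers=None):
    # Each worker is a separate process with its own interpreter and memory,
    # so arguments and results are pickled across the process boundary.
    with ProcessPoolExecutor(max_workers=workers or default_workers()) as pool:
        return list(pool.map(cpu_bound, inputs))


if __name__ == "__main__":
    # The __main__ guard matters on platforms that start workers by re-importing this module.
    print(run_parallel([10_000, 20_000, 30_000]))
```

The pickling of inputs and outputs is the serialization overhead mentioned above: cheap for a few integers, noticeable once you start shipping large data structures between workers.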

There are some languages which do support most existing concurrency mechanisms and they may be a better fit for a particular project. However, not supporting parallelism via multi-threading hardly disqualifies any language, provided the other tools are in place, solid and widely used.

CML/MLton or Manticore will deal with most of those issues.