Hacker News
by freedomben 1416 days ago
I agree, and I think their move into ML is a stroke of genius that is going to make Elixir mainstream. I'm super excited for the future.

I've wondered whether it's easier to add the data analysis tooling that Python has to Elixir, or to add to Python the features that Erlang (and by extension Elixir) provides out of the box.

From what I can see, if you want easier multiprocessing in Python (say, running tasks asynchronously), you have to use something like Ray Core[0], and then if you want multiple machines you need Redis(?). Elixir/Erlang supports this out of the box.
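For the single-machine half of that comparison, here is a minimal sketch using only Python's standard-library `concurrent.futures`, as a baseline to set next to Ray. The names `parallel_map` and `square` are made up for illustration, and nothing here addresses the multi-machine case that Elixir/Erlang handles out of the box.

```python
# ThreadPoolExecutor is imported only so callers can pass it as executor_cls.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def square(n):
    """Stand-in for real CPU-bound work (hypothetical example function)."""
    return n * n


def parallel_map(fn, items, executor_cls=ProcessPoolExecutor, max_workers=2):
    """Apply fn to every item on a pool of workers, preserving input order."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))
```

Calling `parallel_map(square, range(5))` from a script's `if __name__ == "__main__":` block spreads the calls over two worker processes; passing `executor_cls=ThreadPoolExecutor` runs the same code on threads instead. The function handed to a process pool has to be defined at module level so it can be pickled.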

Explorer[1] is an interesting approach: it uses Rust via Rustler (an Elixir library for calling Rust code) with Polars as its dataframe library. I think Rustler needs to be reworked for this use case, as it can be slow to return data. I made initial improvements which drastically improve encoding (https://github.com/elixir-nx/explorer/pull/282 and https://github.com/elixir-nx/explorer/pull/286, tldr 20+ seconds down to 3).

[0] https://github.com/ray-project/ray
[1] https://github.com/elixir-nx/explorer

Definitely easier to add Pandas to Elixir than preemptively scheduled green threads to Python.
Plus sane failure domains: Go has had preemptively scheduled green threads from day one, but failure domains are really not a thing in Go.
Hm, I think I've read somewhere that both Go and Elixir are kinda cooperative. A process in Elixir can yield control after a certain number of reductions (function calls, I think), and in Go a goroutine can only yield control on function calls, so if you have an infinite loop just adding numbers it will run uninterrupted. Both of them are "less cooperative" than Python with its explicit yield statement. Do I get this right? I started digging into concurrency not that long ago.
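To make the "explicit yield" end of that spectrum concrete, here is a minimal sketch of cooperative scheduling with Python generators. The scheduler and task names are invented for illustration; the point is only that a task which never reaches a `yield` cannot be interrupted.

```python
from collections import deque


def counter(name, steps, log):
    """A polite task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield  # explicit yield point: the only place a switch can happen


def greedy(name, steps, log):
    """A rude task: does all of its work before yielding once."""
    for i in range(steps):
        log.append((name, i))
    yield


def run_round_robin(tasks):
    """Resume each task in turn until every task has finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; drop it from the queue
        ready.append(task)
```

Two `counter` tasks interleave step by step, while a `greedy` task placed first runs all of its steps before anything else gets a turn.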
Yes, technically they are not fully preemptive in the sense that an OS thread is (the OS sends an interrupt which halts processing at the CPU level), but in both Go and Elixir the programmer has no control over when context switching happens, and the "function calls" that serve as yield boundaries happen all over the place, so it's "effectively preemptive".

Elixir is in practice more preemptive than Go (last I checked with Go) because you cannot loop forever and hog the CPU in Elixir -- a loop requires a tail call, so that's a yield boundary, and I'm certain that in earlier Go that wasn't the case if you wrote `for {}`

Yes, unless you're in a BIF or NIF, every call is a reduction (https://blog.stenmans.org/theBeamBook/#_reductions). So if you call out to native code or use a VM builtin you can lock it up indefinitely, but other than that it's not possible.
[I have no experience with Elixir.] Would it be possible to have Elixir intelligently manage multiprocessing of Python scripts? I would be especially interested in being able to have the scripts talk to each other, but have no idea how it could work, besides perhaps having them communicate by writing to / reading from the same files, which seems risky.
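As a toy model of the reduction idea discussed above (not how the BEAM is actually implemented), the sketch below treats every `yield` as a call boundary and lets the scheduler switch tasks once a budget of those boundaries is spent. The budget value and all names are assumptions chosen for illustration.

```python
from collections import deque


def looping_task(name, log):
    """An endless loop; each iteration stands in for one function call."""
    n = 0
    while True:
        log.append((name, n))
        n += 1
        yield  # stands in for a call boundary, where a reduction is counted


def run_with_budget(tasks, budget, total_steps):
    """Round-robin over tasks, switching after `budget` reductions each."""
    ready = deque(tasks)
    steps = 0
    while ready and steps < total_steps:
        task = ready.popleft()
        finished = False
        used = 0
        while used < budget and steps < total_steps:
            try:
                next(task)
            except StopIteration:
                finished = True
                break
            used += 1
            steps += 1
        if not finished:
            ready.append(task)  # budget spent: go to the back of the queue
```

Because the task cannot loop without crossing a counted boundary, even an endless loop gets switched out; the analogue of a long-running NIF would be a task that does a large amount of work between two `yield`s.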
There is actually a case study on elixir-lang.org of exactly this!

https://elixir-lang.org/blog/2021/01/13/orchestrating-comput...

Very informative - thanks!
I think you can have the scripts talk to each other through Elixir via the BEAM/OTP approach: for each script you will have a GenServer module that manages it in a separate process. Call this module MyApp.ScriptServer and put "use GenServer" at the top of the file. This GenServer process will have a client API, which you can write, which will be how you interact with your script. In the BEAM programming model, processes communicate by sending messages to each other's mailbox, so you set up some logic for this and maybe have another module called MyApp.ScriptCoordinationServer that manages some global state.

This is just my amateur guess -- I'd be happy to hear an Elixir/BEAM/OTP expert chime in.
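The shape of that idea, one owner per script with a small client API and a coordinator passing messages between them, can be sketched without Elixir. The Python below is only an analogy written under assumptions: `ScriptServer`, `relay`, the inline worker source, and the line-delimited JSON protocol are all hypothetical, and none of it is the GenServer API.

```python
import json
import subprocess
import sys

# Hypothetical worker script: reads one JSON message per line on stdin
# and answers with one JSON message per line on stdout.
WORKER_SOURCE = r"""
import json, sys
name = sys.argv[1]
for line in sys.stdin:
    msg = json.loads(line)
    reply = {"from": name, "seen": msg["payload"], "payload": msg["payload"] + 1}
    print(json.dumps(reply), flush=True)
"""


class ScriptServer:
    """Owns one worker process and exposes a small client API for it."""

    def __init__(self, name):
        self.name = name
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, name],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def call(self, message):
        """Send one message and block until the worker replies."""
        self.proc.stdin.write(json.dumps(message) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

    def stop(self):
        """Close stdin so the worker's read loop ends, then reap it."""
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


def relay(payload):
    """Coordinator: pass a value to script a, then a's reply on to script b."""
    a = ScriptServer("a")
    b = ScriptServer("b")
    try:
        first = a.call({"payload": payload})
        second = b.call({"payload": first["payload"]})
        return [first, second]
    finally:
        a.stop()
        b.stop()
```

The scripts never touch shared files; every exchange goes through the coordinator, which is the role the comment above gives to MyApp.ScriptCoordinationServer.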

Thanks for the explanation!
This is exactly the solution I was thinking of as well.