by weff_ 2449 days ago
I'm trying to understand: what kind of scripting requires first-class concurrency that isn't fulfilled by, say, Python?
6 comments

For concurrency I think Raku has slightly more developed async support than Python, but the bigger advantage, I think, is in parallelism where, aside from the CPython GIL limiting practical parallelism in the main implementation (which is a big deal), Python as a language lacks the parallel iterables produced by the hyper (parallel but constrained to return in order) and race (parallel and not constrained in order) methods on the Iterable role in Raku, or anything like Raku’s hyperoperators that are semantically concurrent and parallelizable at the discretion of the optimizer. (Come to think of it, while also parallel, all those are also high-level concurrency constructs that Python lacks equivalents to.)

Python as a language can support parallelism via threads, and CPython as an implementation can via multiprocessing, but those are both very low level abstractions; Raku has higher level abstractions which allow much more succinct expression of parallel operations.

Sounds like multiprocessing.Pool.imap and imap_unordered? When dealing with io you also have async/await equivalent iterators like asyncio.as_completed().
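A minimal sketch of that comparison, assuming `imap` plays the role of Raku's `hyper` (results in input order) and `imap_unordered` the role of `race` (results as they finish). It uses `multiprocessing.pool.ThreadPool`, which exposes the same methods as `multiprocessing.Pool`, only so the example runs anywhere without process-spawning caveats; `slow_square` and `ordered_and_unordered` are made-up names for illustration.

```python
from multiprocessing.pool import ThreadPool
import time


def slow_square(n):
    # Smaller inputs sleep longer, so completion order differs from input order.
    time.sleep(0.01 * (5 - n))
    return n * n


def ordered_and_unordered(values):
    # ThreadPool has the same imap / imap_unordered API as multiprocessing.Pool.
    with ThreadPool(4) as pool:
        in_order = list(pool.imap(slow_square, values))             # like hyper
        any_order = list(pool.imap_unordered(slow_square, values))  # like race
    return in_order, any_order
```

With threads this only overlaps waiting, not CPU work; for CPU-bound functions you would swap in `multiprocessing.Pool`, which needs the worker function to be importable by the child processes.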
Raku has `await` but it doesn't need `async`.

    sub foo ( Int \n where 1..100 ) {
      #  ___ start is a prefix operator that creates a Promise
      # V
      start { sleep n.rand; (1..n).pick }
    }

    my $result = foo(10);

    say $result.status; # Planned

    say await $result; # 6 (after a delay)



    # this is nearly identical to foo above
    sub bar ( Int \n where 1..100 ) {
      Promise.in( n.rand ).then({ (1..n).pick })
    }
(Note that `start` does not need a block unless you are grouping more than one statement as I have above.)

There are combinators like `Promise.allof` and `Promise.anyof`.

You usually don't need to use `Promise.allof`, because you can use `await` on a list.

    my @promises = Promise.in(1.rand) xx 10;

    my @results = await @promises;
(Note that the left side of `xx` is _thunky_ so there are ten different Promises with ten different random wait times.)
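
For Python readers, a rough analogue of awaiting a list of promises is `asyncio.gather`. This is only a sketch for comparison, not a claim that the two behave identically; `delayed_pick` and `wait_for_all` are hypothetical names, and unlike the Raku version every function involved has to be declared `async`.

```python
import asyncio
import random


async def delayed_pick(n):
    # Wait a random short time, then pick a number from 1..n.
    await asyncio.sleep(random.random() * 0.05)
    return random.randint(1, n)


async def wait_for_all(count, n):
    # Each call creates a separate coroutine with its own random delay.
    pending = [delayed_pick(n) for _ in range(count)]
    # gather returns the results in the same order as the awaitables.
    return await asyncio.gather(*pending)


def run(count=10, n=10):
    return asyncio.run(wait_for_all(count, n))
```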

---

You should see the `react`/`supply` and `whenever` feature.

I am not that familiar with Python, but the GIL has long prevented any real in-process parallelism. Perl has concurrency but it's complex, heavy, and poorly supported. Raku's approach to this is built to avoid all these problems (like Elixir).
I agree the GIL is a problem but it's only an issue for CPU-bound problems. Is there really an important amount of CPU-bound work that is written in a scripting language? If it's CPU-bound, wouldn't you want to use something lower level?
If it's entirely CPU bound, you can use multiprocessing to negate most of the GIL issues, and transparently send inputs/outputs between the parent and child processes. If it's I/O bound, then AsyncIO is a great way to express asynchronous workflows. If it's a combination of both I/O and CPU bound workloads, there are ways to mix multiprocessing and AsyncIO to better saturate your cores without losing the simplicity or readability of Python: https://github.com/jreese/aiomultiprocess
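To make the mixed case concrete, here is a sketch using only the standard library (it does not use or imitate the aiomultiprocess API). The I/O step is an `asyncio` coroutine and the CPU step is handed to an executor via `loop.run_in_executor`. All function names are made up, and the "I/O" is just a simulated sleep.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def cpu_heavy(n):
    # Stand-in for CPU-bound work.
    return sum(i * i for i in range(n))


async def fetch_size(i):
    # Stand-in for an I/O-bound step such as a network request.
    await asyncio.sleep(0.01)
    return 1000 * (i + 1)


async def fetch_then_crunch(executor, i):
    size = await fetch_size(i)
    loop = asyncio.get_running_loop()
    # Hand the CPU step to the executor so the event loop stays responsive.
    return await loop.run_in_executor(executor, cpu_heavy, size)


async def run_pipeline(executor, count):
    jobs = (fetch_then_crunch(executor, i) for i in range(count))
    return await asyncio.gather(*jobs)


def main(count=3):
    # A thread pool keeps the sketch portable; see the note below.
    with ThreadPoolExecutor() as executor:
        return asyncio.run(run_pipeline(executor, count))
```

To actually use several cores for the CPU step you would pass a `concurrent.futures.ProcessPoolExecutor` instead, in which case `cpu_heavy` must live at module top level and the entry point should sit behind an `if __name__ == "__main__":` guard.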
Very true, I had not even thought about the multiprocessing package; it's sometimes not as convenient as multithreading but it'll get those other cores working.
Indeed and as a Perl developer I make use of XS/external libraries, cooperative multitasking (event loops/promises), and forking to cover these use cases. It doesn't preclude wanting the additional option to take advantage of threads in a useful way, since they do exist.
It's not just CPU-bound problems; handling multiple overlapping I/O operations is more trouble than it ought to be.
Can you expand a bit on that? I'm not familiar with the issue you're describing.
Why should first-class concurrency needs be required to script in Elixir? This question seems to imply that Python is somehow a default language and special requirements must be needed to justify writing in something else. Elixir is general-use and pleasant to write scripts in so seems reasonable to me for someone to do so if that's their thing.
Ovid2 said Perl6 has a good concurrency model

lliamander replied that Elixir does too and it's worth a look

To that, 7thaccount replied that Perl6 and Elixir fill different niches.

So far, it seems Perl6 fills a niche that requires scripting and first-class concurrency. My question then is: what is this niche that requires very solid concurrency but also scripting. In other words, what does Perl6 have in terms of concurrency that Python does not (given they are both scripting languages)?

A fair question.

I don't know of many instances where scripting and concurrency would be needed in the same application. But if you wanted to use a single language both for scripting tasks and applications that require concurrency, then Raku or Elixir would work.

One instance I can actually think of, that would be specific to Erlang/Elixir, is if you have a long running service that you sometimes run batch jobs against (like importing data from a CSV). An Elixir script can hook directly into an Elixir/Erlang service and interact with it using message passing. It's a heck of a lot simpler than RMI, and avoids the need to expose new HTTP endpoints specifically for such tasks.

Is that like the relationship between C# and PowerShell?
I think so, at least in some ways. I've shipped a project using PowerShell to script Windows Server/Hyper-V, and it was a pretty pleasant experience. Having a scripting language that not only does typical scripting stuff (wrangling text, etc.) and understands your application's objects is excellent.

Some differences:

* You can actually write your whole application in Elixir, whereas I could not see doing that with PowerShell

* In Erlang/Elixir, instead of objects you have processes. Think of your application as a microservice architecture on steroids, using the Erlang RPC protocol as a means for inter-process communication.

Because each process just sends and receives messages, your script can just send messages to any of the processes that make up your application, as if they were your own service. All you have to do is connect to the remote application using the Erlang Distribution Protocol (to connect different runtimes over the network).

I heard so much about the actor model, I should really try it in its intended glory one day.
Jonathan Worthington (Rakudo Compiler Architect). Keynote: Perl 6 Concurrency

https://www.youtube.com/watch?v=hGyzsviI48M

Does Python 3 have any operators that transform ordinary method calls into concurrent method calls? Perl 6/Raku does.
More to the point, does Python 3 have operators that transform sequential operations into operations that are both concurrent and parallelizable and, in the case of iteration, provide control of parallelism parameters and of whether results are constrained to be in order?

To which the answer is “not only does Python not have them, but with the GIL CPython couldn't do much with them even if the language had them.”

Is that a bit like Go's `go` keyword?

edit: that is, as far as I can tell, after a quick Google, it's not too different from Python's Thread object.

The `start` statement prefix operator is a bit like Go's `go` keyword.

(In that when I translate Go into Raku, I usually replace `go` with `start`.)

    my Channel \input .= new;
    my Channel \output .= new;

    start for input -> \message { output.send(message) }

    start for output -> \message { say message }

    input.send(5);
But what he was really talking about is something more like the following.

    sub foo () {
      sleep 10;
      42
    }

    #                     ___
    #                    V   V
    my $delayed-result = start foo();

    say 'Hello World'; # Hello World

    say $delayed-result.status; # Planned

    say await $delayed-result; # 42
It is; Python `Thread` objects are system threads.
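For comparison on the Python side, the closest standard-library analogue to `start foo()` handing back a Promise is probably `ThreadPoolExecutor.submit`, which hands back a `Future`. A sketch under that assumption, with the sleep shortened and `demo` as a made-up wrapper name:

```python
from concurrent.futures import ThreadPoolExecutor
import time


def foo():
    time.sleep(0.1)
    return 42


def demo():
    with ThreadPoolExecutor(max_workers=1) as executor:
        # submit returns immediately with a Future, much like start returns a Promise.
        delayed_result = executor.submit(foo)
        # done() is the rough counterpart of checking the Promise's status.
        finished_early = delayed_result.done()
        # result() blocks until foo has returned, like await.
        value = delayed_result.result()
    return finished_early, value
```

The difference the Raku posters are pointing at is that here the executor is explicit, whereas `start` needs no setup.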
Which is a similar response to the comments asking "why Perl over Python?" I ask, why not (both)?
Especially since you can use the Inline::Python module.
I'm embarrassed to say I write scripts in node :P I used to write scripts in python but I've been writing so much JS that it's just easier for me in node. Plus, node defaults to locally installed deps so I don't have to deal with virtual environments.
Why do you think this is embarrassing? JavaScript is great for small scripts. For larger codebases, then sure.
Well just from experience untangling async calls in python is a nightmare and sometimes hard to reason about. The red/blue function problem is real. Meanwhile dispatching concurrent long-running scripting tasks is basically trivial in elixir (Enum.map over Task.async, then Enum.map over Task.await)
I'm guessing because Python's concurrency relies on the Global Interpreter Lock. Although I think concurrent.futures might address that. Haven't worked with python concurrency libraries in a bit.
As I posted above:

I agree the GIL is a problem but it's only an issue for CPU-bound problems. Is there really an important amount of CPU-bound work that is written in a scripting language? If it's CPU-bound, wouldn't you want to use something lower level?

There is machine learning, which usually calls into NumPy or other C extensions, with a lot of the data preparation done in Python.
tl;dr: Using a scripting language that allows for native threads or has a strong concurrency model built into the core would be beneficial for any CPU-bound scripting task...

-----

Python's concurrency model is good for waiting on network or disk I/O, because the GIL (Global Interpreter Lock) is released while a thread blocks on I/O: https://realpython.com/python-gil/#the-impact-on-multi-threa...

If your program is CPU bound the GIL will slow you down. I'm sure since the python ecosystem is quite vast there are ways around the GIL... but then you have to worry about every library you use not supporting "real" multi-threading, or (b)locking for no reason and shitting all over your efforts.

As I've posted above, I'm a bit confused by CPU-bound work being processed in a scripting language. If you're planning on doing intense CPU-bound work, maybe use a lower-level language? I'm not saying abandon Python: you can extend Python with C or just use IPC to transfer data between a Python front-end and a computation back-end.
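A minimal sketch of the IPC option, assuming a JSON-over-stdin/stdout exchange. The "back-end" here is just another Python process standing in for a lower-level program, and `WORKER_SOURCE` and `compute_in_backend` are hypothetical names.

```python
import json
import subprocess
import sys

# Stand-in for the computation back-end; in practice this would be a
# separate program, possibly written in a lower-level language.
WORKER_SOURCE = """
import json, sys
numbers = json.load(sys.stdin)
json.dump({"sum_of_squares": sum(x * x for x in numbers)}, sys.stdout)
"""


def compute_in_backend(numbers):
    # Send the input as JSON on stdin and read the JSON reply from stdout.
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps(numbers),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)["sum_of_squares"]
```

The front-end only depends on the message format, so the worker can be replaced with a compiled binary without touching the Python side.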
I have a different perspective.

When I have a bit of Raku code that is too slow I complain (send a bug report) and then someone else figures out why the JIT isn't doing a better job and fixes it.

Then bingo-bango it ends up even faster than using NativeCall to call into the C code.

Of course there may be a delay before someone gets around to figuring it out; so in the meantime, NativeCall is actually very easy to use.

---

I would like to note that someone wrote Perl6/Raku code and the equivalent C/C++ code. They reported that the Perl6/Raku code was faster. (It was before Raku was brought up as an alias/name.)

I'm fairly sure that the reason the C/C++ code was slower is that it copied strings around rather than using pointers and indexes into other strings like MoarVM does.

At one time it was an obvious dichotomy that you would not use a scripting language for CPU bound work, but these days it is a much more blurry line. Partly because modern efficient languages are becoming ergonomic enough to work well as scripting languages while still giving you very good performance.

I actually love doing CPU bound work in Groovy which is usually described as a scripting language. But it gets converted straight to java byte code which is JIT'd and ends up as machine code. It only takes a few static typed hints here and there and it runs in the same league as C++ implementations. And it gets Java's excellent concurrency support for free.

I totally feel you. I guess I thought your question was substantially more surface level than it was. My apologies.

I'm personally with you. I also don't tend to think object boxing is really the performance bottleneck for most applications, and if/when it is, likely the other requirements should've already ruled out using one (a scripting language).

It's like writing NIFs for Elixir: yeah, sure, you _can_, and they have their purpose, but you could also just write another application to do that one thing and, like you said, use IPC.

So in summary, we agree with each other, here's to:

the right tool for the job!

> the right tool for the job!

Hear, hear!