by armitron 2863 days ago
They did not, which is why this "course" illustrates taking advantage of multiple cores via multiprocessing without mentioning the GIL at all. Which is a little misleading if you think about it.

Also, by having the introductory chapter be about "functional programming" (which incidentally Python does not do well), he completely bypasses the serious issue of shared state.

Which goes to show that parallelism in Python is more of a gimmick than a real-world solution, since it doesn't let you run threads in parallel over shared memory within a single process, which is so important for many applications. In my case, the vast majority of the time I do not want to farm work out to separate operating system processes and deal with serialization and communication, but this is the only way for Python code to take advantage of multiple cores [1].

[1] Another way is to write a module in C and have Python code call into it on a new thread and release the GIL while doing so, but of course this is even worse pain-wise than doing it with multiprocessing and you end up writing/compiling C.
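
To make the trade-off above concrete, here is a minimal sketch of the multiprocessing route using the standard library's `concurrent.futures.ProcessPoolExecutor`. The function names (`count_primes`, `split_range`, `parallel_count_primes`) and the prime-counting workload are made up for illustration and are not taken from the course. Arguments and results cross the process boundary by pickling, which is the serialization cost being complained about.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(start: int, stop: int) -> int:
    """CPU-bound work: count primes in [start, stop) by trial division."""
    count = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(stop: int, parts: int) -> list[tuple[int, int]]:
    """Split [0, stop) into contiguous (start, stop) chunks (stop must be > 0)."""
    step = -(-stop // parts)  # ceiling division
    return [(lo, min(lo + step, stop)) for lo in range(0, stop, step)]


def parallel_count_primes(stop: int, workers: int = 4) -> int:
    """Hypothetical helper: fan the chunks out to worker processes."""
    starts, stops = zip(*split_range(stop, workers))
    # Each argument is pickled and sent to a worker process; each integer
    # result is pickled and sent back. No memory is shared between workers.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, starts, stops))


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    print(parallel_count_primes(200_000))
```

For small integers the pickling overhead is negligible; it starts to hurt when the workers need large or deeply nested objects, which is the case the comment is describing.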


> deal with serialization and communication

I thought a lot about this problem, for over 2 years, and came up with zproc

https://github.com/pycampers/zproc

Basically,

> It lets you do message passing parallelism without the effort of tedious wiring.

You'll be doing message passing without ever dealing with sockets!

Also, shared memory parallelism is hard to get right regardless of which language you use. I would strongly recommend against it, unless you're writing some really really really niche thing where message passing is a bottleneck (it isn't most of the time).

In my experience, the mantra that shared memory parallelism is hard to get right, to the point where platitudes such as "unless you're writing some really really really niche thing" get uttered, is entirely erroneous.

There are idiot-proof thread-safe data structures and producer/consumer APIs that map extremely well to most problems that come up in practice in the domain, and that one should use confidently. Refusing to do shared memory parallelism because of the _abstract potential for havoc_, rather than any practical justification based on the problem at hand, is throwing out the baby with the bathwater and is not the mark of competent engineering.
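
As one illustration of the kind of producer/consumer API meant here, a minimal sketch using Python's standard `queue.Queue`, which does its own locking internally. The names `producer`, `consumer`, `run_pipeline` and the squaring workload are hypothetical; this is not code from the thread or the linked talk.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


def producer(q: queue.Queue, items) -> None:
    for item in items:
        q.put(item)        # blocks while the queue is full
    q.put(SENTINEL)        # tell the consumer there is nothing more


def consumer(q: queue.Queue, results: list) -> None:
    while True:
        item = q.get()     # blocks until an item is available
        if item is SENTINEL:
            break
        results.append(item * item)


def run_pipeline(items) -> list:
    """Run one producer and one consumer thread over a bounded queue."""
    q = queue.Queue(maxsize=8)
    results = []
    threads = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Neither thread function touches a lock directly; the only shared object is the queue, and `results` is written by a single thread.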

This talk (hopefully) gets my point across

https://www.youtube.com/watch?v=9zinZmE3Ogk

You must be some sort of programming GOD, I guess.

The problem is that it's _hard_ to get right.

For example, it's not trivial to use locks when you're working at an abstraction level higher than the operating system. Most people don't even realise there is a race in their application, because locks are inherently non-enforcing. Code written with locks is also really hard to read and reason about.
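
A minimal sketch of what "non-enforcing" means in practice, using `threading.Lock`. The `Counter` class and `run_threads` helper are hypothetical names for illustration. The lock protects `value` only if every caller remembers to take it; nothing in the language stops `unsafe_increment` from existing.

```python
import threading


class Counter:
    def __init__(self) -> None:
        self.value = 0
        self.lock = threading.Lock()

    def safe_increment(self) -> None:
        with self.lock:          # cooperates with other lock holders
            self.value += 1

    def unsafe_increment(self) -> None:
        # Compiles and runs fine, but bypasses the lock entirely.
        # Whether updates are actually lost depends on the interpreter
        # and on timing, which is what makes such races hard to notice.
        self.value += 1


def run_threads(increment, n_threads: int = 4, n_iter: int = 10_000) -> None:
    """Call `increment` n_iter times from each of n_threads threads."""
    def work() -> None:
        for _ in range(n_iter):
            increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With `safe_increment` the final count is always `n_threads * n_iter`; with `unsafe_increment` there is no such guarantee, and a test run that happens to pass proves nothing.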

Message passing just makes it a little easier to avoid the pitfalls associated with parallel programming.

I also found that it lets you avoid busy waiting in certain places, which is always a performance advantage :)

Can you shed some light on those "idiot-proof thread-safe data structures"?

I do concurrency in Java all the time with CompletableFuture and threadsafe data structures provided by various libraries, e.g. the Guava caches, and I rarely need to use locks or semaphores. It's a good set of abstractions that make concurrency pretty close to idiot-proof.

Futures in particular make it easy to write concurrent code close to the way you would write single-threaded code, because all of the threading is handled behind the scenes.
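
The comment above is about Java's `CompletableFuture`; as a rough analogue, here is a minimal sketch of the same idea with Python's `concurrent.futures`, which is a different API with the same shape. `fetch_length` is a hypothetical stand-in for some blocking call, and `total_length` is an illustrative name.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_length(text: str) -> int:
    """Hypothetical stand-in for a blocking call such as a network request."""
    return len(text)


def total_length(texts) -> int:
    # Reads almost like the single-threaded version:
    #     sum(fetch_length(t) for t in texts)
    # but each call is submitted to a pool thread and joined via its future.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fetch_length, t) for t in texts]
        return sum(f.result() for f in futures)
```

No locks or semaphores appear in user code; the executor owns the threads and `result()` does the waiting.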

Busy-waiting is a valid technique for some use cases, and in those situations it gives better performance than other techniques.

Please research your topic.

Yes, but isn't it more CPU intensive?

(Speaking purely from experience. Don't have a fancy CS degree)

It uses 100% CPU, true, but when the lock is held for an extremely short time (i.e. nanoseconds to microseconds) the total CPU usage is less than that of arranging an OS-level context switch. In other words, you use it when synchronising with hardware or when implementing test-and-set primitives for higher-level mechanisms. Crucially, the time that the lock is held for must be very short.

Given those restrictions and use cases you get a very efficient low latency locking mechanism.
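
For readers who haven't seen one, a minimal sketch of the shape of a spin lock, written in Python purely for illustration. It uses a non-blocking `threading.Lock.acquire(blocking=False)` as a stand-in for an atomic test-and-set; the `SpinLock` class and `spin_count` helper are hypothetical. In CPython the GIL means this shows the control flow only and none of the latency benefit; the performance argument above applies to native code or hardware synchronisation.

```python
import threading


class SpinLock:
    """Busy-waiting lock: retry a try-acquire instead of sleeping."""

    def __init__(self) -> None:
        self._flag = threading.Lock()  # stand-in for a test-and-set word

    def __enter__(self) -> "SpinLock":
        # Spin: burn CPU until the try-acquire succeeds.
        while not self._flag.acquire(blocking=False):
            pass
        return self

    def __exit__(self, *exc) -> None:
        self._flag.release()


def spin_count(n_threads: int = 4, n_iter: int = 1_000) -> int:
    """Increment a shared counter under the spin lock from several threads."""
    lock = SpinLock()
    total = [0]

    def work() -> None:
        for _ in range(n_iter):
            with lock:          # critical section is kept very short
                total[0] += 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

The critical section is a single increment, which matches the restriction stated above: the longer the lock is held, the more CPU the spinning waiters waste.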

You claim "To make utterly perfect MT programs (and I mean that literally)".

You've rediscovered message passing... please take an elementary CS course on parallel systems.

That claim is naive in the extreme.

That's not my claim, man, it's written in the zguide

http://zguide.zeromq.org/page:all#Multithreading-with-ZeroMQ

Maybe I should've just linked it there, sorry!

Okay, I will take that course and get back, thanks for the suggestion.

P.S. You just implied Pieter Hintjens is naive. You have to live with that now :(

I think you took that claim out of context:

"By "perfect MT programs", I mean code that's easy to write and understand, that works with the same design approach in any programming language, and on any operating system, and that scales across any number of CPUs with zero wait states and no point of diminishing returns."

That doesn't mean to say it's "perfect" or "solves" multithreading, just that it's easy to write and understand and portable across architectures. That says nothing about how optimal it is for concurrency or parallelism, in ease of use or in performance, just that it's 'easy'.

> That doesn't mean to say it's "perfect" or "solves" multithreading, just that it's easy to write and understand

Try saying that out loud?

Yes. That makes perfect sense...

"Easy to write and understand" is something completely different from correctness, robustness, scalability, etc. All of those must be considered if you think you have 'solved' parallelism, but they are orthogonal to 'easy to understand'.