by devxpy 2863 days ago
> deal with serialization and communication

I thought a lot about this problem, for over 2 years, and came up with zproc

https://github.com/pycampers/zproc

Basically,

> It lets you do message passing parallelism without the effort of tedious wiring.

You'll be doing message passing without ever dealing with sockets!

Also, shared memory parallelism is hard to get right regardless of which language you use. I would strongly recommend against it, unless you're writing some really, really niche thing where message passing is a bottleneck (it isn't most of the time).


The mantra that shared memory parallelism is so hard to get right that it warrants platitudes like "unless you're writing some really really really niche thing" is, in my own experience, entirely erroneous.

There are idiot-proof thread-safe datastructures and producer/consumer APIs that map extremely well to most problems that come up in practice in the domain, that one should confidently use. Refusing to do shared memory parallelism because of the _abstract potential for havoc_ rather than any practical justifications based on the problem-at-hand is throwing out the baby with the bathwater and is not the mark of competent engineering.
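As a rough illustration of the kind of thread-safe producer/consumer API meant here, this is a minimal sketch using Python's standard `queue.Queue`. The function names (`producer`, `consumer`, `run_pipeline`) and the squaring workload are made up for the example; the commenter did not name a specific library.

```python
import queue
import threading


def producer(q, items):
    for item in items:
        q.put(item)      # thread-safe; no explicit lock in user code
    q.put(None)          # sentinel telling the consumer to stop


def consumer(q, results):
    while True:
        item = q.get()   # blocks until an item is available
        if item is None:
            break
        results.append(item * item)


def run_pipeline(items):
    # Bounded queue: a fast producer blocks instead of growing memory.
    q = queue.Queue(maxsize=8)
    results = []
    threads = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Both threads share the queue object, but all synchronisation lives inside `queue.Queue`, which is the point being argued: shared memory, without hand-written locking.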

This talk (hopefully) gets my point across:

https://www.youtube.com/watch?v=9zinZmE3Ogk

You must be some sort of programming GOD, I guess.

The problem is that it's _hard_ to get right.

For example, it's not trivial to use locks when you're working at an abstraction level higher than the operating system. Most people don't even realise there is a race in their application, because locks are inherently non-enforcing. Code written with locks is also really hard to read and reason about.
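To make "non-enforcing" concrete, here is a small sketch in Python (the `demo` and `careless_withdraw` names and the balance numbers are invented for illustration). One thread holds the lock, yet another thread that simply forgets to take it can still modify the data, and nothing complains.

```python
import threading


def demo():
    balance = {"value": 100}
    balance_lock = threading.Lock()

    def careless_withdraw(amount):
        # Bug: touches shared state without acquiring balance_lock.
        balance["value"] -= amount

    with balance_lock:
        # We hold the lock here, but the lock only guards code that
        # chooses to acquire it. The other thread's write goes through.
        t = threading.Thread(target=careless_withdraw, args=(30,))
        t.start()
        t.join()
        return balance["value"]
```

The lock is a convention between pieces of code, not a property of the data, so a single forgotten `with balance_lock:` is enough to reintroduce a race.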

Message passing just makes it a little easier to avoid the pitfalls associated with parallel programming.

I also found that it lets you avoid busy waiting in certain places, which is always a performance advantage :)
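A minimal sketch of what that can look like, assuming a one-way `multiprocessing.Pipe` as the message channel (this is not zproc's API; `worker`, `wait_for_result`, and the message contents are hypothetical). A thread stands in for the other process to keep the example short.

```python
import multiprocessing
import threading
import time


def worker(conn):
    time.sleep(0.05)  # simulate some work
    conn.send({"status": "done", "answer": 42})
    conn.close()


def wait_for_result():
    # duplex=False gives a (receive-only, send-only) pair of connections.
    receiver, sender = multiprocessing.Pipe(duplex=False)
    t = threading.Thread(target=worker, args=(sender,))
    t.start()
    # recv() blocks until a message arrives; there is no
    # "while not ready: pass" loop polling a shared flag.
    message = receiver.recv()
    t.join()
    return message
```

The waiting side sleeps inside `recv()` until the message shows up, instead of repeatedly checking shared state.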

Can you shed some light on those "idiot-proof thread-safe datastructures"?

I do concurrency in Java all the time with CompletableFuture and threadsafe data structures provided by various libraries, e.g. the Guava caches, and I rarely need to use locks or semaphores. It's a good set of abstractions that make concurrency pretty close to idiot-proof.

Futures in particular make it easy to write concurrent code close to the way you would write single-threaded code, because all of the threading is handled behind the scenes.
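The comment is about Java's `CompletableFuture`; as a loose analogue only, here is the same idea sketched with Python's `concurrent.futures`. The `fetch_price` function and its price table are hypothetical stand-ins for a slow call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_price(item):
    # Stand-in for a slow call (network, disk); made-up data.
    prices = {"apple": 3, "pear": 5}
    return prices[item]


def total_price(items):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Each submit() returns a Future immediately; the calls run
        # concurrently on the pool's threads.
        futures = [pool.submit(fetch_price, item) for item in items]
        # result() waits for each value; no locks in user code.
        return sum(f.result() for f in futures)
```

Apart from `submit` and `result`, `total_price` reads much like the single-threaded version would, which is the property described above.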

Busy-waiting is a valid technique for some use cases, and in those situations it gives better performance than other techniques.

Please research your topic.

Yes, but isn't it more CPU intensive?

(Speaking purely from experience. Don't have a fancy CS degree)

It uses 100% CPU, true, but when the lock is held for an extremely short time (i.e. nanoseconds to microseconds), the total CPU usage is less than that of arranging an OS-level context switch. In other words, you use it when synchronising with hardware or when implementing test-and-set primitives for higher-level mechanisms. Crucially, the time that the lock is held for must be very short.

Given those restrictions and use cases you get a very efficient low latency locking mechanism.
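The shape of such a spin lock can be sketched in Python, with a non-blocking `Lock.acquire` standing in for a hardware test-and-set instruction. This is illustrative only: the `SpinLock` class and `count_with_spinlock` helper are invented here, and under CPython's GIL it will not show the latency benefit described, which needs native code on real cores.

```python
import threading


class SpinLock:
    def __init__(self):
        self._flag = threading.Lock()

    def acquire(self):
        # acquire(blocking=False) returns True only if we took the flag,
        # which plays the role of test-and-set. On failure we retry
        # immediately instead of sleeping in the OS.
        while not self._flag.acquire(blocking=False):
            pass

    def release(self):
        self._flag.release()


def count_with_spinlock(n_threads=4, n_increments=500):
    lock = SpinLock()
    counter = [0]

    def work():
        for _ in range(n_increments):
            lock.acquire()
            counter[0] += 1  # critical section kept very short
            lock.release()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

The critical section is a single increment, matching the restriction above that the lock must only ever be held very briefly.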

You claim "To make utterly perfect MT programs (and I mean that literally)".

You've rediscovered message passing... please take an elementary CS course on parallel systems.

That claim is naive in the extreme.

That's not my claim, man; it's written in the zguide:

http://zguide.zeromq.org/page:all#Multithreading-with-ZeroMQ

Maybe I should've just linked it there, sorry!

Okay, I will take that course and get back, thanks for the suggestion.

P.S. You just implied Pieter Hintjens is naive. You have to live with that now :(

I think you took that claim out of context:

"By "perfect MT programs", I mean code that's easy to write and understand, that works with the same design approach in any programming language, and on any operating system, and that scales across any number of CPUs with zero wait states and no point of diminishing returns."

That doesn't mean to say it's "perfect" or "solves" multithreading, just that it's easy to write and understand and portable across architectures. That says nothing about how optimal it is for concurrency or parallelism, whether in ease of use or in performance, just that it's 'easy'.

> That doesn't mean to say it's "perfect" or "solves" multithreading, just that it's easy to write and understand

Try saying that out loud?

Yes. That makes perfect sense...

Being easy to write and understand is something completely different from correctness, robustness, scalability, etc. All of those must be considered if you think you have 'solved' parallelism, but they are orthogonal to 'easy to understand'.

I don't think he meant it like that.

You could easily interpret that as -

Perfect _implies_ that it's easy to write and understand, but it's not the whole picture. It's just a feature that _he_ thinks is _crucial_ to it being perfect.

You get my point right?

Like sure, you could implement a _perfect_, I don't know, GNOME desktop in assembly language, but it wouldn't be easy to write and understand.

He thinks it's essential that it should be easy to read and write for it to be perfect.

Unfortunately, he's not with us now, so he can't even confirm :(