Hacker News new | ask | show | jobs
by rtfm666 4399 days ago
ok, I read the preface and while I was blasted out by the pretentiousness of the final line, I am interested about the content. However, I can't seem to get an overall 'what the hell is this useful for?' summary. The first part of the preface is all just abstract 'you will become better'.

Can someone please enlighten me? I think Hoare is great and would like to know if this stuff would benefit me as a programmer.

4 comments

One direct influence that CSP has had was on the design of the concurrency features of Go:

http://golang.org/doc/faq#csp

In general, things like CSP and related formalisms like Petri Nets are used to build mathematical models of systems so that properties of these systems can be proved. However, they have also influenced a lot of real-world software designs - I've worked on software that controlled large scale industrial systems (cement mills) that used Petri Nets both to represent processes but also to prove properties of these processes.

A watered down version of Petri Nets also formed the basis for UML Activity Digrams - if you like that kind of thing. CSP also influenced the design of the Transputer and the Occam programming language.

[NB People using the bars in Activity Diagrams as a means of joining lines up rather that as their actual meaning (starting and stopping concurrent activity) drives me round the bend, but most of UML does that...]

Edit: I guess it's an interesting question as to when actually knowing CS formalisms like CSP actually helps. I guess my response would be: lots of real world systems we need to analyse, control and/or build exhibit lots of complex behaviour. What's our main tools for modelling complex domains: maths. CSP is just one neat way of applying mathematical reasoning (either to structure understanding or to actually prove stuff) to concurrency. Whether you need to know this or not to be a good developer is another question - 99% probably don't.

Edit2: Note that just because I have a superficial familiarity with these concepts doesn't mean I would label myself a "good developer"!

Distributed systems are really hard. This line has been endlessly parroted and, also, endlessly reproven. They are really hard.

The major reason why they're hard is that your standard reasoning toolbox is insufficient. Understanding distributed systems means being aware of a much larger state space and how things like special relativity impact it (I'm being a bit flippant, but not actually inaccurate—message latency is not poorly modeled by special relativity). Most programmers use reasoning methods like operational modeling (single threaded), guess-and-check, and unit testing. These are simply completely insufficient.

If you want to model full distributed systems you might want to use something like algebraic topology. It's a useful tool for understanding the entire state space of a system and its valid transitions. It can be used to prove Paxos or Raft are correct. https://www.ideals.illinois.edu/bitstream/handle/2142/33762/...

Another method is to simplify a whole lot what a "distributed system" is and hope to find a more analyzable model where more standard reasoning tools work. CSP is one of these. It's a simpler model where some of the reasoning tools of regular formal languages can be used to understand its operation.

The upshot is that CSP won't be as powerful as you could imagine possible when building a distributed system, but it'll be massively more manageable so long as its sufficient for your goals. It's, as such, formed the basis for some popular distributed tools like Erlang.

> Understanding distributed systems means being aware of a much larger state space and how things like special relativity impact it (I'm being a bit flippant, but not actually inaccurate—message latency is not poorly modeled by special relativity).

This sound really interesting. Any recommendations to learn more about it?

The classic paper on distributed clocks mentions it, Lamport's "Time, Clocks, and the Ordering of Events in a Distributed System" (http://research.microsoft.com/en-us/um/people/lamport/pubs/t...)
You reminded me of this - http://www.informit.com/articles/article.aspx?p=1768317

There are some interesting differences between CSP and Erlang (and Go, which arguably implements CSP a bit more closely than Erlang).

Yeah, to be clear: pure implementations of CSP are probably useless but "near" implementations are abound. I don't know Go, so Erlang came to mind... But it's definitely a little way from pure CSP.
I suspect most "pure" models are nigh useless on their own. They all require a bit of compromise to interact with the world, or exist as a subset of some other language/tool. However, they can be awesome for describing in specifications or analyses of systems. And when languages support them in some fashion (even if impure) it allows us to reduce the semantic distance between specification/requiremnets and implementation to a point where we can be confident that we've done things correctly.
Haskell is an interesting playground for this since it tries so hard to make "pure" practical. Interestingly, they achieve this to a large degree using monads. I think that idea is fairly generalizable, though, that pure formalisms can embed impure ones as models and then have those be executed by an RTS.
One way of testing/verifying concurrent programs is to build a model of the program, and then see if the model satisfies certain properties. This is fairly tedious and does not scale very well. However, while it wont prove anything, it can be useful to build some kind of model; it can often point out some flaws in the design that you missed.

All in all, I think its sort of like a programmer learning the basics of discrete math. You're not actually learning how to program but it can change the way you think. Obviously, if you go to deep (which perhaps CSP is) you might get a bit off track.

It is also used in evolutionary biology to model evolving and cooperating agents.