Hacker News new | ask | show | jobs
by dbaupp 3544 days ago
None of that looks like a formalisation or even a description of a data race, at least not in the modern sense.
1 comments

Im not a concurrency expert. Just had basic explanations and training common with other developers. How it was explained to me was two or more tasks trying to simultaneously access a shared resource for reading or writing. These accesses might not happen in the desired order, causing incorrectness. Then there were lock-related issues on top of that.

Hansens work formalized what I just described in terms of English, diagrams, and compiler checks. He started with sequential operations on private data in modules. He says if two or more share thd same private data they might not execute in desired order. The monitor pattern enforced user-specified order on function calls to shared data. Built-in to language & compiler.

If my description of race conditions is inaccurate or insufficient, I'd appreciate a link to one that you think is more accurate that I could use as a comparison point against the Hansen paper. Otherwise, his description of problems implementing concurrency sounds exactly like what I learned in books on multithreading, supercomputing, etc. Shared resource used in incorrect order due to concurrency.

Note: Also, his colleagues Dijkstra and Hoare were still inventing and developing formal verification at the time. Tooling sucked. Standard practice, like he did with Algol and COBOL compilers, was writing things like this in precise English with code examples or diagrams. Not sure if you were expecting a HOL model or something when you said "modern" but I figured Id mention stuff was primitive then.

I'm not looking for a formal mathematical model or anything, just a precise plain English description of a data race (not a race condition), e.g. like the following:

A data race is when

- two threads access a single memory location,

- at least one of which is a write, and

- at least one of which has no synchronisation^

(^ synchronisation in the sense of things like atomic instructions, not necessarily full locks.)

Alright that's clear. It's also in the first paper I gave you in the first illustration. The problem you're having is you've constrained the definition of concurrency or race conditions to only be about the language common among application programmers, esp threading. Concurrency is broader than that: it can apply to processes, threads, systems, protocols, or even hardware circuits. All that is required is two or more active things working with a shared resource in a way where operations might get interleaved and out of order in a way that makes computation incorrect.

Knowing that, Hansen starts on problem with RC4000 (1969) describing protecting communication between "processes" via message buffers, message queues, and checks performed on them to ensure validity. That and Dijkstra's critical regions are where the foundations were laid. His next step was the formalization the problem in 1972:

http://brinch-hansen.net/papers/1972a.pdf

Section's 1 and 2 cover basics of concurrent operation + race conditions. It was clear to me by the abstract but should be extra clear he's talking about them by this statement about Algorithm 1:

"The copying, output, and input of a record can now be executed concurrently. To simplify the argument, we will only consider cases in which these processes are arbitrarily interleaved but not overlapped in time. The erroneous concurrent statement can then be executed in six different ways with three possible results."

That's definitely a concurrency error. He then talks about mutual exclusion and synchronization with an await primitive. Closer to modern language but the focus on operating systems in the early 1970's means he says processes instead of threads and things like disk buffers or message queues instead of shared memory. Although his examples in 1972 paper are clearly shared memory since it's at algorithm level.

So, they discovered the problem around late 60's, implemented an early solution for sharing resources among processes in 1969, fully described race conditions + some other stuff by 1972, had a safer-by-design language to catch it at compile time by 1975, and applied that to implement a concurrent, production OS (Solo) for day-to-day use by academics by 1976.

So, race conditions were both formalized and initially solved in early to mid 1970's. I don't know how much more you need given they had an OS running 100 jobs/users at once interacting on shared resources without visible failures using their concurrency model. Mainstream OS's and apps are still having race conditions pop up on occasion. Clearly, what they were doing was addressing root cause if no race conditions occurred after a successful compile of "multiprogrammed" apps. :)

You're still talking about "race conditions" and "concurrency errors", whereas I am talking specifically about data races not arbitrary race conditions or concurrency errors. I don't see how the first illustration demonstrates anything about data races, nor how the rest of your comment applies to this: a data race isn't just plain old interleaving of executions/possible results.

A data race is when a program may give undefined results (in the sense of undefined behaviour in C) because you've violated core language semantics. Solving "all" concurrency problems is a much broader and far more restrictive paradigm than just ensuring that a program has defined behaviour, and not having restrictions is key to systems programming. (In fact, it's not even clear to me how one can "solve" race conditions at a language level without tying into some sort of machine-readable product specification or without just removing concurrency entirely: a single piece of non-determinism may be fine in one program but bad, i.e. a race condition, in another.)

> The problem you're having is you've constrained the definition of concurrency or race conditions to only be about the language common among application programmers, esp threading

No, I'm not having any problem here: you said yourself that you're not an expert. A data race (not arbitrary race condition) only makes sense if there's shared memory and thus thread is a perfectly reasonable description (although "thread" is also often used to just mean thread of execution, not literal pthread thread). It's becoming clear that we're talking about different things since you're not in-tune with the jargon/subtle terminology.

That is in fact where dispute came from: race conditions vs data races. I saw race and concurrency thinking you were talking about race conditions. My mistake.

Hmm. Ill have to look into the old stuff further to see about whether any of it, including Hansen, covers the accepted definition of data races. There's potential that Hansen's does but Im holding off until I think on it more.