Jinx: a novel Linux deterministic multiprocessing debugger | HN Mirror

	Jinx: a novel Linux deterministic multiprocessing debugger (corensic.com)
	20 points by Kaya 5876 days ago

5 comments

pshc 5876 days ago

Their debugger searches for bugs exhaustively over all non-equivalent thread timings, but I can find no mention (in the paper or otherwise) of whether this will blow up with a large number of threads or amount of control flow; even the illustration on their site screams "exponential!" to me.

Interesting paper from what I could glean, though.

EDIT: "Jinx dynamically builds a set of potential interleavings (i.e., alternate eventualities, or execution scenarios, that will occur under some future set of conditions) that are most likely to result in concurrency faults, and quickly tests those execution paths to surface concurrency problems including deadlocks, race conditions, and atomicity violations." So it's more selective than I first thought.
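To put a number on the "exponential!" worry, here is a minimal sketch that counts every ordering of atomic steps across threads while keeping each thread's own program order. It counts all interleavings, not the smaller set of "non-equivalent" timings mentioned above, so treat it as an upper bound. The function name `interleavings` is made up for illustration.

```python
from math import factorial


def interleavings(steps_per_thread):
    """Count the orderings of atomic steps that keep each thread's own order.

    For threads with k1, k2, ... steps this is the multinomial coefficient
    (k1 + k2 + ...)! / (k1! * k2! * ...).
    """
    total = factorial(sum(steps_per_thread))
    for k in steps_per_thread:
        total //= factorial(k)
    return total


if __name__ == "__main__":
    for n_threads in (2, 3, 4):
        count = interleavings([10] * n_threads)
        print(f"{n_threads} threads x 10 steps each: {count} interleavings")
```

Even two threads of ten steps each already give 184756 orderings, which is why exhaustive search only works for very small programs.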

pgodman100 5876 days ago

Hi. Disclaimer: I work for Corensic.

We avoid exponential search space problems by using sampling, and curtailing of exploration. We choose what to explore based on research about where bugs are likely to lie. Exhaustive examination of all but fairly trivial problems is impossible for exactly this reason, and this is why we sample: rather than force users to change the way they write code, we deal with the way they have written code.

Thanks, Pete
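As a rough illustration of sampling interleavings instead of enumerating them, the sketch below simulates two threads doing a non-atomic increment and tries randomly chosen schedules. Uniform random sampling is an assumption made for the example; the thread does not describe how Jinx actually picks which interleavings to explore. All names here (`increment`, `run`, `sample_schedules`) are hypothetical.

```python
import random


def increment(shared):
    """One simulated thread doing a non-atomic read-modify-write.

    Each yield marks a point where a context switch could happen.
    """
    tmp = shared["counter"]       # read
    yield
    shared["counter"] = tmp + 1   # write
    yield


def run(schedule, n_threads=2):
    """Execute steps in the order given by a list of thread ids."""
    shared = {"counter": 0}
    threads = [increment(shared) for _ in range(n_threads)]
    for tid in schedule:
        next(threads[tid])
    return shared["counter"]


def sample_schedules(n_samples, n_threads=2, steps=2, seed=0):
    """Try random schedules; return those where an update was lost."""
    rng = random.Random(seed)
    base = [tid for tid in range(n_threads) for _ in range(steps)]
    buggy = []
    for _ in range(n_samples):
        schedule = base[:]
        rng.shuffle(schedule)
        if run(schedule, n_threads) != n_threads:
            buggy.append(schedule)
    return buggy


if __name__ == "__main__":
    bad = sample_schedules(20)
    print(f"{len(bad)} of 20 sampled schedules lost an update")
```

The schedule `[0, 0, 1, 1]` gives the correct final count of 2, while `[0, 1, 0, 1]` lets both threads read 0 and leaves the counter at 1. A real tool would bias the sampling toward suspicious orderings rather than shuffling uniformly.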

stcredzero 5876 days ago

"We avoid exponential search space problems by using sampling, and curtailing of exploration."

Basically, you use the same general strategies that are used to write Go playing programs.

pgodman100 5876 days ago

Yes, this is a lot like AI approaches used for playing games. We examine many nested what-if scenarios involving reordering of memory communications to identify problematic sequences. Note that we operate at the memory level and so we don't have problems with particular OS or threading-package constructs. If your code can run in parallel, Jinx can explore it.

exit 5876 days ago

i'm very naive about what makes concurrency so difficult. am i right in thinking the core challenge is sharing memory resources between threads, and using shared memory to communicate between them?

can we imagine an abstraction layer now which would solve all our (concurrency) problems, but which would simply be too slow on current hardware to actually use?

samps 5876 days ago

While you're right that the main problem has to do with sharing resources between different threads of execution, the difficult part is not actually doing that sharing. The act of sharing data is very simple, and can be accomplished via many different helpful abstractions (try looking at Wikipedia's description of "shared memory" or "message passing"). In the case of shared memory, sharing can be accomplished just by writing data to memory in one thread and reading it in another. Easy!

The difficult part is in how the threads actually coordinate. The problem is extremely application-specific (what exactly do threads need to share? When do they need to share it? These cannot be answered in a general way). It's generally accepted that concurrency bugs (examples: data races (colloquially "race conditions"), deadlock, atomicity violations, locking discipline violations) are extremely difficult bugs. This is probably either because (1) programmers are not accustomed to thinking about coordinating between parallel activities or (2) people are just worse at thinking concurrently than thinking sequentially.

So new libraries/methods for accomplishing communication between threads are always welcome and can help reduce the complexity of parallel programming. However, nobody has yet found an abstraction that both works for most kinds of parallel programs (MapReduce is very simple to work with but also very restrictive) and is simple enough for people to program in without fear of hard-to-solve concurrency bugs (message passing and shared memory are both quite general but considered somewhat unsafe).

So, the problem is not that a good abstraction layer would be too computationally expensive -- it's that no one even knows what the abstraction should be! Hope this makes the issue clearer.
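For a concrete feel of the two general abstractions mentioned above, here is a small sketch that computes the same total once with shared memory guarded by a lock and once with message passing over a queue. It uses only Python's standard `threading` and `queue` modules; the function names and the constants `N_THREADS` and `N_ITEMS` are made up for illustration.

```python
import queue
import threading

N_THREADS, N_ITEMS = 4, 1000


def _run_workers(worker):
    """Start N_THREADS copies of worker and wait for all of them."""
    threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def shared_memory_sum():
    """Threads update one shared counter; a lock does the coordinating."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(N_ITEMS):
            with lock:  # makes the read-modify-write atomic
                total += 1

    _run_workers(worker)
    return total


def message_passing_sum():
    """Threads share nothing directly; they send values to one receiver."""
    inbox = queue.Queue()

    def worker():
        for _ in range(N_ITEMS):
            inbox.put(1)

    _run_workers(worker)
    total = 0
    for _ in range(N_THREADS * N_ITEMS):
        total += inbox.get()
    return total


if __name__ == "__main__":
    print(shared_memory_sum(), message_passing_sum())
```

Both versions are easy here because the coordination rule is trivial. The hard part described above shows up once the rule becomes application-specific, for example when several fields must change together or locks must be taken in a consistent order.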

exit 5876 days ago

Thanks, that does make it a lot clearer.

It seems to me you and i are very comfortably coordinating sharing resources right now. In parallel, a browser is running on my computer, a browser is running on your computer, and an http server is running on the hn server.

But between us we're doing something collaborative. We are both contributing text out of which a single document is synthesized, and we might both have upvoted this story, etc.

Our collaboration here is structured in terms of http requests/responses. Does this in itself address issues of "race conditions", "deadlocks", etc?

Can we imagine a future in which computation and memory are so abundant, we can virtualize this client/server paradigm for any collaborating parallel programs?

Or can we imagine a future in which there is no need to parallelize a large class of programs, because they will execute satisfyingly fast in a single thread?

samps 5876 days ago

Unfortunately, a client/server model still does not solve all of our concurrency problems. The problem is that the algorithm we're running--contributing a handful of text responses to the same repository--is very simple. But more complicated algorithms--say, if you're Google for instance, looking for a few words in billions of documents--need significantly more expertise to be correct (and perform well on top of that!).

It's certainly feasible to imagine that single-threaded performance will improve to the point that parallelism will no longer be "necessary" (although many people have observed that, with Moore's Law no longer yielding the performance improvements it did just a few years ago, this may be too far out). However, applications arise and expand to fill whatever performance we have available. By achieving parallel performance gains, we'll enable things that weren't possible before, no matter how good single-threaded performance is. Also important is the distinction between parallelism and concurrency: many domains need multiple threads for reasons that have nothing to do with performance! A database server, for instance, needs to service many requests concurrently; it can't function correctly in a "single-threaded" world.

eru 5876 days ago

You might want to have a look at Haskell's Software Transactional Memory (STM).

It solves all your concurrency problems in the same sense that garbage collection solves all your memory problems. I.e. you can still make a mess, but it's harder to do so, and it's less effort to build something that doesn't break.
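To give a flavor of the optimistic "run it, and retry on conflict" idea behind transactional memory, here is a toy sketch for a single variable. It is not Haskell's STM API (which provides `TVar`, `atomically`, and `retry` and composes across many variables); `VersionedCell` and this `atomically` helper are hypothetical names invented for the example.

```python
import threading


class VersionedCell:
    """A value plus a version number.

    A commit succeeds only if no other commit happened since the read.
    """

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._lock = threading.Lock()  # guards only the short read/commit steps

    def read(self):
        with self._lock:
            return self._value, self._version

    def try_commit(self, new_value, expected_version):
        with self._lock:
            if self._version != expected_version:
                return False  # conflict: someone else committed first
            self._value = new_value
            self._version += 1
            return True


def atomically(cell, update):
    """Re-run update on a fresh snapshot until the commit succeeds."""
    while True:
        value, version = cell.read()
        if cell.try_commit(update(value), version):
            return


if __name__ == "__main__":
    cell = VersionedCell(0)
    workers = [
        threading.Thread(
            target=lambda: [atomically(cell, lambda v: v + 1) for _ in range(500)]
        )
        for _ in range(4)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(cell.read()[0])
```

The caller never takes a lock around its own logic, which matches the garbage-collection analogy above: the mechanism handles the bookkeeping, but a badly written `update` function can still make a mess.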

johngunderman 5876 days ago

Yes. The main issue with concurrency is preventing threads from stepping on each other's resources. Also, getting threads in sync can sometimes be an issue.

EDIT: Whoops, beaten to the punch. Methinks refreshing the page before posting might be a good idea :)

nkurz 5876 days ago

Are there any non-commercial tools that do similar things? I'm looking for a way to test a faster and smaller read-write lock implementation I'm playing with, but as it's a hobby project I can't afford to buy professional tools for it.

I'd be looking for something that would run on Linux. Open source would of course be nice, but free is probably more important to me in this case. Finished product will be open-source designed to replace pthread's rwlocks.
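One low-tech starting point is a stress test that checks an invariant while many readers and writers hammer the lock. The sketch below is a Python toy rather than C against pthreads: `ToyRWLock` stands in for the implementation under test and `stress` is a hypothetical harness. A clean run proves little, since ordinary scheduling rarely hits the rare bad interleavings.

```python
import threading


class ToyRWLock:
    """Stand-in lock under test: readers share, writers exclude everyone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()


def stress(lock, n_readers=4, n_writers=2, iterations=500):
    """Writers briefly break the invariant a == b; readers must never see it."""
    state = {"a": 0, "b": 0}
    violations = []

    def writer():
        for _ in range(iterations):
            lock.acquire_write()
            try:
                state["a"] += 1
                state["b"] += 1  # a != b between these two lines
            finally:
                lock.release_write()

    def reader():
        for _ in range(iterations):
            lock.acquire_read()
            try:
                if state["a"] != state["b"]:
                    violations.append((state["a"], state["b"]))
            finally:
                lock.release_read()

    targets = [writer] * n_writers + [reader] * n_readers
    threads = [threading.Thread(target=t) for t in targets]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state, violations


if __name__ == "__main__":
    final_state, seen = stress(ToyRWLock())
    print(final_state, len(seen), "violations")
```

The same shape carries over to C: keep a multi-word invariant that writers break temporarily, have readers assert it, and check the final counts against the number of writes performed.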

evadon 5876 days ago

If you haven't seen this, it's a must-have.

hga 5876 days ago

It sounds good, I remember bookmarking the original research project.

I sure wish they'd provide an idea of the eventual cost, though. I can understand their not wanting to right now, but I at least can't afford to invest time into their beta program without knowing if I'll be able to afford it when they start charging for it.

CoolAssPuppy 5876 days ago

Disclaimer: I work for Corensic.

I just posted the pricing information on the site. We haven't finalized the actual price yet, but we expect to price this product in line with typical quality and load testing tools (think low four figures, USD). We will definitely offer substantial discounts to those customers who help us during our beta by submitting bugs or issues they find with Jinx itself, or by submitting bugs they've found in other software (their own or open source) using Jinx. Visit our "Report a Bug" page to give us feedback.

--Prashant

hga 5875 days ago

Thanks a lot for the reply and creating that web page (http://www.corensic.com/Products/PricingandLicensing.aspx).

As it turns out, the (reduced cost, I assume) non-commercial version will be just right for what I'm thinking of doing in this space. And of course I'll sing the praises of Jinx to the extent it helps me find the nasty types of bugs it's designed to.

Hmmm, one of the most interesting (at least to me) projects I'm thinking of in this space is a hypervisor-based multi-threaded/core Appel-Ellis-Li garbage collector. It's based on marking pages unreadable to implement a read barrier, and its performance is very dependent on the speed of handling read traps, which is not something normal operating systems optimize since that's normally an error.

To allow concurrency, the GC itself has to live in the hypervisor or thereabouts, where it isn't barred from reading pages that are unreadable at the user level.

hga 5876 days ago

And Windows.

milod 5876 days ago

Also for Windows, you might be interested in Chess http://research.microsoft.com/en-us/projects/chess/