Hacker News new | ask | show | jobs
by liblfds 3674 days ago
I think you mean Treiber, right? typo with the "n".

In 1986, on an IBM mainframe, published a white paper describing how to implement a lock-free stack (which is also a freelist, of course). It was the first lock-free data structure. I'm not quite sure why it's being called an algorithm!

The scalable version described in that paper is scalable by dint of adding an elimination layer in front of the stack.

I've added such a thing, although my own design - I've not read their paper, so I don't know how they did theirs - to the freelist in 7.1.0, but it hasn't worked well. It roughly doubled performance on x64, when there are a large number of cores, but reduces performance for lower numbers of cores and on other platforms. When there are a large number of cores, performance is rock bottom,, so doubling means going from like 1m ops/sec/core to 2m ops/sec/core, when if you run on just one core, you'd be doing say 30mm ops/sec on that core.

I'd expect it in principle to permit real scalability - fairly linear performance as the number of cores rises - so I think I've missed something, or done something wrong.

1 comments

Oh mea culpa, yes that's a typo. I was referring to W.K Treiber and his stack implementation. I think the fact that he referred to it as an "algorithm" is the source of my confusion.

Is it safe to say then that distinctions such as "lock free" and "wait free" always refer to data structures and not algorithms then?

I was curious, could you elaborate on why you think that PRNGs satisfy the property of a lock free algorithm?

Thank you for your detailed response by the way.

> Is it safe to say then that distinctions such as "lock free" and "wait free" always refer to data structures and not algorithms then?

Well, for what my thoughts are worth, I would say no, not inherently and catagorically, but in practise, it's hard to imagine algorithms which are multi-threaded in such a way that threads contend on shared data, such that the design of the algorithm could even be lock-free or wait-free.

Parallelized (i.e. multi-threaded) algorithms tend I think to have multiple threads all running the same code but each being given a separate part of the overall data to process - say an image processing algorithm, where each thread receives an equal portion of the image to process.

A PRNG however I think can (for one or two types of PRNG only, not in general) be implemented as a lock-free algorithm. A PRNG is not a data structure - rather, it maintains some internal state and on requests uses (and updates) that state to generate a new random number. It is an algorithm.

The updating process must be atomic or it is compromised - you can't have one thread half way through the state updates, and then another thread jumping in at that point and using the half-updated internal state.

Good PRNGs hold a lot of internal state - up to a kilobytes - so it's not possible to directly use atomic operation to update that state. However, there is a class of PRNGs called "SplitMix" which have an 8 byte state, where on request for a new random number, that state has a particular (quite large) constant added to it, and the result is then passed through a hashing function.

With this design, an atomic add can be used for the addition, and once that's done, the requesting thread then local to itself runs the hashing function (it's single-threaded, and merely takes an 8 byte number for input), and so the design is lock-free - no matter how many threads are using the PRNG, whether or not they have been idled by the OS, all other threads can still get their work done.

This PRNG is a curiosity though - it has almost no genuine use cases. The performance is poor when compared to the single-threaded version of the same PRNG, and a PRNG being what it is, each thread can easily run its own, thread-local PRNG.

Thanks for your detailed response. You mentioned "SplitMix" PRNGs, what is the second one that is implemented lock free?

By the way congrats on getting your release out. This looks like good stuff.

Make that "one" then :-)

I don't actually know know of others, but I know there are a range of other PRNGs which run on single-word states, so they might be viable.

Thankyou! I see many flaws in the library though - there's so much that I can see to improve or to add. Still, with such things, there's a lot of value in making a release to get new functionality out so it can actually be used, even if there's much else to be improved, or which is not as it could or should be.