Hacker News new | ask | show | jobs
by zodzedzi 1100 days ago
> ... heapsort, which is kind of slow, but prevents adversaries from smashing your stack.

How does heapsort protect against a stack attack?

2 comments

Oh, yeah. I forgot to quote that, because I was going to comment on it. It's an iterative, in-place algorithm, so there are no recursive calls to be made.

The bit I meant to comment on was the "kind of slow" part. It is true that heapsort tends to be slower than a well-implemented quicksort, but you don't use heapsort when you need the absolute best speed. The (IMO) best thing about heapsort is that its best case and worst case are the same order of magnitude, so sorting n things will take a fairly consistent amount of time, no matter what.

Heap sort can sort n elements with O(1) auxiliary data while quick sort (which is what libcxx usually relies on) in its worst-case performance would require storing O(n) stack frames. Since stack sizes are usually small, an adversary making you sort a million elements would likely cause a stack overflow.
Quicksort can be implemented iteratively, with O(1) auxiliary storage, as well: https://alienryderflex.com/quicksort/
The linked implementation uses logarithmic auxiliary storage but allocates extra storage such that the amount allocated is a constant and rejects inputs that are too large (inputs that wouldn't fit into any computer anyway). A similar trick can be used to convert any algorithm to "use constant space." Just allocate enough space to handle inputs of some large size and reject larger inputs.
Did you scroll all the way to the bottom? The "never fail" version that always sorts the smaller partition first, and allocates 300 "pseudo stack frames" will successfully sort any array on a real computer.

Sure, theoretically, it does what you say, but you know what they say, right? In theory, theory and practice are the same. In practice, not so much. And, these are real differences, too: theory idealizes real computers as general Turing machines, when, in fact, they're really only linear bounded automata: https://en.wikipedia.org/wiki/Linear_bounded_automaton

See also the commentary following the code:

> This might be slightly slower than the first one, but it will never fail because it always performs the smaller partition first. Hence, the only way it could run into the limit is if the to-be-sorted array was at least 2MAX_LEVELS elements in size. Since 2300 is greater than 1090, and there are only about 1080 fundamental particles in this universe from which a human-made computer can be built, no larger limit will ever be needed.

> (Note: Someone reminded me that a typically 64-bit index variable can index only 264 items, not 2300. That’s true, but if you’re using a 64-bit computer, you’re probably not going to have an array of more than 264 elements anyway, even if each element was only one byte.)