Hacker News new | ask | show | jobs
by tialaramex 1771 days ago
A sort is called "stable" if it doesn't re-order things that compare equal. Unstable sorts are sometimes faster. If you care your code should request a stable sort. If your library only offers one sort and doesn't specify whether it's stable, I agree with your "defence in depth" strategy but I believe the right long term fix is to have the library clear up whether or not the provided sort is stable.
2 comments

This may be a special case because the question comes up so often, but in general, I don’t think libraries should document all the qualities they don’t have, because that’s inexhaustible. In my opinion, if a sort function isn’t documented to do a stable sort, it must be assumed to be not stable, regardless of current implementation details. It’s often not such a big deal either: in many cases, there’s an artificial key (an id) that you can use as a tie-breaker in the comparison function.

One thing I hate is when people look at the library source code to figure this kind of thing out, since any implementation detail can change with an update. Assume the most hostile implementation allowed by the docs and you’re usually fine.

> One thing I hate is when people look at the library source code to figure this kind of thing out

In case you didn't already know: Hyrum's Law. https://www.hyrumslaw.com/ Even if the source code isn't provided and nothing is documented your users will rely on every observable nuance of your actual implementation anyway.

At Google's scale (internally, for their internal software where they could in principle fire people for doing this) this means if you change how the allocator works so that now putting ten foozles in the diddly triggers a fresh allocation instead of twenty, you blow up real production systems because although this behaviour was never documented somebody had measured, concluded ten doesn't allocate and designed their whole system around this belief and now South East Asia doesn't have working Google search with the new allocator behaviour...

In protocol design Hyrum's Law led to TLS 1.3 having to be spelled TLS 1.2 Resumption. If your client writes "Hi I want to speak TLS 1.3" the middleboxes freak out so nothing works. So instead every single TLS 1.3 connection you make is like, "Don't mind me, just a TLS 1.2 session resumption... also I know FlyCasualThisIsTLS1.3 and here is an ECDH key share for no reason" and the server goes "Yes, resuming your session, I too know FlyCasualThisIsTLS1.3 and here's another ECDH key share I'm just blurting out for no reason. Now, since we're just resuming a TLS 1.2 session everything else is encrypted" and idiot middleboxes go "Huh, TLS resumption, I never did really understand those, nothing to see here, carry on" and it works.

Thanks for reminding me of the term "stable sort". It was a contrived example. I suppose I could have abstracted even more:

Algorithm B sometimes fails on the output of Algorithm A. We can fix the issue by making B deal with that case, or we can change A so it never produces that output. Sometimes changing the algorithm in A just makes the problem go away, and maybe that was a good idea anyway. This seems too abstract, so I picked a slightly more specific (sorting) thing for A.