Hacker News new | ask | show | jobs
by dataflow 700 days ago
If you have a million 1-character strings and one string of length 1 million, how many steps would your LSD radix sort take? And (if it's indeed linear in the total input size like you say) how do you make it jump over the empty slots without losing real-world efficiency in other cases?
1 comments

It's MSB radix sort. I think LSB radix sort is not generally as useful because even for fixed-size inputs it often makes more passes over most of the input than MSB radix sort.

Your comment makes me think it would be swell to add a fast path for when the input range compares equal for the current byte or bytes though. In general radix sorts have some computation to spare (on commonly used CPUs, more computation may be performed per element during the histogramming pass without spending any additional time). Some cutting-edge radix sorts spend this spare computation to look for sorted subsequences, etc.

Oh I see. I've found LSD better in some cases, but maybe it's just my implementation. For MSD, do you get a performance hit from putting your stack on the heap (to avoid overflow on recursion)? I don't just mean the minor register vs. memory differences in the codegen, but also the cache misses & memory pressure due to potentially long input elements (and thus correspondingly large stack to manage).

> Some cutting-edge radix sorts

Where do you find these? :-)

> For MSD, do you get a performance hit from putting your stack on the heap (to avoid overflow on recursion)?

I'm not sure. My original C++ implementation which is for non-adversarial inputs puts the array containing histograms and indices on the heap but uses the stack for a handful of locals, so it would explode if you passed the wrong thing. The sort it's based on in the parallel-string-sorting repo works the same way. Oops!

> cache misses & memory pressure due to potentially long input elements (and thus correspondingly large stack)

I think this should be okay? The best of these sorts try to visit the actual strings infrequently. You'd achieve worst-case cache performance by passing a pretty large number of strings with a long, equal prefix. Ideally these strings would share the bottom ~16 bits of their address, so that they could evict each other from cache when you access them. See https://danluu.com/3c-conflict/

> Where do you find these? :-)

There's a list of cool sorts here https://github.com/scandum/quadsort?tab=readme-ov-file#varia... and they are often submitted to HN.

Yeah, what I hate about MSD is the stack explosion.

Otherwise - cool, thanks!

Hi, I want to respond to a post from you from 2019. (That 2019 thread no longer offers the reply button, otherwise I would reply there of course.) I apologize for using this thread to get my message in.

This is the item I want to respond to: https://news.ycombinator.com/item?id=19768492 When you took a Classical Mechanics course you were puzzled by the form of the Lagrangian: L = T - V

I have created a resource for the purpose of making application of calculus of variations in mechanics transparent. As part of that the form of the Lagrangian L=T-V is explained.

http://cleonis.nl/physics/phys256/calculus_variations.php

http://cleonis.nl/physics/phys256/energy_position_equation.p...

I recognize the 'you are certainly entitled to ask why' quote, it's from the book 'Classical Mechanics' by John Taylor.

Here's the thing: there is a good answer to the 'why' question. Once you know that answer things become transparent, and any wall is gone.

Haha wow, this was definitely random. Thank you for letting me know, I'll take a look when I have the chance.