| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by flgr 3326 days ago

I think in order to really reap the benefits of this you'll want to actually reorganize the in-memory layout of the trees in order to make sure all elements needed for the comparison end up in the same cache line.

I haven't been following this closely, but the last time I checked scatter-gather loads were really really slow.

Chapter 3.3 from page 27 (PDF page 43) on of this be interesting: https://www.researchgate.net/profile/Florian_Gross/publicati...

Also contains a survey of some other related data structures and algorithms.

2 comments

cperciva 3326 days ago

As the saying goes, "memory is the new disk". Data structures like B-trees which were designed to improve the performance of data accesses from disk are now useful for improving the performance of algorithms in memory.

link

delhanty 3326 days ago

Does anybody have any data on whether external sorting [1] is the way to go these days for CPU based sorts?

That is, use quicksort for chunks up to the size of the L1/L2/L3 cache (depending on benchmarking) and then mergesort the chunks.

[1] https://en.wikipedia.org/wiki/External_sorting

link

adrianN 3326 days ago

You can have a look at the best sorting algorithms (for various definitions of "best") here http://sortbenchmark.org/

link

cperciva 3326 days ago

That's for big sorts. The question here is about sorting with a single CPU.

link

adrianN 3326 days ago

Look at the Joule category.

link

cperciva 3326 days ago

I don't have any data readily available, but it's generally understood that you want a cache-aware sort, yes. Usually I see people opt for samplesort rather than sorting chunks and merging, though.

link

delhanty 3326 days ago

Thanks for your opinion.

I did not know about samplesort - found it on Wikipedia.

link

0xfaded 3326 days ago

Thank you, scatter-gather was the google term I was looking for, though I couldn't immediately find any architectures that implement a vector scatter-gather. Are you aware of any?

link

pandaman 3326 days ago

Not sure what could be a scalar scatter/gather, but NV and AMD GPUs do the vector one (you can read write n values from n offsets with a single instruction).

link

wbl 3326 days ago

The Cray absolutely did. How else do you do sparse matrix multiply?

link