| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by Dylan16807 1104 days ago

Sometimes you have some very very high-use variables, with multiple threads each having one and writing to it often. If multiple of those variables are located on the same cache line, control of it will bounce wildly between cores and this can completely trash your performance. The fix there is absolutely trivial: add padding.

Sometimes you're iterating through a huge number of objects, and only using a couple fields out of each object. If you rearrange the data so each field is stored in a different arena, you can cut the number of memory accesses by a huge amount, and let prefetching work a lot better. That's usually not as trivial but it's simple work.

Another often-trivial one is avoiding linked lists whenever feasible.

2 comments

garfieldandthe 1103 days ago

> Sometimes you have some very very high-use variables, with multiple threads each having one and writing to it often.

And how often do you actually write such code? Most people: Between rarely and never. You write single threaded code or use synchronization primitives. Sure if you are writing a library squeezing performance out of some parallel processing problem then this is relevant, but that's a niche scenario.

> The fix there is absolutely trivial: add padding.

Most common case is that this just wastes memory. Don't micro-optimizr before you know that this is actually a problem.

> Another often-trivial one is avoiding linked lists whenever feasible.

Again, not true, depending on your use case. If the operations that you commonly perform on the data structure, like inserting/deleting elements, then of course you should use a linked list or whatever container data structure your language provides. How caches play into this is at most a second order effect in the common case, unless you really want to optimize a tight loop in a performance critical application.

I've seen too many prematurely applied fancy data structures where it turned out that all this extra complexity was entirely unnecessary and just made things harder to maintain.

link

Dylan16807 1103 days ago

> Most common case is that this just wastes memory. Don't micro-optimizr before you know that this is actually a problem.

You don't have many variables that fit the description I gave. You're already doing profiling if you can pick them out, and preemptively spacing them would barely take any memory, and it's the kind of problem that's hard to thoroughly test unless you have a 64 core machine sitting around.

> If the operations that you commonly perform on the data structure, like inserting/deleting elements, then of course you should use a linked list

For situations where linked list and array are both usable, then even if you very commonly insert and delete you're usually better off with a data structure that's built on top of fixed-size arrays. Iterating an array is so fast that it makes up for the cost of shifting around a surprisingly large number of elements.

> or whatever container data structure your language provides.

But which one? Languages tend to have a lot.

And the ones that give you a one-size-fits-all data structure usually don't have a built-in linked list anyway.

And I'm not suggesting anything notably fancy.

link

_a_a_a_ 1104 days ago

Avoid cache line ping-pong, don't waste cache, and make accesses as spatially predictable as possible. Okay, I can see where you're coming from, thanks.

link