by Dylan16807
1104 days ago
Sometimes you have some very, very high-use variables, with multiple threads each having one and writing to it often. If several of those variables are located on the same cache line, ownership of that line will bounce wildly between cores, and this can completely trash your performance. The fix there is absolutely trivial: add padding. Sometimes you're iterating through a huge number of objects and only using a couple of fields out of each object. If you rearrange the data so each field is stored in a different arena, you can cut the number of memory accesses by a huge amount and let prefetching work a lot better. That's usually not as trivial, but it's simple work. Another often-trivial one is avoiding linked lists whenever feasible.
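To make the first two points concrete, here is a minimal C11 sketch. Everything in it is an assumption for illustration: the 64-byte `CACHE_LINE` (common on x86-64, but not universal), and the hypothetical `counter`, `padded_counter`, and `particle` types and the two `sum_hp_*` functions, none of which come from real code.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Check your target CPU. */
#define CACHE_LINE 64

/* Unpadded: eight of these fit in one cache line, so per-thread
 * counters stored in an array end up sharing lines. */
struct counter {
    uint64_t value;
};

/* Padded: alignas rounds the struct up to a full cache line, so each
 * array element sits on its own line. */
struct padded_counter {
    alignas(CACHE_LINE) uint64_t value;
};

/* Array-of-structs layout: reading only hp still drags the whole
 * 64-byte object through the cache. */
struct particle {
    float pos[3];
    float vel[3];
    float hp;
    char  name[36];
};

float sum_hp_aos(const struct particle *p, size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += p[i].hp; /* one useful float per 64 bytes touched */
    }
    return total;
}

/* Field-per-array layout: hp values are stored contiguously in their
 * own array, so every byte fetched is one the loop actually uses. */
float sum_hp_soa(const float *hp, size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += hp[i];
    }
    return total;
}
```

With these assumptions, a per-thread array of `struct padded_counter` costs 64 bytes per element instead of 8, which is the memory trade-off of padding. Whether either change pays off depends on the workload, so measure before and after.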
And how often do you actually write such code? Most people: between rarely and never. You write single-threaded code or use synchronization primitives. Sure, if you are writing a library that squeezes performance out of some parallel processing problem, then this is relevant, but that's a niche scenario.
> The fix there is absolutely trivial: add padding.
In the most common case this just wastes memory. Don't micro-optimize before you know that this is actually a problem.
> Another often-trivial one is avoiding linked lists whenever feasible.
Again, not true, depending on your use case. If the operations that you commonly perform on the data structure are ones a linked list handles well, like inserting/deleting elements, then of course you should use a linked list or whatever container data structure your language provides. How caches play into this is at most a second-order effect in the common case, unless you really want to optimize a tight loop in a performance-critical application.
I've seen too many prematurely applied fancy data structures where it turned out that all the extra complexity was entirely unnecessary and just made things harder to maintain.