by fizixer 3957 days ago
In the first plot, why do all the libraries slow down around the n=1000 mark? Is it something to do with cache?
2 comments

Yep. A 32KB L1 cache can hold at most 4096 doubles (32768 bytes / 8 bytes each). 4096 / (2 input vectors + 1 result) ≈ 1365.

http://danluu.com/3c-conflict/
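
To make the arithmetic in the reply above easy to check, here is a minimal sketch. It assumes 8-byte doubles and that the operation touches three equal-length arrays (two inputs, one result), as the reply describes; the function name `max_vector_length` is made up for illustration, and the estimate ignores cache associativity and anything else competing for L1.

```python
DOUBLE_BYTES = 8  # assumption: IEEE 754 double precision


def max_vector_length(cache_bytes, num_arrays=3, elem_bytes=DOUBLE_BYTES):
    """Largest n such that num_arrays arrays of n elements fit in cache_bytes."""
    return cache_bytes // (num_arrays * elem_bytes)


l1_bytes = 32 * 1024  # the 32KB L1 size quoted in the reply

print(l1_bytes // DOUBLE_BYTES)     # doubles that fit in L1: 4096
print(max_vector_length(l1_bytes))  # largest n for 3 arrays: 1365
```

So the slowdown showing up a little past n=1000 is roughly where this back-of-the-envelope estimate says it should.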

I'm guessing that's the point at which the working set exceeds the L1 cache size. You can see a few more subtle dips in the performance graph at later points; these correspond to the working set spilling out of the L2 and then the L3 cache.
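
As a rough illustration of where those later dips would be expected, the sketch below reports the smallest cache level that can hold the whole working set for a given n. Only the 32KB L1 figure comes from this thread; the L2 and L3 sizes are placeholder assumptions (the benchmark machine's actual sizes aren't stated here), and `smallest_fitting_level` is a made-up helper name.

```python
# Assumed cache sizes in bytes. L1 is the 32KB quoted in the thread;
# L2 and L3 are illustrative guesses, not the benchmark machine's real values.
ASSUMED_CACHES = [
    ("L1", 32 * 1024),
    ("L2", 256 * 1024),
    ("L3", 8 * 1024 * 1024),
]


def smallest_fitting_level(n, num_arrays=3, elem_bytes=8, caches=ASSUMED_CACHES):
    """Return the first cache level whose size covers num_arrays arrays of n elements."""
    working_set = n * num_arrays * elem_bytes
    for name, size in caches:
        if working_set <= size:
            return name
    return "DRAM"  # working set is larger than every cache level


for n in (1_000, 2_000, 20_000, 1_000_000):
    print(n, smallest_fitting_level(n))
```

With different L2/L3 sizes the thresholds move accordingly, so plug in the numbers for the actual CPU before reading too much into where a dip lands.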