by bananaface 2171 days ago
>So the first example has to read from main memory only every 64th index

Wouldn't it be 8 per cache line (an int is 64 bits, each cache line is 64 bytes)? I'm also assuming it caches a larger chunk of the array across multiple lines. Is that not how it works?
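
To spell out the arithmetic, here's a rough sketch. It assumes 64-byte cache lines and 8-byte ints (with 4-byte ints it would be 16 per line), and that the array starts on a line boundary. The function names are made up for illustration.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86 cache line size
INT_BYTES = 8          # assumption: 64-bit ints, as stated in the comment


def ints_per_cache_line(line_bytes=CACHE_LINE_BYTES, int_bytes=INT_BYTES):
    """Number of whole ints that fit in one cache line."""
    return line_bytes // int_bytes


def cache_lines_touched(n_ints, line_bytes=CACHE_LINE_BYTES, int_bytes=INT_BYTES):
    """Cache lines covered by a sequential scan of n_ints.

    Assumes the array starts exactly on a cache line boundary.
    """
    per_line = ints_per_cache_line(line_bytes, int_bytes)
    return -(-n_ints // per_line)  # ceiling division


if __name__ == "__main__":
    print(ints_per_cache_line())      # 8
    print(cache_lines_touched(1000))  # 125
```

So under those assumptions a sequential scan would need a new line from memory on every 8th index, not every 64th, and that's before any hardware prefetching.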

But I think there's a more fundamental issue here, which is that the amount measured, 68 million bytes in a second, is what, about 65MB? Did he just reduce the array size until it completed in a second? Because a very significant chunk of that is going to fit in L3 cache (on an i7 it's 8MB), so even if you had a good random access algorithm, it would understate the problem because the data is still contiguous.
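
For a back-of-the-envelope number, here's a sketch of how much of that working set could sit in L3 at once. The 8 MiB L3 size is the i7 figure assumed above, and `cached_fraction` is a hypothetical helper, not anything from the article. It's only an upper bound, since real caches share space with other data.

```python
WORKING_SET_BYTES = 68_000_000  # figure quoted from the article
L3_BYTES = 8 * 1024 * 1024      # assumption: 8 MiB L3, as on the i7 mentioned


def cached_fraction(working_set=WORKING_SET_BYTES, cache=L3_BYTES):
    """Upper bound on the share of the working set resident in cache at once."""
    return min(1.0, cache / working_set)


if __name__ == "__main__":
    print(f"working set: {WORKING_SET_BYTES / 1024**2:.1f} MiB")
    print(f"could be resident in L3: {cached_fraction():.1%}")
```

With uniformly random reads over the array, roughly that share of accesses could hit L3 once it's warm, so the measured slowdown would come out smaller than on a working set that dwarfs the cache.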

Which seems kinda dumb to me, since the real-world problem you're likely to run into is when your data is stored non-contiguously because it's scattered across multiple different structs/objects, making it impossible to utilise the cache to a significant degree at all. In that situation (very common in OO or interpreted languages) I'd expect a way more dramatic slowdown.
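
Here's a minimal sketch of the kind of test I mean: the same linked traversal, once over nodes linked in allocation order and once over nodes linked in shuffled order. `Node`, `build_linked` and `sum_linked` are hypothetical names. It assumes CPython tends to place consecutively allocated objects near each other, which isn't guaranteed, and interpreter overhead will mask a lot of the gap compared to a compiled language.

```python
import random
import time


class Node:
    """Hypothetical heap-allocated record: one value plus a link to the next node."""
    __slots__ = ("value", "next")

    def __init__(self, value):
        self.value = value
        self.next = None


def build_linked(values, shuffle, seed=0):
    """Allocate one node per value and link them into a chain.

    With shuffle=False the chain follows allocation order; with shuffle=True
    it follows a random permutation, so traversal jumps around the heap.
    """
    nodes = [Node(v) for v in values]
    if not nodes:
        return None
    order = list(range(len(nodes)))
    if shuffle:
        random.Random(seed).shuffle(order)
    for a, b in zip(order, order[1:]):
        nodes[a].next = nodes[b]
    return nodes[order[0]]


def sum_linked(head):
    """Walk the chain from head and add up every value."""
    total = 0
    while head is not None:
        total += head.value
        head = head.next
    return total


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 500_000  # arbitrary size; make it larger to exceed your cache further
    in_order = build_linked(range(n), shuffle=False)
    scattered = build_linked(range(n), shuffle=True)
    a, t_in_order = timed(sum_linked, in_order)
    b, t_scattered = timed(sum_linked, scattered)
    assert a == b  # same data, only the traversal order differs
    print(f"allocation order: {t_in_order:.3f}s, shuffled order: {t_scattered:.3f}s")
```

Both runs execute identical code on identical values, so any timing difference comes from memory layout rather than from the algorithm. Actual numbers will vary by machine and Python build.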