by rjtobin
2172 days ago
The cartoon picture is that the first example reads everything into cache once, whereas the second example reads everything into cache twice. Cache lines are typically 64 bytes, so writing a single character to main memory involves the following steps (again, a cartoon picture). First, read the 64-byte region that contains the byte of interest so that it is owned by my cache (this is called an RFO, "read-for-ownership"). Second, update the byte of interest. Third, at some later point, write the cache line back to main memory.

In the sequential case, we read one 64-byte cache line at a time, update those 64 chars, then write the cache line back to main memory. In the second example, we first update all the even-indexed characters, which still forces us to read in every cache line. Then we loop around and do the odd-indexed characters, at which point we have to read the cache lines all over again (assuming the array is too big to fit in cache all at once, so the early lines have been evicted by the time we return to them).
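To make the two access patterns concrete, here is a rough sketch in C. The original examples are not quoted in this comment, so the function names (`fill_sequential`, `fill_even_then_odd`, `cartoon_line_fetches`) and the exact loop shapes are my assumptions, reconstructed from the description above. The counting function only encodes the cartoon model (array much larger than cache, no prefetching); it is not a measurement.

```c
#include <stddef.h>

/* Assumed shape of the "first example": one sequential pass.
 * Each 64-byte line is fetched once, fully updated, then written back. */
void fill_sequential(char *buf, size_t n, char value)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = value;
}

/* Assumed shape of the "second example": even indices, then odd indices.
 * Both passes touch every cache line, so each line is fetched twice
 * if the array does not fit in cache. */
void fill_even_then_odd(char *buf, size_t n, char value)
{
    for (size_t i = 0; i < n; i += 2)
        buf[i] = value;
    for (size_t i = 1; i < n; i += 2)
        buf[i] = value;
}

/* Hypothetical helper for the cartoon model only: number of line fetches
 * when every pass over n bytes has to re-read every line_size-byte line. */
size_t cartoon_line_fetches(size_t n, size_t line_size, size_t passes)
{
    size_t lines = (n + line_size - 1) / line_size; /* round up */
    return passes * lines;
}
```

Both functions leave the buffer in the same final state; the only difference is the order of the writes. In the cartoon model, `cartoon_line_fetches(n, 64, 1)` describes the first function and `cartoon_line_fetches(n, 64, 2)` the second. A real compiler may turn the sequential loop into a `memset` call, so actual timings can differ from this picture.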
1, 2, 4 3, 8 7 6 5, 16 15 14 13 12 11 10 9, ...
And so on?
---
Also this part:
>Cache lines are typically 64-bytes
Right, but I thought that when you access an index, the CPU caches quite a lot more than 64 bytes around that index. Doesn't it pull a larger chunk of the array into multiple cache lines? If that's the case, then the first example makes very efficient use of the cache. And if modern CPUs are smart enough to cache backwards, and I understand the second example correctly, doesn't the second one use the cache efficiently too?