by deletes
4458 days ago
Comment on the blog that got deleted: I did a similar test in C and got very similar results. When N is around 4000 the thrashing version starts to differ substantially. A 3x difference can already be seen when N is 1000. "This means if your program is running on two threads over different parts of the matrix, every single iteration requires a request to RAM." I'm skeptical about this part. I tried to replicate this behavior but was unsuccessful. Even though the cores share L3, I doubt that a thread will overwrite the entire cache on every iteration.

Either way, you should see a noticeable difference as the size increases, which was the point.
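
For anyone who wants to try this themselves, here is a minimal sketch of the kind of test I mean. It is not the blog's code. I'm assuming the comparison is between walking an N x N matrix along rows (cache friendly) and down columns (the thrashing version). The function names are made up for illustration, and the timings you get will depend on your compiler flags and hardware.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Sum an n x n matrix stored in row-major order, walking along rows.
   Consecutive accesses are adjacent in memory. */
double sum_row_major(const double *m, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            sum += m[i * n + j];
    return sum;
}

/* Same sum, walking down columns. Consecutive accesses are
   n * sizeof(double) bytes apart, so for large n most of them
   touch a different cache line. */
double sum_col_major(const double *m, size_t n)
{
    double sum = 0.0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            sum += m[i * n + j];
    return sum;
}

/* Time both traversals for one matrix size and print the results.
   Returns 0 on success, -1 if the allocation fails. */
int compare_traversals(size_t n)
{
    double *m = malloc(n * n * sizeof *m);
    if (m == NULL)
        return -1;

    /* Fill with ones so both sums are exactly n * n. */
    for (size_t k = 0; k < n * n; k++)
        m[k] = 1.0;

    clock_t t0 = clock();
    double row_sum = sum_row_major(m, n);
    clock_t t1 = clock();
    double col_sum = sum_col_major(m, n);
    clock_t t2 = clock();

    printf("N=%zu  rows: %.3fs  columns: %.3fs  (sums %.0f / %.0f)\n",
           n,
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC,
           row_sum, col_sum);

    free(m);
    return 0;
}
```

Call `compare_traversals` from `main` with a few sizes (say 1000, 2000, 4000) and compare the two timings as N grows. The sums are printed so the compiler cannot discard the loops. Note this is single threaded, so it says nothing about the two-thread claim.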