by jfim 2397 days ago
The loops should be swapped: for y on the outside, then for x on the inside. That way memory is accessed linearly instead of with a stride.

This presentation (PDF) explains why: https://www.aristeia.com/TalkNotes/ACCU2011_CPUCaches.pdf
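
To make the difference concrete, here is a minimal sketch of the two loop orders. It assumes a row-major buffer where pixel (x, y) lives at img[y * width + x]; the function names and the summing workload are made up for illustration and are not from the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: row-major layout, so pixel (x, y) is at img[y * width + x]. */

/* x outer, y inner: consecutive accesses are `width` elements apart (strided). */
uint64_t sum_x_outer(const uint8_t *img, size_t width, size_t height)
{
    assert(img != NULL);
    uint64_t sum = 0;
    for (size_t x = 0; x < width; x++) {
        for (size_t y = 0; y < height; y++) {
            sum += img[y * width + x];
        }
    }
    return sum;
}

/* y outer, x inner: consecutive accesses are adjacent in memory (linear). */
uint64_t sum_y_outer(const uint8_t *img, size_t width, size_t height)
{
    assert(img != NULL);
    uint64_t sum = 0;
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            sum += img[y * width + x];
        }
    }
    return sum;
}
```

Both functions return the same result; only the order in which memory is touched differs. On a buffer much larger than the cache, the second version will usually be faster, but how much depends on the hardware and on what the compiler does with the loops, so it is worth measuring.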