by assholesRppl2 4452 days ago
The best cache-aware programming lesson I ever received was in the CS61C course at Berkeley -- building a cache-blocking algorithm so that a matrix multiplication function used the cache as efficiently as possible. We unrolled loops so that the size of each iteration was exactly the size of one cache block, and instantly saw the increase in FLOPS.
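
For anyone who hasn't seen cache blocking, here is a rough sketch of the idea in C. It is not the actual course assignment: the function name `matmul_blocked`, the helper `min_size`, the row-major layout, and the `block` parameter are all assumptions made for illustration. The matrices are walked in small tiles so that the data a tile touches can stay in cache while it is reused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the smaller of two sizes. */
static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/*
 * Computes C += A * B for row-major n x n matrices, one tile at a time.
 * `block` is a tunable assumption: pick it so the tiles of A, B and C
 * that are in use together fit in the cache level you are targeting.
 */
void matmul_blocked(size_t n, size_t block,
                    const double *A, const double *B, double *C)
{
    assert(block > 0);
    for (size_t ii = 0; ii < n; ii += block) {
        for (size_t kk = 0; kk < n; kk += block) {
            for (size_t jj = 0; jj < n; jj += block) {
                /* Clamp tile edges so n need not be a multiple of block. */
                size_t i_end = min_size(ii + block, n);
                size_t k_end = min_size(kk + block, n);
                size_t j_end = min_size(jj + block, n);
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double a = A[i * n + k]; /* reused across the j loop */
                        for (size_t j = jj; j < j_end; j++) {
                            C[i * n + j] += a * B[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

Zero `C` before the call if you want a plain product. The best `block` depends on the machine, so the usual approach is to time a few values rather than trust a formula.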

Then we did some OpenMP parallelization. That was cool.

Nice post!

2 comments

I had a similar project in a programming-for-performance class at UT Austin. We had access to TACC supercomputers and were tasked with finding the CPU's L1 cache size via trial and error. We tested it by running matrix multiplication, measuring the performance, and changing the chunk sizes accordingly.
61C veteran here too! Those projects were mind-blowing at times; unrolling a loop by 2 or even 4 would have ridiculous effects on runtime.
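
To make "unrolling by 4" concrete, here is a minimal sketch in C. It uses a dot product rather than anything from the actual 61C projects, and the name `dot_unrolled4` is made up for this example. Each iteration handles four elements with four independent accumulators, and a short cleanup loop handles whatever is left over.

```c
#include <stddef.h>

/* Dot product of x and y (length n) with the loop body unrolled by 4. */
double dot_unrolled4(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: four elements per iteration, one accumulator each. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }

    /* Cleanup loop: the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        s0 += x[i] * y[i];
    }

    return (s0 + s1) + (s2 + s3);
}
```

Splitting the sum across accumulators changes the order of the floating-point additions, so the result can differ from a simple loop in the last bits. Whether manual unrolling helps at all depends on the compiler and flags, since optimizers often unroll on their own.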