


	by lukefleed 262 days ago
	Thanks! I used perf to look at cache miss rates and memory bandwidth during runs. The measurements showed the pattern I expected, but I didn't do a rigorous profiling study (different cache sizes, controlled benchmarks across architectures, or proper statistical analysis). This was for a university exam, and I ran out of time to do it properly. The cache argument makes intuitive sense (three vectors cycling vs. scanning a growing n×k matrix), and the timing data supports it, but I'd want to instrument it more carefully in the future :)
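	To put rough numbers on the cache argument, here is a minimal back-of-the-envelope sketch. It is not taken from the original code: the function name `working_set_bytes`, the 8-byte element size, and the sample values of n and k are all assumptions for illustration. It only counts the bytes held by the vectors themselves and ignores the operator being applied.

```python
def working_set_bytes(n, k, itemsize=8):
    """Return (three_vector_bytes, full_basis_bytes).

    n        -- vector length (hypothetical problem size)
    k        -- number of iterations performed so far
    itemsize -- bytes per element; 8 assumes double precision
    """
    three_vectors = 3 * n * itemsize   # fixed footprint: three vectors reused each iteration
    full_basis = n * k * itemsize      # grows linearly with k: the stored n x k matrix
    return three_vectors, full_basis


if __name__ == "__main__":
    n = 1_000_000  # assumed size, chosen only to make the contrast visible
    for k in (10, 100, 500):
        small, large = working_set_bytes(n, k)
        print(f"k={k}: three vectors = {small / 2**20:.1f} MiB, "
              f"n x k matrix = {large / 2**20:.1f} MiB")
```

	The three-vector footprint stays constant as k grows, while the n×k matrix eventually outgrows any fixed cache size. Whether that translates into the measured timings is exactly what a more careful profiling study would need to confirm.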