by sliken 1508 days ago
Peak = never observed, only calculated from clock speed * bus width. Much like the speed of light, you'll never actually reach it.
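To make the clock speed * bus width arithmetic concrete, here is a rough sketch. The 6400 MT/s transfer rate and 1024-bit bus width are my assumptions for an M1 Ultra class part, not figures from this thread, and `peak_bandwidth_gb_s` is just a name made up for illustration.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec: int, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s, with 1 GB = 10**9 bytes."""
    bytes_per_transfer = bus_width_bits // 8
    bytes_per_sec = mega_transfers_per_sec * 10**6 * bytes_per_transfer
    return bytes_per_sec / 1e9


if __name__ == "__main__":
    # Assumed inputs (not from the thread): 6400 MT/s on a 1024-bit bus.
    print(peak_bandwidth_gb_s(6400, 1024))  # 819.2
```

With those assumed inputs the result lands close to the 800GB/sec headline number, which is the point: it falls out of a multiplication, not a measurement.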

That number for the M1 Ultra (from the OP) = 800GB/sec. McCalpin's STREAM benchmark is often cited as a practical/useful number for usable bandwidth. It uses a straightforward implementation in C or Fortran without trying to play games, much like the vast majority of codes out there.
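For anyone who hasn't looked at STREAM, the triad kernel and the way it counts bytes look roughly like this. This is only a sketch of the accounting, written in Python for readability: the real benchmark is C or Fortran, and interpreter overhead dominates here, so the figure this prints says nothing about the hardware. The function names are mine, not STREAM's.

```python
import time


def triad(b, c, scalar):
    """STREAM-style triad: a[i] = b[i] + scalar * c[i]."""
    return [bi + scalar * ci for bi, ci in zip(b, c)]


def triad_bytes(n, bytes_per_element=8):
    """Bytes counted per triad pass: two reads and one write per element."""
    return 3 * n * bytes_per_element


def bandwidth_gb_s(total_bytes, seconds):
    """Convert bytes moved and elapsed time into GB/s (1 GB = 10**9 bytes)."""
    return total_bytes / seconds / 1e9


if __name__ == "__main__":
    n = 1_000_000
    b = [1.0] * n
    c = [2.0] * n
    start = time.perf_counter()
    a = triad(b, c, 3.0)
    elapsed = time.perf_counter() - start
    # Not a hardware measurement: Python lists and the interpreter dominate.
    print(f"{bandwidth_gb_s(triad_bytes(n), elapsed):.3f} GB/s (interpreter-bound)")
```

Dividing a number measured this way (from the real C version) by the calculated peak gives the "fraction of peak" people usually quote.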

Also note that the x86-64 CPUs of the world use a strict memory model, which results in a lower fraction of observed bandwidth vs peak. Arm has a looser memory model, which achieves a higher fraction of peak.