Hacker News new | ask | show | jobs
by dataangel 329 days ago
You're a bit naive about the complexity. Longer sequences are often actually faster, not just because instructions vary in speed, but also because earlier instructions still affect the performance of later ones even when they don't feed results into them. Different instructions consume different CPU resources and can contend for them (e.g. the CPU can stall even though all the inputs needed for a calculation are ready, just because you've done too many of that operation recently). And keep in mind that when I say "earlier instructions" I don't mean earlier in the textual listing, I mean earlier in the history of instructions actually executed; you can reach the same instruction from many different paths!
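
To make the contention point concrete, here's a toy model in Python. It is a sketch, not a real CPU: the port names ("div", "alu0", "alu1") and the cycle counts are made up for illustration, and it assumes an idealized machine where an op starts as soon as its inputs are ready and its execution port is free.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    port: str         # execution resource the op needs (hypothetical names)
    busy: int         # cycles the port stays occupied by this op
    deps: tuple = ()  # indices of earlier ops whose results this op reads


def cycles(ops):
    """Return the cycle at which the last op finishes in this toy model."""
    port_free = {}  # port -> first cycle at which the port is free again
    done = []       # done[i] = cycle at which op i's result is available
    for op in ops:
        # Inputs are ready once every op we depend on has finished.
        ready = max((done[d] for d in op.deps), default=0)
        # Even with inputs ready, we wait if the port is still occupied.
        start = max(ready, port_free.get(op.port, 0))
        finish = start + op.busy
        port_free[op.port] = finish
        done.append(finish)
    return max(done, default=0)


# Three independent ops that all need the same slow unit (made-up cost: 4 cycles).
short_seq = [Op("div", 4), Op("div", 4), Op("div", 4)]

# Six independent cheap ops spread over two ports (made-up cost: 1 cycle).
long_seq = [Op("alu0", 1), Op("alu1", 1)] * 3

print(cycles(short_seq), cycles(long_seq))
```

In this model the three-op sequence is slower than the six-op one purely because its ops queue up behind each other on one port, even though none of them depends on another. Real hardware is far messier (pipelined units, limited scheduler windows, history across branches), so treat the numbers as illustration only.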
1 comment

Hmm, this rarely comes up in practice, simply because you're usually targeting several different CPU generations at once, and then the details cancel each other out.

The most ffmpeg has had to do in this area is account for the fact that some CPUs had very slow unaligned memory loads and some didn't.