The real fun is optimising maths. Remove all divisions. Create LUTs, approximations, CPU-specific tricks. Even though CPUs are orders of magnitude faster now, they are still slow for real-time processing.
If you have a buffer that's being clocked out and your goal is to keep data flowing, the jitter determines how small your buffer can be. Let's say you're producing 56 kHz audio: the best you can do is produce a [sample] exactly at that frequency. If you have 1 ms of jitter, you now need a 1 ms buffer, so you have delay. If the jitter is small enough, like 0.1 ns of jitter in some SIMD calculation, then for all intents and purposes it doesn't matter for an audio application...
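To put numbers on that, here is a minimal Python sketch of the rule of thumb above (the buffer has to cover at least the worst-case jitter). The function names are made up for illustration, and exact fractions are used only to keep the arithmetic free of rounding surprises.

```python
from fractions import Fraction
import math


def jitter_in_samples(sample_rate_hz, jitter_seconds):
    """Express a jitter duration as a (possibly fractional) number of sample periods."""
    return Fraction(sample_rate_hz) * Fraction(jitter_seconds)


def min_buffer_samples(sample_rate_hz, jitter_seconds):
    """Smallest whole-sample buffer that covers the jitter (never less than one sample)."""
    return max(1, math.ceil(jitter_in_samples(sample_rate_hz, jitter_seconds)))


ONE_MS = Fraction(1, 1000)       # 1 ms of jitter
TENTH_NS = Fraction(1, 10**10)   # 0.1 ns of jitter

print(min_buffer_samples(56_000, ONE_MS))          # samples needed to absorb 1 ms of jitter
print(float(jitter_in_samples(56_000, TENTH_NS)))  # 0.1 ns as a fraction of one sample period
```

At 56 kHz, 1 ms of jitter works out to 56 samples of buffering, while 0.1 ns is a few millionths of a single sample period.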
You've just restated my point. If the deadlines are met, jitter doesn't matter. Ergo, you can't meet deadlines if your jitter is too large. Otherwise, it doesn't matter.
Wouldn't the deadline be now+zero for real time audio applications? If I'm building a guitar pedal (random example) ideally I want no delay from the input to the output. Any digital delay makes things strictly worse and so any jitter matters. That said, the difference between zero and very close to zero does become a moot point given small enough values for any practical purpose.
There are some digital audio systems that do sample-by-sample processing. Old-school Digidesign, for example.
But very little digital audio gear works that way these days. The buffer sizes may be small (e.g. 8 or 16 samples), but most hardware uses block-structured (buffer-by-buffer) processing.
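A rough Python sketch of the difference, assuming a trivial stateless gain stage and a 48 kHz sample rate (both are assumptions for illustration, not something stated above). The function names are hypothetical. The point is only that a block of N samples has to be collected before it can be processed, which adds N sample periods of delay.

```python
def process_per_sample(samples, gain):
    """Sample-by-sample: each sample is handled as soon as it arrives."""
    return [s * gain for s in samples]


def process_in_blocks(samples, gain, block_size):
    """Block-structured: collect block_size samples, then process the whole block."""
    out = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]  # last block may be shorter
        out.extend(s * gain for s in block)
    return out


def block_latency_ms(block_size, sample_rate_hz):
    """Time needed to fill one block, in milliseconds."""
    return 1000.0 * block_size / sample_rate_hz


signal = [float(i) for i in range(40)]
# Same output either way for a stateless effect; only the timing differs.
print(process_in_blocks(signal, 0.5, 16) == process_per_sample(signal, 0.5))
for n in (8, 16):
    print(n, "samples:", round(block_latency_ms(n, 48_000), 3), "ms per block")
```

With these assumed numbers a block of 8 or 16 samples costs well under a millisecond, which is why small blocks are usually an acceptable trade.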
If there are complex equations involved, it absolutely is faster. You can also create intermediate LUTs, so the tables are small and fit in cache and then do interpolation on the fly.
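As a sketch of what a small table plus interpolation looks like, here is a 256-entry sine table with linear interpolation in Python. The table size, the choice of sine, and the name `lut_sin` are assumptions for illustration. In Python itself `math.sin` will be faster, so this only shows the technique as it would be used in C or on a microcontroller.

```python
import math

TABLE_SIZE = 256
# One extra guard entry so index + 1 is always valid without wrapping.
SINE_TABLE = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE + 1)]


def lut_sin(phase):
    """Approximate sin(2*pi*phase), with phase measured in turns."""
    pos = (phase % 1.0) * TABLE_SIZE
    # Clamp in case the float modulo rounds up to exactly 1.0.
    index = min(int(pos), TABLE_SIZE - 1)
    frac = pos - index
    a = SINE_TABLE[index]
    b = SINE_TABLE[index + 1]
    return a + frac * (b - a)  # linear interpolation between neighbours


worst = max(
    abs(lut_sin(k / 10_000) - math.sin(2.0 * math.pi * k / 10_000))
    for k in range(10_000)
)
print("worst-case error over one period:", worst)
```

The table is about 2 KB of doubles, so it fits comfortably in L1 cache, and the interpolation brings the error down far below what the raw table spacing would give.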
Yeah, isn’t hitting memory (especially if the data can’t fit in L1/L2 cache) one of the biggest sources of latency? Especially since on modern CPUs it is almost impossible to max out the arithmetic units outside of microbenchmarks?
You don't really do these any more on a modern CPU. This is stuff I used to do 30 years ago and you might still do if you're on a microcontroller or some other tiny system. The CPUs aren't slow. The main problem is that if the OS doesn't schedule your process, it doesn't matter how fast the CPU is.
Thus such micro-optimizations are seldom used. Quite the opposite: you try to avoid jitter, which could be the result of caches.