It is bad, but it is required for performance reasons.
The question is what the solution could be going forward, and whatever it is will be a huge change anyway. I do not see a way out of this with our current architectures.
Neither block ciphers, nor stream ciphers, nor common public-key algorithms (RSA, Ed25519) need or even profit from this. They just need fast register-to-register math, and perhaps to loop sequentially through all members of a fixed-size array a fixed number of times. The only thing implementers of such algorithms would probably like to have is a few KiB of safe-to-access scratchpad memory for code and data: on entry to the crypto code, copy the code and data there, enable a constant-time mode for compute instructions, and run the algorithm at full speed without worrying.
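To make the "loop sequentially through all members of a fixed-size array" point concrete, here is a minimal sketch in C of a table lookup that touches every entry in the same order regardless of the secret index. The names ct_lookup and TABLE_SIZE are made up for illustration and are not from any particular library. This only shows the access pattern at the source level: a compiler may still transform it, and nothing here models the scratchpad or the constant-time mode proposed above.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16 /* hypothetical fixed table size, chosen for illustration */

/*
 * Return table[secret_index] while reading every entry in a fixed order,
 * so the sequence of memory addresses does not depend on secret_index.
 * For an index outside 0..TABLE_SIZE-1 no entry matches and 0 is returned.
 */
uint32_t ct_lookup(const uint32_t table[TABLE_SIZE], uint32_t secret_index)
{
    uint32_t result = 0;

    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        uint32_t diff = i ^ secret_index;

        /*
         * For nonzero diff, (diff | -diff) has its top bit set, so the
         * shift yields 1 and the mask becomes 0. For diff == 0 the shift
         * yields 0 and the mask wraps around to all ones.
         */
        uint32_t mask = ((diff | (0u - diff)) >> 31) - 1u;

        /* Branch-free select: only the matching entry survives the AND. */
        result |= table[i] & mask;
    }

    return result;
}
```

The loop count and the addresses read are fixed, and the selection uses only register-to-register arithmetic, which is exactly the kind of code that gains nothing from data-dependent hardware tricks.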