| > Setup cost is a thing, but A) is largely paid when you rekey Well, it depends on the crypto HW. Some HWs are designed for "throughput", which is completely useless for ECB but looks good on specs ("Our HW AES 10MB/s!"). So you set it up with src, dest and key pretty much as you setup your typical DMA transfer, only you almost never want to encrypt more than 16 bytes at a time with ECB so it's mostly wasted. > consumer device SoC (eg, all Qualcomm, Samsung, Apple, AMD, and Intel parts) We are not all so fortunate that we get to work with such powerful SoC. In my job it's mostly small embedded MPUs. > B) is acceptable in many protocols I think we are talking past each other here. I haven't even gotten to the protocol part yet. In order to support a wireless and/or network protocol you will need better building blocks than AES-ECB. You need AES-GCM (or at least AES-CCM). Not to mention ECDSA or RSA(>=3072)... |
Most accelerators come in one of a few flavors:
1/ They implement the expensive parts of a primitive for you and let you chain them together. This is how AES-NI and the ARMv8 crypto extensions work. Performance for these is generally measured in terms of cycle latency, or with a reference piece of software in cycles per byte. Common values for cycles per byte are anywhere from about 0.2 to 30. Much higher than that and people will start to go look at software as an option. You tend to see these on beefy systems with out-of-order cores.
2/ They implement a primitive for you, e.g. AES-ECB or SHA256, or more rarely AES-GCM and similar. These can then be chained together as with the above to build even higher-level primitives like AES-CTR or AES-CCM, or they can be used as-is. These are usually found on micros as additional selling points, and therefore show up just above the bottom of most manufacturers' product lines as an upsell. These are typically measured in something like MB/s throughput, and I assume they're what you're focused on.
3/ They implement a full protocol, like TLS, CCMP, or secure boot. These show up on things that might more properly deserve the term SoC rather than microcontroller, largely because they tend to be attached to high-speed I/O. They generally aren't measured for cryptographic performance but rather for the performance of the implemented protocol.
In my mind, all three of these are using crypto accelerators. Taken together it is extremely common that a part will have one or more of these, and I'm not sure if we're still disagreeing on one or both of those points.
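To compare the cycles-per-byte figures quoted for the first flavor with the MB/s figures quoted for the second, you need the clock rate. A rough sketch of the arithmetic follows, which also models how a fixed per-call setup cost eats into throughput on short inputs. The 100 MHz clock and the 2 µs setup time are made-up illustrative numbers, not figures for any real part; the 10 MB/s is the spec-sheet example from the quoted comment.

```python
def cycles_per_byte(clock_hz: float, bytes_per_second: float) -> float:
    """Convert a throughput spec into cycles per byte at a given clock rate."""
    return clock_hz / bytes_per_second


def effective_throughput(n_bytes: int, peak_bytes_per_second: float,
                         setup_seconds: float) -> float:
    """Bytes per second actually achieved when each call pays a fixed setup cost."""
    return n_bytes / (setup_seconds + n_bytes / peak_bytes_per_second)


# All three values below are assumptions for illustration only.
CLOCK_HZ = 100e6       # hypothetical 100 MHz microcontroller
PEAK_BPS = 10e6        # "10MB/s" as in the quoted spec-sheet example
SETUP_SECONDS = 2e-6   # hypothetical cost to program src, dest, and key

if __name__ == "__main__":
    print(cycles_per_byte(CLOCK_HZ, PEAK_BPS))  # 10.0
    for size in (16, 256, 4096):
        rate = effective_throughput(size, PEAK_BPS, SETUP_SECONDS)
        print(f"{size:5d} bytes per call: {rate / 1e6:.2f} MB/s")
```

Under these assumptions a single 16-byte block gets well under half the advertised rate, while a few kilobytes per call gets close to it, which is the gap between the spec-sheet number and single-block use.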
Regarding ECB, I don't know what you mean. Almost nobody uses ECB alone (thank goodness). Even if they have an accelerator for it, it's usually used to implement something like CTR with some software to glue it together (maybe with yet more glue on top to do GCM). In that way, those accelerators act like a just-barely-higher-level version of the first type, and if what you have is the first type, of course you'll do that no matter what. This is still an accelerated implementation; it's just not 100% done in the accelerator. Of course, if you're doing that you're very often encrypting more than a block at a time. And because it's quite rare to be performance bottlenecked on a small, infrequent operation in any context, you generally only do the work to turn on the accelerators when performance actually matters.
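As a minimal sketch of that glue, here is CTR mode built on top of nothing but a single-block "encrypt" primitive. The `fake_block_encrypt` function is a hypothetical stand-in for the accelerator call and is NOT AES (it is a truncated SHA-256 so the example runs with only the standard library); on a real part you would pass in whatever routine drives the ECB engine. The counter-block layout (96-bit nonce followed by a 32-bit big-endian counter starting at 0) is also an assumption, since real protocols each specify their own.

```python
import hashlib
from typing import Callable

BLOCK = 16  # AES block size in bytes


def fake_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for a hardware single-block ECB call. NOT AES."""
    if len(block) != BLOCK:
        raise ValueError("block must be exactly 16 bytes")
    return hashlib.sha256(key + block).digest()[:BLOCK]


def ctr_xcrypt(encrypt_block: Callable[[bytes, bytes], bytes],
               key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data in CTR mode (the two operations are identical)."""
    if len(nonce) != 12:
        raise ValueError("this sketch assumes a 12-byte nonce")
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        # Assumed layout: nonce || 32-bit big-endian block counter.
        counter_block = nonce + (offset // BLOCK).to_bytes(4, "big")
        # The only call into the "accelerator": one block in, one block out.
        keystream = encrypt_block(key, counter_block)
        chunk = data[offset:offset + BLOCK]
        # XOR the keystream into the data; zip truncates a short final chunk.
        out.extend(a ^ b for a, b in zip(chunk, keystream))
    return bytes(out)


if __name__ == "__main__":
    key, nonce = b"k" * 16, b"n" * 12
    message = b"forty bytes of example plaintext, padded"
    ciphertext = ctr_xcrypt(fake_block_encrypt, key, nonce, message)
    assert ctr_xcrypt(fake_block_encrypt, key, nonce, ciphertext) == message
```

Note that only the forward block operation is ever used, and that the software side is a counter, a loop, and an XOR. A real implementation would batch the counter blocks into one accelerator transfer rather than issuing them one at a time, which is where the throughput-oriented hardware earns its keep.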
Regarding working on MCUs, I agree there's a minimum size below which you don't get crypto primitives, but overall I don't think characterizing those parts as modern SoCs is terribly accurate (and modern SoCs were what my claim was about).
Regarding needing better building blocks than ECB for a protocol... well, no, not necessarily. AES-NI doesn't even give you a full AES primitive, only single-round and key-schedule helper instructions, and yet it's extremely widely used.