The Cryptech project uses ChaCha as the CSPRNG in our TRNG. We decided on ChaCha because of its performance and good security margin. I know of at least one more project that uses our ChaCha core.

https://cryptech.is/

ChaCha can be implemented efficiently in HW, especially in FPGAs that support carry chains, which basically means most FPGAs.

It is somewhat hard to compare size and speed since both ChaCha and AES are so scalable. In ChaCha there are many places where you can trade operator reuse against performance, but the fundamental operator size is 32 bits. AES, in comparison, works on bytes, and you can go from a single S-box (implemented as a table, as logic, as part of a T-box, etc.) that is reused in the datapath as well as in key expansion, all the way to a fully pipelined (10-14 rounds), humongous implementation. Very flexible and easy to adapt to the system requirements. One additional thing to note with AES is that for many cipher modes, the decryption functionality can be removed.

But with all this said: if I compare my implementation of AES (which includes decryption) with my implementation of ChaCha20, I get about 4x better performance with ChaCha using close to the same number of resources.

https://github.com/secworks/chacha
https://github.com/secworks/aes

The ChaCha core requires more registers, especially for the API. This is due to the bigger block size (512 vs 128 bits).

I like ChaCha in HW and think it's a good choice. I'm currently working on a ChaCha20-Poly1305 core compatible with RFC7539 to make it easier for HW projects to use good AEAD ciphers.

https://tools.ietf.org/html/rfc7539
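To make the operator-reuse point concrete, here is a minimal Python sketch of the ChaCha quarter round as specified in RFC 7539. It is not the Cryptech Verilog core, just a software model; the names `rotl32` and `quarter_round` are my own. The only operators are 32-bit add, XOR and fixed rotate. The adds are what map onto FPGA carry chains, and a HW design can instantiate anywhere from one quarter round (reused many times per block) to several in parallel.

```python
MASK32 = 0xFFFFFFFF  # ChaCha works on 32-bit words


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits (fixed rotates are just wiring in HW)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def quarter_round(a, b, c, d):
    """One ChaCha quarter round: four add/xor/rotate steps on four words."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


if __name__ == "__main__":
    # Input words from the quarter-round test vector in RFC 7539, section 2.1.1.
    out = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
    print([hex(w) for w in out])
```

A full ChaCha20 block applies this function 8 times per double round (4 column rounds, 4 diagonal rounds) for 10 double rounds, which is where the size/speed trade-off comes from.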
For those wondering why this came up now, the third round CAESAR candidates will be announced any day now. DJB's choices in Salsa20/ChaCha are still looking very good.
The ability to do relatively efficient masking/blinding in LRX algorithms is a major advantage, at least, but with NORX you need 64-bit operations to get a 256-bit key, which is frustrating. I wonder if NORX32-f could be used to make a Salsa20/ChaCha-style stream cipher where you operate on block-sized data (say, using the pseudo-addition to incorporate the start state).
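For anyone unfamiliar with the "pseudo-addition": as far as I recall from the NORX specification, it is H(x, y) = (x XOR y) XOR ((x AND y) << 1), i.e. ordinary addition with the carry propagated only one step. A rough Python sketch for 32-bit words follows; the function names are mine and this is only the H operation, not NORX32-f itself. Because H uses nothing but XOR, AND and a shift, it is straightforward to protect with Boolean masking, whereas a true modular add needs its whole carry chain handled.

```python
MASK32 = 0xFFFFFFFF  # 32-bit words, as in NORX32


def norx_h(x, y):
    """NORX-style pseudo-addition: XOR plus a one-step 'carry' term."""
    return ((x ^ y) ^ ((x & y) << 1)) & MASK32


def add32(x, y):
    """True addition mod 2^32, for comparison."""
    return (x + y) & MASK32


def add32_via_carry(x, y):
    """Same as add32, written as sum-without-carry plus shifted carry bits."""
    return ((x ^ y) + ((x & y) << 1)) & MASK32


if __name__ == "__main__":
    for x, y in [(1, 1), (3, 1), (0x0F, 0xF0)]:
        print(hex(x), hex(y), hex(norx_h(x, y)), hex(add32(x, y)))
```

The two agree whenever the shifted carry term does not collide with the XOR term (e.g. 1 and 1), and differ once a carry would have to ripple further (e.g. 3 and 1).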