by conradev
3609 days ago
Is ChaCha20 actually implemented in hardware on any platforms? I was under the impression that the algorithm itself is just really, really fast in software (especially so with SIMD). I implemented ChaCha20 in AArch64 assembly, and it was possible to encrypt/decrypt 6 blocks at once.
https://cryptech.is/
ChaCha can be implemented efficiently in HW, esp. in FPGAs that support carry chains, which basically means most FPGAs.
It is somewhat hard to compare size and speed, since both ChaCha and AES are so scalable. In ChaCha there are many places where you can trade operator reuse against performance. But the fundamental operator size is 32 bits.
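To make the operator-reuse point concrete, here is a minimal software sketch of the ChaCha quarter round as specified in RFC 7539, which works on 32-bit words. The names `rotl32` and `quarter_round` are mine, not taken from any of the cores mentioned here. Each quarter round is four additions, four XORs and four fixed rotations, so a hardware design can, for example, instantiate one quarter round and reuse it, or instantiate four and run a whole column or diagonal round in parallel.

```python
MASK32 = 0xFFFFFFFF  # ChaCha arithmetic is modulo 2**32


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits (0 < n < 32)."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a, b, c, d):
    """One ChaCha quarter round on four 32-bit words (RFC 7539, section 2.1)."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


if __name__ == "__main__":
    # Input words from the quarter-round test vector in RFC 7539, section 2.1.1.
    result = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
    print([hex(w) for w in result])
```

In hardware the rotations are just wiring, so the cost is dominated by the 32-bit adders, which is where the carry chains come in.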
AES in comparison works on bytes and you can go from a single S-box (implemented as a table, as logic, as part of a T-box etc) that is reused in the datapath as well as key expansion all the way to a fully pipelined (10-14 rounds) humongous implementation. Very flexible and easy to adapt to the system requirements. One additional thing to note with AES is that for many cipher modes, the decryption functionality can be removed.
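As an illustration of why decryption can often be dropped, here is a rough sketch of counter (CTR) mode, one of the modes where both directions only ever call the forward block function. Note the assumptions: `toy_block_encrypt` is a hypothetical stand-in built from SHA-256 so the example runs with the standard library only. It is NOT AES and not secure; the counter layout (12-byte nonce plus 4-byte block counter) is also just picked for the example.

```python
import hashlib

BLOCK_BYTES = 16  # same block size as AES (128 bits)


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for the forward direction of a block cipher. NOT AES."""
    return hashlib.sha256(key + block).digest()[:BLOCK_BYTES]


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """CTR mode: XOR data with a keystream made by encrypting counter blocks.

    The same function encrypts and decrypts, so the inverse cipher is never needed.
    """
    out = bytearray()
    for offset in range(0, len(data), BLOCK_BYTES):
        # Assumed layout for this sketch: 12-byte nonce || 4-byte big-endian counter.
        counter_block = nonce + (offset // BLOCK_BYTES).to_bytes(4, "big")
        keystream = toy_block_encrypt(key, counter_block)
        chunk = data[offset:offset + BLOCK_BYTES]
        out.extend(p ^ k for p, k in zip(chunk, keystream))
    return bytes(out)


if __name__ == "__main__":
    key = bytes(range(16))
    nonce = bytes(12)
    message = b"decryption logic is not needed in CTR mode"
    ciphertext = ctr_xor(key, nonce, message)
    # Decrypting is the same call with the same key and nonce.
    print(ctr_xor(key, nonce, ciphertext) == message)
```

In a real core the stand-in would be the AES encryption datapath; the inverse S-box and inverse round logic can then be left out entirely.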
With all this said, if I compare my implementation of AES (which includes decryption) with my implementation of ChaCha20, I get about 4x better performance with ChaCha using roughly the same number of resources.
https://github.com/secworks/chacha https://github.com/secworks/aes
The ChaCha core requires more registers, esp. for the API. This is due to the bigger block size (512 vs 128 bits).
I like ChaCha in HW and think it's a good choice. I'm currently working on a ChaCha20-Poly1305 core compatible with RFC 7539 to make it easier for HW projects to use good AEAD ciphers.
https://tools.ietf.org/html/rfc7539