|
|
|
|
|
by oconnor663
1478 days ago
|
|
The part about data dependencies across loop iterations is fascinating to me, becuase it's mostly invisible even when you look at the generated assembly. There's a related optimization that comes up in implementations of ChaCha/BLAKE, where we permute columns around in a kind of weird order, because it breaks a data dependency for an operation that's about to happen: https://github.com/sneves/blake2-avx2/pull/4#issuecomment-50... |
|